Hi, I need to run NOT queries on my Lucene index. Lucene currently allows NOT only when the query has two or more terms.

So I can do something like:

country:canada not sweden

but I can't run a query like:

country:not sweden

Could you please let me know if there is an efficient solution to this problem?

Thanks

+1  A: 

The short answer is that this is not possible with the standard Lucene query syntax.

Lucene does not allow a NOT query as a single term for the same reason it does not, by default, allow leading-wildcard queries: to perform either, the engine would have to scan the index to ascertain what is and is not a hit. For a NOT query it has to look through each document, because it cannot use the search term as a key to look up documents in the inverted index (the structure used to store the indexed documents).

To take your case as an example:

To search for not sweden, the simplest (and possibly most efficient) approach would be to search for sweden and then "invert" the result set, returning all documents that are not in it. Doing this would require finding all the required documents (i.e. those not in the result set) in the index, but without a key to look them up by. That means iterating over every document in the index, a task it is not optimised for, and hence speed would suffer.
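To make that concrete, here is a rough sketch of a toy inverted index in plain Java. It is not Lucene code and not how Lucene is implemented internally; the class ToyInvertedIndex and its methods are made up purely for illustration. It only shows why a positive term is a single keyed lookup while a lone NOT has to walk every document ID.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical toy index for illustration only (not a Lucene class).
public class ToyInvertedIndex {
    // term -> sorted IDs of the documents that contain the term
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();
    private int docCount = 0;

    /** Adds a document made of the given terms and returns its ID (0, 1, 2, ...). */
    public int add(String... terms) {
        int id = docCount++;
        for (String term : terms) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
        }
        return id;
    }

    /** Positive lookup: a single map access keyed by the term. */
    public SortedSet<Integer> search(String term) {
        return postings.getOrDefault(term, Collections.emptySortedSet());
    }

    /** Pure negation: there is no key to look up, so every document ID is visited. */
    public SortedSet<Integer> searchNot(String term) {
        SortedSet<Integer> excluded = search(term);
        SortedSet<Integer> result = new TreeSet<>();
        for (int id = 0; id < docCount; id++) {
            if (!excluded.contains(id)) {
                result.add(id);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.add("canada"); // document 0
        index.add("sweden"); // document 1
        index.add("norway"); // document 2
        System.out.println(index.search("sweden"));    // prints [1]
        System.out.println(index.searchNot("sweden")); // prints [0, 2]
    }
}
```

The cost of searchNot grows with the total number of documents, not with the number of documents that mention the term, which is the slowdown described above.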

If you really need this functionality, you could maintain your own list of items when indexing, so that a not sweden search becomes a sweden search using Lucene, followed by an inversion of the results using your set of items.
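A minimal sketch of that inversion step, assuming you keep the set of all item IDs yourself at indexing time and have already collected the IDs of the Lucene hits for sweden. The class InvertedHits, the method invert and the sample IDs are hypothetical, and the Lucene search itself is left out.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper for illustration only (not part of Lucene).
public class InvertedHits {
    /**
     * Returns every ID in allIds that does not appear in hits,
     * keeping the iteration order of allIds.
     */
    public static Set<String> invert(Set<String> allIds, Collection<String> hits) {
        Set<String> result = new LinkedHashSet<>(allIds); // copy, so allIds is not modified
        result.removeAll(hits);
        return result;
    }

    public static void main(String[] args) {
        // IDs recorded by the application while indexing (assumed data).
        Set<String> allIds = new LinkedHashSet<>(Arrays.asList("doc1", "doc2", "doc3", "doc4"));
        // IDs returned by an ordinary Lucene search for "sweden" (assumed data).
        Collection<String> swedenHits = Arrays.asList("doc2", "doc4");
        System.out.println(invert(allIds, swedenHits)); // prints [doc1, doc3]
    }
}
```

The trade-off is that the ID set has to be kept in sync with the index on every add and delete.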

adrianbanks
+2  A: 

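Please check the answer to a similar question. The solution is to use MatchAllDocsQuery: combine it as a required clause with a prohibited clause for the term you want to exclude. In query parser syntax that is something like *:* -country:sweden.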

Shashikant Kore
+1 that's the way
Pascal Dimassimo