Hi,

I have a "description" field indexed in Lucene. This field contains a book's description. How do I achieve "All of these words" functionality on this field using the BooleanQuery class? For example, if a user types in "top selling book", it should return the books that have all of these words in their description.

Thanks!

A: 

I believe if you add all query parts (one per term) via

BooleanQuery.add(Query, BooleanClause.Occur)

and set that second parameter to the constant BooleanClause.Occur.MUST, then you should get what you want. The equivalent query syntax would be "+term1 +term2 +term3 ...".
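To make the query-syntax form concrete, here is a minimal sketch in plain Java that turns the user's input into a string where every term carries the "+" (required) prefix. The class and method names are made up for illustration and are not part of Lucene. It splits on whitespace only and does not escape query-parser special characters, so treat it as a starting point rather than a finished solution.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: turns raw user input into
// query-parser syntax in which every term is required.
class AllWordsQuerySyntax {

    static String build(String userInput) {
        List<String> required = new ArrayList<>();
        // Naive whitespace split (an assumption); a real application would
        // run the input through the same Analyzer used at index time.
        for (String term : userInput.trim().split("\\s+")) {
            if (!term.isEmpty()) {
                required.add("+" + term); // "+" marks the term as mandatory
            }
        }
        return String.join(" ", required);
    }

    public static void main(String[] args) {
        // Prints: +top +selling +book
        System.out.println(build("top selling book"));
    }
}
```

The resulting string could then be handed to a query parser, or you can skip the string entirely and add one clause per term with Occur.MUST as described above.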

Peter Becker
+1  A: 

There are two pieces to get this to work:

  1. You need the incoming documents to be analysed properly, so that individual words are tokenised and indexed separately
  2. The user query needs to be tokenised, and the tokens combined with the AND operator.
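To illustrate how the two pieces fit together, here is a toy in-memory index in plain Java. It is not Lucene code and all names in it are hypothetical: the tokenize method stands in for an Analyzer, and the set intersection stands in for combining required clauses with AND. The point is only that the description and the query must be tokenised the same way for "all of these words" to match.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy stand-in for an index, not Lucene code: all names are hypothetical.
class TinyAllWordsIndex {
    // term -> ids of the documents whose description contains that term
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Stand-in for an Analyzer: lowercase, then split on non-alphanumerics.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    // Piece 1: tokenise the description and index every word separately.
    void add(int docId, String description) {
        for (String term : tokenize(description)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    // Piece 2: tokenise the query the same way and AND the terms together.
    Set<Integer> searchAll(String query) {
        Set<Integer> result = null;
        for (String term : tokenize(query)) {
            Set<Integer> docs = postings.getOrDefault(term, new TreeSet<>());
            if (result == null) {
                result = new TreeSet<>(docs);   // first term seeds the result
            } else {
                result.retainAll(docs);         // keep only docs with this term too
            }
        }
        // An empty query matches nothing in this sketch (a design choice).
        return result == null ? new TreeSet<>() : result;
    }

    public static void main(String[] args) {
        TinyAllWordsIndex index = new TinyAllWordsIndex();
        index.add(1, "The top selling book of the year");
        index.add(2, "A selling guide for book shops");
        System.out.println(index.searchAll("top selling book"));
    }
}
```

Here only document 1 contains all three words, so it is the only match; document 2 is dropped because it lacks "top".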

For #1, there are a number of Analyzers and Tokenizers that come with Lucene - have a look in the org.apache.lucene.analysis package. There are options for many different languages, stemming, stopwords and so on.

For #2, Lucene also ships with several query parsers, mainly in the org.apache.lucene.queryParser package. MultiFieldQueryParser might be good for you: to require every term to be present, just call the following on your parser instance:

QueryParser.setDefaultOperator(QueryParser.AND_OPERATOR)

Lucene in Action, although a few versions old, is still accurate and extremely useful for more information on analysis and query parsing.

Alabaster Codify