I want to have a "citystate" field in a Lucene index that stores city/state values such as:

  • Chicago, IL
  • Boston, MA
  • San Diego, CA

How should I store these values in Lucene (tokenized or non-tokenized)?

And how do I build a query (PhraseQuery, TermQuery, or something else) that returns all records whose citystate is Chicago, IL OR Boston, MA OR San Diego, CA?

I would appreciate help with the code as well.

Thanks.

+1  A: 

It depends. Will you ever want to search by city alone or by state alone? If so, you need to tokenize the field. If not, index it untokenized so that each value is kept as a single term. Check out KeywordAnalyzer, though: it emits the entire field value as one token, so it may suit you.

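As to your second question: suppose you call the field 'citystate'. With the query parser you can then use a query such as: citystate:"Chicago, IL" OR citystate:"Boston, MA" OR citystate:"San Diego, CA". The quotes matter. Without them the parser splits each value at the space, and everything after the first word is searched in the default field instead of citystate. For an untokenized field, parse with KeywordAnalyzer so that each quoted value stays a single term.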
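If the query text is assembled from a list of values rather than typed by hand, each value has to be wrapped in double quotes so the parser treats it as one unit. Here is a minimal sketch using only the JDK, assuming the classic query parser syntax; `CityStateQueryText`, `quote` and `anyOf` are hypothetical helper names, not part of Lucene:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class CityStateQueryText {

    // Wrap a value in double quotes, backslash-escaping any backslash or
    // double quote inside it (assumed to be enough inside a quoted phrase).
    public static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Build: field:"v1" OR field:"v2" OR ...
    public static String anyOf(String field, List<String> values) {
        return values.stream()
                .map(v -> field + ":" + quote(v))
                .collect(Collectors.joining(" OR "));
    }

    public static void main(String[] args) {
        List<String> wanted = Arrays.asList("Chicago, IL", "Boston, MA", "San Diego, CA");
        // prints: citystate:"Chicago, IL" OR citystate:"Boston, MA" OR citystate:"San Diego, CA"
        System.out.println(anyOf("citystate", wanted));
    }
}
```

The resulting string can then be handed to the query parser.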

The programmatic version, which bypasses the query parser, is a BooleanQuery composed of several TermQuery clauses, one per value, each added as an optional (SHOULD) clause.
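A rough sketch of that approach, assuming a recent Lucene release (8.x or 9.x) on the classpath; older releases expressed the untokenized field through `Field.Index.NOT_ANALYZED` instead of `StringField`. The class name `CityStateSearch` and its methods are made up for illustration, and the in-memory directory is only there to keep the example self-contained:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class CityStateSearch {

    // One TermQuery per value, OR-ed together via SHOULD clauses.
    public static Query anyOf(String field, String... values) {
        BooleanQuery.Builder builder = new BooleanQuery.Builder();
        for (String value : values) {
            builder.add(new TermQuery(new Term(field, value)), BooleanClause.Occur.SHOULD);
        }
        return builder.build();
    }

    // Index each value untokenized, then count the documents matching any wanted value.
    public static int countMatches(String[] indexed, String... wanted) {
        try (Directory dir = new ByteBuffersDirectory()) {
            try (IndexWriter writer =
                     new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))) {
                for (String value : indexed) {
                    Document doc = new Document();
                    // StringField indexes the whole value as a single term (no tokenizing).
                    doc.add(new StringField("citystate", value, Field.Store.YES));
                    writer.addDocument(doc);
                }
            }
            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                IndexSearcher searcher = new IndexSearcher(reader);
                return searcher.search(anyOf("citystate", wanted), 10).scoreDocs.length;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] indexed = {"Chicago, IL", "Boston, MA", "San Diego, CA", "Austin, TX"};
        System.out.println(countMatches(indexed, "Chicago, IL", "Boston, MA", "San Diego, CA"));
    }
}
```

Because the field is not tokenized, each TermQuery must carry the exact stored value, including case, comma and spacing; "Boston,MA" would not match a document indexed as "Boston, MA".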

Yuval F
+1  A: 

Shouldn't city and state be normalized further into two separate fields?

mP