I have a Lucene index that is populated from a database. I store/index some fields and then add a FullText field in which I index the contents of all the other fields, so I can do a general search.
Now let's say I have a document with the following two fields:
fld1 - "Samsung releases a new 22'' LCD screen"
fld2 - "Sony Ericsson phone's batteries explode"
If a user searches for "Samsung phone", he probably just wants news about Samsung phones, not a document with info about a Samsung screen and a Sony phone. But because the search runs against the FullText field, this document comes back as a valid result. Is there a nice way to handle this?
I've thought of indexing with some separator and then doing a SpanNotQuery, so the FullText field would have these contents: "Samsung releases a new 22'' LCD screen MYLUCENESEPARATOR Sony Ericsson phone's batteries explode", and the SpanNotQuery would use MYLUCENESEPARATOR as the term that a match must not span.
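To make the intended behaviour concrete, here is a plain-Java sketch of the matching rule I'm after. It does not call Lucene at all: it only models the rule "all query terms must fall between the same pair of separators". The class and method names (SeparatorSketch, buildFullText, matchesWithinOneField) are made up for illustration, and the tokenization (lowercase, split on anything that isn't a letter or digit) is an assumption, not what any particular Lucene analyzer does.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Pattern;

// Illustrative model only: no Lucene classes are used here.
final class SeparatorSketch {
    // Separator token from the question; assumed never to occur in real field text.
    static final String SEPARATOR = "MYLUCENESEPARATOR";

    // Hypothetical helper: joins field values the way the FullText field would be built.
    static String buildFullText(String... fieldValues) {
        return String.join(" " + SEPARATOR + " ", fieldValues);
    }

    // Hypothetical helper: true only if every query term occurs inside one
    // separator-delimited segment, i.e. no match is allowed to span the separator.
    static boolean matchesWithinOneField(String fullText, String... queryTerms) {
        Set<String> wanted = new HashSet<>();
        for (String term : queryTerms) {
            wanted.add(term.toLowerCase(Locale.ROOT));
        }
        for (String segment : fullText.split(Pattern.quote(SEPARATOR))) {
            // Assumed tokenization: lowercase, split on non-letter/non-digit characters.
            String[] tokens = segment.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
            if (new HashSet<>(Arrays.asList(tokens)).containsAll(wanted)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String fullText = buildFullText(
                "Samsung releases a new 22'' LCD screen",
                "Sony Ericsson phone's batteries explode");
        // The terms sit on opposite sides of the separator, so this should be rejected.
        System.out.println(matchesWithinOneField(fullText, "Samsung", "phone"));
        // Both terms come from fld2, so this should be accepted.
        System.out.println(matchesWithinOneField(fullText, "Sony", "phone"));
    }
}
```

With the example document above, "Samsung phone" should be rejected and "Sony phone" accepted, which is the result I would want the SpanNotQuery to give me.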
Is this a good solution? Does it scale well with more than two terms? I fear it would be a performance killer. Is there a better way to achieve this?