I'm using Lucene.net, but I am tagging this question for both .NET and Java versions because the API is the same and I'm hoping there are solutions on both platforms.
I'm sure other people have addressed this issue, but I haven't been able to find any good discussions or examples.
By default, Lucene is very picky about query syntax. For example, I just got the following error:
[ParseException: Cannot parse 'hi there!': Encountered "<EOF>" at line 1, column 9.
Was expecting one of:
"(" ...
"*" ...
<QUOTED> ...
<TERM> ...
<PREFIXTERM> ...
<WILDTERM> ...
"[" ...
"{" ...
<NUMBER> ...
]
Lucene.Net.QueryParsers.QueryParser.Parse(String query) +239
What is the best way to prevent ParseExceptions when processing queries from users? It seems to me that the most usable search interface is one that always executes a query, even if it might be the wrong query.
It seems that there are a few possible, and complementary, strategies:
- "Clean" the query prior to sending it to the QueryProcessor
- Handle exceptions gracefully
- Show an intelligent error message to the user
- Perhaps execute a simpler query, leaving off the erroneous bit
I don't really have any great ideas about how to do any of those strategies. Has anyone else addressed this issue? Are there any "simple" or "graceful" parsers that I don't know about?