I'm trying to do a fuzzy match on the phrase "Grand Prarie" (deliberately misspelled) using Apache Lucene. Part of my issue is that the ~ operator only does fuzzy matching on single-word terms; applied to a phrase, it behaves as a proximity match instead.

Is there a way to do a fuzzy match on a phrase with Lucene?

+1  A: 

There's no direct support for a fuzzy phrase, but you can simulate it by explicitly enumerating the fuzzy terms and then adding them to a MultiPhraseQuery. The resulting query would look like:

<MultiPhraseQuery: "grand (prarie prairie)">
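To make the "enumerate the fuzzy terms" step concrete, here is a minimal sketch in plain Java that does not use Lucene itself. It assumes a small in-memory vocabulary standing in for the index's term dictionary, and the class and method names (`FuzzyPhraseSketch`, `expandPhrase`, `editDistance`) are hypothetical. In a real setup, the alternatives found for each position would be wrapped as `Term` objects and passed to `MultiPhraseQuery.add(Term[])`, one call per phrase position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FuzzyPhraseSketch {

    // Levenshtein edit distance, computed with two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // For each word of the phrase, collect the vocabulary terms within
    // maxEdits of it. Each inner list is the set of alternatives for one
    // phrase position (an assumption: the vocabulary stands in for the
    // terms that actually exist in the index).
    static List<List<String>> expandPhrase(String phrase,
                                           List<String> vocabulary,
                                           int maxEdits) {
        List<List<String>> positions = new ArrayList<>();
        for (String word : phrase.toLowerCase().split("\\s+")) {
            List<String> alternatives = new ArrayList<>();
            for (String term : vocabulary) {
                if (editDistance(word, term) <= maxEdits) {
                    alternatives.add(term);
                }
            }
            positions.add(alternatives);
        }
        return positions;
    }

    public static void main(String[] args) {
        List<String> vocabulary =
                Arrays.asList("grand", "grant", "prairie", "texas");
        // prints [[grand, grant], [prairie]]
        System.out.println(expandPhrase("Grand Prarie", vocabulary, 1));
    }
}
```

Note that the misspelled "prarie" only shows up among the alternatives if it actually occurs in the vocabulary; the point of the expansion is to find the terms that do exist and sit close to the misspelled input.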
Coady
+1. The way to go
Yaroslav
Could you elaborate a bit more on this? I'm not using Lucene directly, but rather through Solr. I may very well have to just get around to reading Lucene in Action. I wouldn't mind getting a better understanding of how the two work together and becoming comfortable with it at a more fundamental level. For now, in Solr, I'm achieving something that's effective enough for me using solr.PhoneticFilterFactory in the analyzer.
Koobz
+3  A: 

Lucene 3.0 has a ComplexPhraseQueryParser that supports fuzzy phrase queries. It is in the contrib package.

Shashikant Kore