views: 140

answers: 2

i have a long list of words that i put into a very simple SOLR / Lucene database. my goal is to find 'similar' words from the list for single-term queries, where 'similarity' is specifically understood as (damerau) levenshtein edit distance. i understand SOLR provides such a distance for spelling suggestions.

in my SOLR schema.xml, i have configured a field type string:

<fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>

which i use to define a field

<field name='term' type='string' indexed='true' stored='true' required='true'/>

i want to search this field and have results returned according to their levenshtein edit distance. however, when i run a query like webspace~0.1 against SOLR with debugging and explanations on, the report shows that a whole bunch of considerations went into calculating the scores, e.g.:

"1582":"
1.1353534 = (MATCH) sum of:
  1.1353534 = (MATCH) weight(term:webpage^0.8148148 in 1581), product of:
    0.08618848 = queryWeight(term:webpage^0.8148148), product of:
      0.8148148 = boost
      13.172914 = idf(docFreq=1, maxDocs=386954)
      0.008029869 = queryNorm
    13.172914 = (MATCH) fieldWeight(term:webpage in 1581), product of:
      1.0 = tf(termFreq(term:webpage)=1)
      13.172914 = idf(docFreq=1, maxDocs=386954)
      1.0 = fieldNorm(field=term, doc=1581)

clearly, for my application, term frequencies, idfs and so on are meaningless, as each document only contains a single term. i tried to use the spelling suggestions component, but didn't manage to make it return the actual similarity scores.

can anybody provide hints on how to configure SOLR to perform levenshtein / jaro-winkler / n-gram searches with scores returned, and without additional factors like tf, idf and boost included? is there a bare-bones configuration sample for SOLR somewhere? i find the number of options truly daunting.

+2  A: 

If you're using a nightly build, then you can sort results based on levenshtein distance using the strdist function:

q=term:webspace~0.1&sort=strdist("webspace", term, edit) desc

More details here and here.
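The query above can also be issued programmatically. A minimal Python sketch of building that request URL (the host, port and `/solr/select` handler are assumptions based on a default Solr install; adjust for your setup):

```python
import urllib.parse

# Sort the fuzzy matches by true edit distance instead of Lucene's tf-idf
# score: the fuzzy query term:webspace~0.1 selects the candidate set, and
# strdist(..., edit) re-ranks it by Levenshtein distance to "webspace".
params = {
    "q": 'term:webspace~0.1',
    "sort": 'strdist("webspace", term, edit) desc',
    "fl": "term,score",   # return the term and its score
    "wt": "json",
}
url = "http://localhost:8983/solr/select?" + urllib.parse.urlencode(params)
print(url)
```

Fetching `url` (e.g. with `urllib.request.urlopen`) then returns the matches in order of decreasing string similarity.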

Karl Johansson
A: 

Solr/Lucene doesn't appear to be a good fit for this application. You are likely better off with the SimMetrics library. It offers a comprehensive set of string-distance metrics, incl. Jaro-Winkler, Levenshtein, etc.

Mikos
this is a very interesting link indeed. i wish there was a standard library as comprehensive as this for python as well. unfortunately, since i have to search over hundreds of thousands of words, a solution without indexing will likely be too slow (but i would have to try first). also, i am not quite sure how to integrate a java library into my python project. maybe via HTTP.
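as a first cut in pure python (a brute-force sketch with no indexing, so it may well be too slow at this scale), the damerau-levenshtein distance in its optimal-string-alignment variant is a short dynamic program:

```python
def damerau_levenshtein(a: str, b: str) -> int:
    """Optimal-string-alignment variant of Damerau-Levenshtein:
    edits are insertion, deletion, substitution, and transposition
    of two adjacent characters."""
    # d[i][j] = distance between a[:i] and b[:j]
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i  # delete all of a[:i]
    for j in range(len(b) + 1):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,        # deletion
                d[i][j - 1] + 1,        # insertion
                d[i - 1][j - 1] + cost, # substitution / match
            )
            # transposition of adjacent characters
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[len(a)][len(b)]
```

e.g. `damerau_levenshtein("webspace", "webpage")` gives 2. a linear scan with this over hundreds of thousands of words is O(n * m^2), which is exactly the cost that an index (or a trie / BK-tree) would avoid.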
flow