I have a Rails app with a Postgres backend.

I need to add full-text search that allows fuzzy matching based on Levenshtein distance or a similar metric. On top of that, the lexer/stemmer has to work with non-English words (it would be OK to simply switch language-dependent features off when lexing, so that words which are meaningful in the target language are not discarded as irrelevant by an English-oriented engine).

I guess Postgres' tsearch won't apply here as it doesn't have fuzzy search -- please correct me if I'm wrong.

What are the possible combinations of backends and plugins? I'd prefer solutions that add less to the infrastructure (e.g. if Postgres can do fuzzy FTS, why use an external Lucene?); OTOH, the quality of the Rails plugins involved is important as well.

What would you recommend?

Update: it seems I need n-gram based metrics rather than Levenshtein.
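
For readers comparing the two kinds of metric: Levenshtein distance counts the single-character insertions, deletions and substitutions needed to turn one string into another, whereas n-gram metrics compare sets of short substrings. A minimal Ruby sketch of the former is below. The method name `levenshtein` is only illustrative and is not taken from any plugin mentioned in this thread. It works on characters rather than bytes, so non-English words are handled the same way as English ones.

```ruby
# Levenshtein (edit) distance between two strings.
# Keeps only the previous row of the dynamic-programming table.
def levenshtein(a, b)
  a_chars = a.chars
  b_chars = b.chars

  # Distance from the empty prefix of `a` to each prefix of `b`.
  prev = (0..b_chars.length).to_a

  a_chars.each_with_index do |ca, i|
    # Distance from a[0..i] to the empty prefix of `b`.
    curr = [i + 1]
    b_chars.each_with_index do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [
        prev[j + 1] + 1, # delete ca
        curr[j] + 1,     # insert cb
        prev[j] + cost   # substitute (or keep) the character
      ].min
    end
    prev = curr
  end

  prev.last
end
```

The cost is proportional to the product of the two string lengths for every pair compared, which is one reason index-friendly n-gram approaches tend to be preferred for searching a whole table.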

+4  A: 

Rails + Postgres + Solr + Sunspot

Solr is based on Lucene, so you can take advantage of all Lucene features. Sunspot is an excellent Ruby wrapper for the Solr API. Both Sunspot and Solr work great with Rails and PostgreSQL; I used them for a project no more than a month ago.

Simone Carletti
Can you say briefly what exactly the benefit of using Solr over plain Lucene is?
Wojciech Kaczmarek
Btw, for this particular project there won't be much data to search, so I guess I could go with something as simple as http://unirec.blogspot.com/2007/12/live-fuzzy-search-using-n-grams-in.html if it works. Anyway, thanks, I'll try Sunspot; it may serve for something bigger as well.
Wojciech Kaczmarek
Basically, with Solr you can use Lucene over the network. See http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Introduction-Apache-Lucene-and-Solr
Simone Carletti
+2  A: 

PostgreSQL comes with an extension called pg_trgm (in the contrib/ directory). In my experience, it is too slow (more like a proof-of-concept implementation), but for your application it might work.
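
To give a feel for what trigram matching measures, here is a rough Ruby sketch. It is loosely modelled on the behaviour described in the pg_trgm documentation (lowercase the text, split it into alphanumeric words, pad each word with two leading spaces and one trailing space, then compare the sets of three-character substrings). It is an illustration only, not a reimplementation of the extension, and the method names `trigrams` and `trigram_similarity` are made up for this example.

```ruby
require "set"

# Set of trigrams for a piece of text. Each word is padded with two
# leading spaces and one trailing space (an assumption based on the
# pg_trgm documentation, not on its source code).
def trigrams(text)
  grams = text.downcase.scan(/[[:alnum:]]+/).flat_map do |word|
    padded = "  #{word} "
    (0..padded.length - 3).map { |i| padded[i, 3] }
  end
  grams.to_set
end

# Shared trigrams divided by all distinct trigrams, giving a value
# between 0.0 (nothing in common) and 1.0 (identical trigram sets).
def trigram_similarity(a, b)
  ta = trigrams(a)
  tb = trigrams(b)
  union = (ta | tb).size
  return 0.0 if union.zero?

  (ta & tb).size.to_f / union
end
```

Because nothing here depends on a stemmer or a stop-word list, the comparison is language-neutral, which fits the non-English requirement in the question. For a small data set, scoring candidates in application code like this may be enough; for anything larger the comparison would need to happen in the database or in a search server.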

mjy