I have a classifieds website, and users may for example search for cars.

When searching for a car, the model names often carry a suffix, as you all probably know. For example, take the Bmw 330ci (the suffix being 'ci'), but there is also the Bmw 330i, the Bmw 330di, and so on.

How can I make SOLR "understand" this, so that if users search for 330, SOLR returns results containing 330ci/330i/330di etc.?

Also, it should NOT return all of those if a user specifically inputs Bmw 330ci; in that case it should ONLY return Bmw 330ci and NOT Bmw 330i/di etc.

I am new to SOLR, but I am starting to understand how to get it to work. Need a little guidance on this one though!

How would you have done it?

Thanks

+1  A: 

I don't know SOLR; it seems to be for full-text search.

However, because you know your model upfront, you could use regular SQL to do this.


In the database field for the name, instead of mixing the base name with the suffix, you can split them into two columns, like "rootName" and "suffixName".

Then your SQL can very naturally, and extremely efficiently (compared to full-text search), find what you need: search on "rootName", and also filter on "suffixName" (but only if one was specified).
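
A minimal sketch of that two-column idea, using Python's built-in sqlite3 module with an in-memory database. The table name `cars`, the `make` column, and the snake_case column names `root_name` / `suffix_name` are assumptions made for illustration; they stand in for the "rootName" and "suffixName" columns described above.

```python
import sqlite3


def search_cars(conn, root_name, suffix_name=None):
    """Return (make, root_name, suffix_name) rows for the given search."""
    sql = "SELECT make, root_name, suffix_name FROM cars WHERE root_name = ?"
    params = [root_name]
    # Only filter on the suffix when the user actually typed one.
    if suffix_name is not None:
        sql += " AND suffix_name = ?"
        params.append(suffix_name)
    sql += " ORDER BY suffix_name"
    return conn.execute(sql, params).fetchall()


# Hypothetical sample data, just to exercise the query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (make TEXT, root_name TEXT, suffix_name TEXT)")
conn.executemany(
    "INSERT INTO cars VALUES (?, ?, ?)",
    [("Bmw", "330", "ci"), ("Bmw", "330", "i"),
     ("Bmw", "330", "di"), ("Bmw", "520", "i")],
)

all_330 = search_cars(conn, "330")        # every 330 variant
only_ci = search_cars(conn, "330", "ci")  # just the 330ci
```

Searching with only the root returns all three 330 variants, while passing the suffix as well narrows the result to the single 330ci row.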

KLE
+2  A: 

Well, it depends on several factors but, as a general rule, in the first case you can use wildcards, e.g.:
q=330*

In the second case you can point directly at the field and do an exact search: <fieldName>:330ci
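
As a rough sketch of how the two cases could be told apart in application code: the snippet below builds the `q` parameter for either case. The base URL, the field name `model`, and the rule "digits only means prefix search" are all assumptions made for this example, not something Solr does for you.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; adjust to your own installation.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_query(term, field="model"):
    """Build the Solr query string for one search term.

    Assumption: a purely numeric term such as "330" means "any variant",
    so it becomes a wildcard query; anything else is searched as typed.
    """
    if term.isdigit():
        return f"{field}:{term}*"
    return f"{field}:{term}"


def build_select_url(term, field="model"):
    """Return the full request URL with the query URL-encoded."""
    return SOLR_SELECT_URL + "?" + urlencode({"q": build_query(term, field)})


wildcard_query = build_query("330")   # model:330*
exact_query = build_query("330ci")    # model:330ci
```

Note that this sketch does not escape Solr's special query characters in user input, which a real application would need to handle.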

mamoo
+2  A: 

You'd probably want to analyze the field using the WordDelimiterFilterFactory, set up to split on numeric transitions. That will allow a query of 330 to match 330anything.

I believe that, by default, when you also do this at query time, it will create a phrase query from 330di -> "330 di", which should only match if both parts are present in the index. See http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters for more details.
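
To illustrate the idea (this is only a simulation in plain Python, not Solr's actual analysis chain), the sketch below splits text at letter/number transitions and then treats the query parts as a phrase that must appear consecutively in the indexed parts. The function names and the sample listings are made up for the example.

```python
import re


def split_parts(text):
    """Lowercase the text and split it into runs of digits or letters,
    so "330ci" becomes ["330", "ci"]."""
    return re.findall(r"[0-9]+|[a-z]+", text.lower())


def phrase_match(query, document):
    """True if the query's parts occur consecutively in the document's parts."""
    q, d = split_parts(query), split_parts(document)
    if not q:
        return False
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))


# Hypothetical listings standing in for indexed documents.
listings = ["Bmw 330ci", "Bmw 330i", "Bmw 330di", "Bmw 520i"]


def search(query):
    return [doc for doc in listings if phrase_match(query, doc)]
```

With these listings, `search("330")` returns the three 330 variants, while `search("Bmw 330ci")` returns only "Bmw 330ci", because the parts "330" and "ci" must both be present and adjacent.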

ThoughtfulHacking
A: 

Are you using the dismax or the edismax query parser?

jdancu