I am building a search function for the classifieds on my website. Here are the criteria I need to meet:

  • When searching for 'bmw 520', only matches where these two words appear in exactly this order are returned, not matches for only 'bmw' or only '520'.

  • When searching for 'bmw 330ci', results are returned as above, but both WITH and WITHOUT the 'ci' suffix. Car models have a number of such suffixes (i, ci, si, fi, etc.).

  • A minus sign should exclude all results containing the word after it, e.g. 'bmw -330' returns all 'bmw' results except those containing '330'. (A NOT operator instead of the minus sign is also fine.)

  • All accented characters like 'é' are converted to their plain equivalents, in this case 'e'.

  • A list of words can be ignored completely in the search string.

Would I need Sphinx, or should I write this myself in PHP?

What do you suggest I do?

Thanks

+1  A: 

I think that Sphinx matches all of your criteria.

Jan Hančič
+2  A: 

I think Sphinx is a pretty good match for what you want to do, but some things won't happen automatically...

  • To match on two words together exactly, you either need to use the phrase match mode, or group the words in double-quotes while using the extended match mode.

  • This is the tricky one: unless you define explicit exceptions, I don't think you can index 330ci as both '330 ci' and '330ci'.

  • As long as you're using boolean or extended match modes, then the minus sign works as you'd like.

  • 'Special' characters can be converted to standard ASCII, but this doesn't happen by default. You need to set up your charset_table value. This blog post is aimed at Thinking Sphinx (a Ruby plugin for Sphinx), but the setting value is just passed straight through to Sphinx.

  • You can only ignore specific words on a per-query basis if you've got at least one other word in the query (that is: "-foo" will fail for Sphinx, but "foo -bar" is fine). It's worth noting that you can choose to not index specific words.
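If indexing '330ci' both ways turns out to be awkward, one alternative is to handle the suffix on the query side instead. The following is only a rough sketch of that idea, not something Sphinx provides: it folds accents, drops ignored words, and turns the input into an extended-match-mode query string that ORs the phrase with a suffix-free variant. The function names, the ignored-word list, and the suffix list are all made up for illustration, and the resulting string still has to be sent to Sphinx with extended match mode enabled.

```python
import re
import unicodedata

# Hypothetical values for illustration; neither set comes from the thread.
IGNORED_WORDS = {"the", "a", "for", "sale"}
MODEL_SUFFIXES = {"i", "ci", "si", "fi"}


def fold_accents(text):
    """Map accented characters to their base letters, e.g. 'é' -> 'e'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def strip_suffix(word):
    """Return the digits of a model like '330ci', or None if no known suffix."""
    match = re.fullmatch(r"(\d+)([a-z]+)", word)
    if match and match.group(2) in MODEL_SUFFIXES:
        return match.group(1)
    return None


def build_query(user_input):
    """Build a query string for Sphinx's extended match mode (assumed)."""
    include, exclude = [], []
    for word in fold_accents(user_input).lower().split():
        negated = word.startswith("-")
        term = re.sub(r"[^a-z0-9]", "", word)  # keep only letters and digits
        if not term or term in IGNORED_WORDS:
            continue
        (exclude if negated else include).append(term)

    # A query made only of exclusions is rejected, as noted above.
    if not include:
        raise ValueError("need at least one positive term")

    # Quoted phrase keeps the words together and in order.
    phrases = ['"' + " ".join(include) + '"']
    bare = [strip_suffix(term) or term for term in include]
    if bare != include:
        phrases.append('"' + " ".join(bare) + '"')

    query = phrases[0] if len(phrases) == 1 else "(" + " | ".join(phrases) + ")"
    for term in exclude:
        query += " -" + term
    return query


print(build_query("bmw 520"))
print(build_query("BMW 330ci"))
print(build_query("bmw -330"))
```

With these assumptions, 'BMW 330ci' becomes an OR of the phrases "bmw 330ci" and "bmw 330", while 'bmw -330' keeps the exclusion as written. Folding accents in the query only helps if the index is folded the same way through charset_table.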

pat