Hello

I want to add a new feature to the search on my website. I'm using PHP and MySQL. The MySQL database contains a table of the items that users search for, and each item has a "keyword" column holding comma-separated keywords (example: cat,dog,horse). After a user searches on my website, I want to get the words that are, let's say, 85% similar to the search keyword, so the search can be refined. For misspellings, I want a service or something similar that tells me whether the keyword is spelled correctly; if it isn't, I'd get some corrections, check whether they exist in the database, and then offer those corrections to the user as alternative search keywords.

I'm not asking for a complete solution here, but if you can point me in the right direction, that would be great.

Thanks guys

Cheers

A: 

The key is in your idea of "85% similar". Here are some ideas:

Similar Words Table

You can define a table where you list common misspellings for your keywords. You'll then have to augment how you search the database to map common misspellings to the proper value.
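As a rough illustration of the lookup step, here is a minimal PHP sketch. The `$misspellings` array and the `canonical_keyword()` helper are hypothetical names standing in for a real misspellings table in MySQL; the point is only that the search term is mapped to its proper value before the actual search runs.

```php
<?php
// Hypothetical data: in practice these pairs would be rows in a MySQL
// table of (misspelling, correct keyword). The array stands in for it.
$misspellings = [
    'hrose' => 'horse',
    'catt'  => 'cat',
    'dgo'   => 'dog',
];

/**
 * Return the canonical keyword for a search term, or the normalized term
 * itself (trimmed, lowercased) when no known misspelling matches.
 */
function canonical_keyword(string $term, array $misspellings): string
{
    $term = strtolower(trim($term));
    return $misspellings[$term] ?? $term;
}

echo canonical_keyword('Hrose', $misspellings), "\n"; // known misspelling
echo canonical_keyword('cat', $misspellings), "\n";   // already correct
```

The obvious limitation is that the table only catches misspellings someone has already listed.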

Similar Words Lookup

When you perform the search, use a library to generate similar words and search for all of them. You can use any sort of spelling library to generate possible word matches before sending the search. Or write your own based on the Edit Distance algorithm.

Only check if needed: Since you're using PHP, you may consider pspell. You can first call pspell_check to see if the word is spelled correctly. Then call pspell_suggest to get suggestions.

See this link for an example.
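For the check-then-suggest flow, a minimal sketch might look like the following. It assumes the pspell extension and an English dictionary are installed (neither is guaranteed on a given server), and `spelling_suggestions()` is a hypothetical helper name, not part of pspell.

```php
<?php
/**
 * Return [] when $word is spelled correctly, a list of suggestions when it
 * is not, or null when pspell or the dictionary is unavailable.
 */
function spelling_suggestions(string $word, string $language = 'en'): ?array
{
    if (!function_exists('pspell_new')) {
        return null; // pspell extension is not installed
    }
    $dictionary = @pspell_new($language);
    if ($dictionary === false) {
        return null; // no dictionary available for this language
    }
    if (pspell_check($dictionary, $word)) {
        return []; // spelled correctly, nothing to suggest
    }
    $suggestions = pspell_suggest($dictionary, $word);
    return $suggestions === false ? [] : $suggestions;
}

$suggestions = spelling_suggestions('hrose');
if ($suggestions === null) {
    echo "pspell is not available\n";
} elseif ($suggestions === []) {
    echo "spelled correctly\n";
} else {
    echo implode(', ', $suggestions), "\n";
}
```

Each suggestion could then be checked against the keywords in the database before it is shown to the user.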

Use a Database Feature

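MySQL, for example, has a SOUNDS LIKE operator, which is shorthand for comparing SOUNDEX() values. You can search with WHERE keyword SOUNDS LIKE 'hors' and get horse. Keep in mind that Soundex keeps the first letter, so 'kat' would not match cat, and that the comparison runs against the whole column value, so a comma-separated keyword list would have to be split up first. More info is on the documentation page, which alerts you to some limitations (it is designed for English, and results are not guaranteed to be consistent with multibyte character sets such as UTF-8).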
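One way to get a feel for which typos a Soundex-style comparison catches is to try PHP's own `soundex()` on a few pairs. This is only a sketch: it assumes PHP's `soundex()` produces the same codes as MySQL's for short English words like these, and `sounds_like()` is a hypothetical helper. Since Soundex keeps the first letter of the word, a typo in the first letter changes the code.

```php
<?php
/**
 * Compare two words by Soundex code, roughly what a phonetic
 * "sounds like" comparison does.
 */
function sounds_like(string $a, string $b): bool
{
    return soundex($a) === soundex($b);
}

foreach ([['hors', 'horse'], ['kat', 'cat']] as [$typed, $keyword]) {
    printf(
        "%s (%s) vs %s (%s): %s\n",
        $typed,
        soundex($typed),
        $keyword,
        soundex($keyword),
        sounds_like($typed, $keyword) ? 'match' : 'no match'
    );
}
```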


It sounds like a fairly common problem, so perhaps there are other more canonical solutions to this problem. Perhaps there's something specific to the language you're using (or in the database interface layer) that can abstract this for you.

The first two should allow you to meet some notion of 85% similarity. I have no idea how well the third option will work, but it "soundz kool."

Geoff
Thanks for your solutions, but I have a question: how do I first check whether the word is misspelled, before searching for a correct alternative?
From.ME.to.YOU
I added a link and suggestion under "Similar words lookup". Check out PHP's `pspell`: http://us2.php.net/manual/en/function.pspell-suggest.php
Geoff
Have you had any luck?
Geoff
A: 

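There's similar_text() in PHP, which can report how similar two strings are as a percentage, but it has to run in PHP on rows you've already fetched, i.e. after the query. You could also check out Full-Text Search in MySQL.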
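A minimal sketch of the `similar_text()` approach, applied to one comma-separated keyword value as described in the question. The `similar_keywords()` helper and the sample data are hypothetical; only `similar_text()` itself is PHP's.

```php
<?php
/**
 * Return the keywords from a comma-separated list whose similar_text()
 * percentage against $search is at least $threshold, best match first.
 */
function similar_keywords(string $search, string $csv, float $threshold = 85.0): array
{
    $matches = [];
    foreach (explode(',', $csv) as $keyword) {
        $keyword = trim($keyword);
        if ($keyword === '') {
            continue; // skip empty entries such as a trailing comma
        }
        // The third argument receives the similarity as a percentage.
        similar_text(strtolower($search), strtolower($keyword), $percent);
        if ($percent >= $threshold) {
            $matches[$keyword] = round($percent, 1);
        }
    }
    arsort($matches); // highest percentage first
    return $matches;
}

print_r(similar_keywords('hors', 'cat,dog,horse'));
```

Because this runs per row in PHP, it is probably best kept for a small candidate set rather than the whole table.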

Alec
A: 

Apache Solr is an open source search platform that provides not only full-text search but also built-in match scoring and auto-suggestion, among many other powerful features.

If your site doesn't hold much data, this option may seem like overkill, but I'd recommend at least checking it out.

The communication between your app and Solr can be handled through a standard REST interface. AFAIK there are two good Solr-specific PHP libraries available at the moment:

Setting up the server is pretty straightforward; the laborious part (and the interesting one) is tuning and optimizing Solr to best fit your needs.
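To give an idea of what talking to Solr over HTTP could look like without any client library, here is a sketch that only builds the URL for a fuzzy query against Solr's select handler. The base URL, the core name `items`, the field name `keyword`, and the `solr_fuzzy_url()` helper are all assumptions for illustration; the `term~N` fuzzy syntax comes from Solr's standard query parser.

```php
<?php
/**
 * Build a select-handler URL for a fuzzy match on one field.
 * Assumes $term is a plain alphanumeric word; special query characters
 * would need escaping before being placed in the query.
 */
function solr_fuzzy_url(
    string $baseUrl,
    string $core,
    string $field,
    string $term,
    int $maxEdits = 1
): string {
    // "field:term~1" matches terms within one edit of $term.
    $query = sprintf('%s:%s~%d', $field, $term, $maxEdits);
    return rtrim($baseUrl, '/') . '/' . rawurlencode($core) . '/select?'
        . http_build_query(['q' => $query, 'wt' => 'json']);
}

// Hypothetical local Solr instance and core.
echo solr_fuzzy_url('http://localhost:8983/solr', 'items', 'keyword', 'hors'), "\n";
```

The resulting URL can be fetched with any HTTP client and the JSON response decoded with `json_decode()`.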

nuqqsa
A: 

Try looking into the Edit Distance algorithm. Basically, for two input strings, the return value is the minimum number of single-character edits (insertions, deletions, substitutions) needed to transform one string into the other. That gives you some idea of how close two strings are.

Edit Distance
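PHP ships an edit distance function, `levenshtein()`, so a rough similarity percentage can be sketched as below. The `edit_similarity()` helper and the formula (one minus distance divided by the longer length) are one possible choice, not a standard definition of "85% similar".

```php
<?php
/**
 * Similarity between two strings as a percentage, derived from the
 * Levenshtein edit distance. levenshtein() works on bytes, so
 * multibyte (e.g. UTF-8) text may give misleading results.
 */
function edit_similarity(string $a, string $b): float
{
    $longest = max(strlen($a), strlen($b));
    if ($longest === 0) {
        return 100.0; // two empty strings are identical
    }
    return (1 - levenshtein($a, $b) / $longest) * 100;
}

printf("kitten vs sitting: %d edits\n", levenshtein('kitten', 'sitting'));
printf("hors vs horse: %.1f%%\n", edit_similarity('hors', 'horse'));
printf("kat vs cat: %.1f%%\n", edit_similarity('kat', 'cat'));
```

With this formula a single typo in a short word already falls below 85%, so the threshold may need to depend on word length.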

Babar