I have implemented full-text search over a SQL Server 2005 database using the CONTAINSTABLE keyword. Is there a way to add "sounds like" or Google-style "did you mean THAT?" functionality when the original query yields no results?
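SQL Server has the functions SOUNDEX and DIFFERENCE. SOUNDEX returns a four-character phonetic code for a string, and DIFFERENCE compares the SOUNDEX codes of two strings and returns a value from 0 (little or no similarity) to 4 (strongest similarity).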
This related SO answer might be useful: How to make a sql search query more powerful?
The SOUNDEX in SQL Server is very limited and frustrating; I really recommend taking a look at Lucene.net: http://incubator.apache.org/lucene.net/. Lucene is a high-performance, full-featured text search engine library, and it is very easy to use in .NET projects. If you need a serious search engine for your app, go with Lucene.
Some features retrieved from http://lucene.apache.org/java/docs/features.html:
- ranked searching: best results returned first
- many powerful query types: phrase queries, wildcard queries, proximity queries, range queries and more
- fielded searching (e.g., title, author, contents)
- date-range searching
- sorting by any field
- multiple-index searching with merged results
- allows simultaneous update and searching
If you want to be able to do this, you need to normalize both the raw text and the queries. As a simple example, if you want to search on a SOUNDEX-type value, you'll need to SOUNDEX both the query string and the original raw data that you're querying. You can't efficiently normalize the whole search space on the fly at query time, so instead you normalize it once, during the creation of the index.
Technically, you only need to normalize the actual index, not the data, but since your data likely IS your index, it will need to be normalized.
This is the same process as "stemming" of words, removing plurals, etc.
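To make the "normalize at index time" idea concrete, here is a minimal sketch in Python, not SQL Server code. It assumes the common American Soundex rules, which may differ from SQL Server's SOUNDEX in edge cases, and it uses an in-memory dictionary as a stand-in for a real index. The names `soundex`, `build_index` and `search` are hypothetical, chosen only for this illustration.

```python
from collections import defaultdict

# Consonant groups of the common American Soundex variant (an assumption;
# SQL Server's SOUNDEX may treat some edge cases differently).
_CODES = {}
for digit, letters in enumerate(("bfpv", "cgjkqsxz", "dt", "l", "mn", "r"), start=1):
    for ch in letters:
        _CODES[ch] = str(digit)


def soundex(word):
    """Return a 4-character Soundex code, or "" if the word has no ASCII letters."""
    letters = [c for c in word.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters that share a code
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        prev = digit  # a vowel resets prev, so a repeated code is kept after it
    return (code + "000")[:4]


def build_index(documents):
    """Normalize every token once, at index-creation time (hypothetical helper)."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in text.split():
            key = soundex(token)
            if key:
                index[key].add(doc_id)
    return index


def search(index, query):
    """Normalize the query the same way, then look up each token in the index."""
    hits = None
    for token in query.split():
        ids = index.get(soundex(token), set())
        hits = ids if hits is None else hits & ids
    return sorted(hits or [])


# Illustrative data only, not taken from the question.
docs = {1: "Robert Smith", 2: "Rupert Smyth", 3: "Alice Jones"}
index = build_index(docs)
print(search(index, "Robbert Smithe"))  # [1, 2]
print(search(index, "Jonas"))           # [3]
```

The misspelled query still matches because both the stored text and the query are reduced to the same codes before comparison. A fallback of this kind could be run only when the exact full-text query returns nothing.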