A new project with some interesting requirements has arrived on my desk. I need to develop a searchable directory of businesses, with a focus on delivering relevant results based on arbitrary search queries. The businesses can be of any niche; there's no one area that is more represented than another.

When googling for things like "search algorithm" or "content relevance algorithm," all I get are references to Google's "Mystical Algorithm of the Old Gods" and SEO firms.

Does the relevance value returned by MySQL's full-text MATCH() ... AGAINST() have what it takes for the task? I've never used it, but I'm definitely going to do some testing. Also, since this will largely be a human-edited directory, I assume we can add weighted factors such as tags and categories. What would be a good way to combine these factors with the MATCH() relevance score?

I'm also open to ideas that I've not discussed here.

+1  A: 

If you have hand-edited data, have a look at Oracle Text. We got good results with it on one of my previous projects.

I was not directly involved in the database setup, but I know the results were very welcome. (Before this, they had only keyword-based search.)

Nivas
Thanks for this! I have a requirement to use MySQL, however.
Stephen
+1  A: 

For examples of information-retrieval-based techniques, look up TF-IDF or BM25.
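To give a feel for what BM25 does, here is a minimal sketch in Python. It is not a tuned implementation: the whitespace tokenizer, the function names, the sample listings, and the parameter values (k1=1.5, b=0.75, which are common defaults) are all assumptions made for illustration. Each query term contributes more when it is rare across the collection (the IDF part) and when it appears often in a short document (the length-normalized term frequency part).

```python
import math
from collections import Counter


def tokenize(text):
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one BM25 score per document for the given query."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    if n_docs == 0:
        return []
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    query_terms = tokenize(query)
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Rare terms get a larger IDF weight.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates with k1 and is normalized by length with b.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical directory listings, for illustration only.
listings = [
    "family bakery fresh bread and cakes",
    "plumbing repair and emergency plumber",
    "bakery supplies wholesale flour",
]
scores = bm25_scores("bakery bread", listings)
ranked = sorted(range(len(listings)), key=lambda i: scores[i], reverse=True)
```

Here `ranked` holds the listing indexes ordered from best to worst match. In a real directory you would probably also want stemming and stop-word handling, which this sketch leaves out.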

For machine-learning-based techniques, look up RankNet and its variants from Microsoft Research (MSR).

Amit Prakash
This is exactly what I was looking for. Thanks!
Stephen
A: 

Use a search engine like Solr to index the data. You can still use MySQL to hold the data, but run the searches through the search engine.

Jon Snyder