tags:

views:

49

answers:

3

What's the best way of generating super relevant results from a MySQL table? I have a list of sites in a table with a title, description and tags. How can I set up a SELECT query/search function in PHP that will find user typed keywords in all three of those columns, and whilst remaining as quick as possible keep the following in mind:

  • Matches in the title would be the most relevant, but obviously matches in title as well as description/tags would be even more relevant
  • Matches in the tags would be the second most relevant
  • Keywords can be matched in any order
  • Not a requirement but if all keywords aren't matched it could match any of them (i'm not sure if this would make the call too slow)

Are there any specific changes I would need to make to my table?

A: 

There is no simple answer. "and whilst remaining as quick as possible" is not a constraint. You have to: 1) set the real goal 2) try to achieve it yourself

It might happen that the simplest query does the work well and you don't have to do anything else.

The order is ALWAYS write working code, try it and if it doesn't run well optimise it. You are doing this the other way round and thus wasting your time.

kubal5003
A: 

The simplest method would be to use a full-text index on all of those columns.

However, fulltext indexes in mysql will not give you fine-grained control over the weighting of different columns. For that you need something like lucene or sphinx.

ʞɔıu
A: 

The natural approach IMHO will be to let PHP move through the recordset, so you can apply the logic record by record.

You can expect you'll want to refine your logic in the future. Handling your needs through code will let you do so easily.

Can't say about your table structure, as you haven't provided much detail.

Smandoli
"The natural approach IMHO will be to let PHP move through the recordset, so you can apply the logic record by record."This is definitely not a good approach. You lose query caching, you transfer all the data from db to php for every page refresh/whatever that triggers the query.
kubal5003
Certainly you lose your caching. For some purposes this is okay. Usually it's not. A common solution is to make a "results" table. This table has to include an extra field for session id, so each end-user's results are separate. It's unwieldy, but such is the nature HTTP. The benefits in terms of using an algorithm are big, making this a pretty common approach.
Smandoli