Let's say I have a MySQL table, people. Each record comprises a variety of properties, among these favourite_colour, country, and age_group.

What I would like to do is retrieve records from this table by their similarity to a set of specific parameters. Given "Red", "United States", and "18-25", for example, the best results would be those records that match all three. These would be 100% matches.

However, I would also like to retrieve records that match any combination of two parameters (66% match), or any one parameter (33% match). Moreover, I would like to be able to define additional points of comparison (e.g. underwear_type, marital_status, etc.).

Is there a relatively efficient solution to this problem?

A: 

For the first three, it's easy:

-- each matching column adds 33 points; keep rows that match at least one
select * from people
where
(case when favourite_colour = 'Red' then 33 else 0 end +
 case when age_group = '18-25' then 33 else 0 end +
 case when country = 'United States' then 33 else 0 end) >= 33

I don't understand the "additional points of comparison" part, can you explain?

tekBlues
This is not very usable or elegant.
Artem Russakovskii
Compared with Alex's solution, I must agree! But it works anyway.
tekBlues
That's why StackOverflow is here :) to find the best solution.
Artem Russakovskii
@tekBlues: By additional points of comparison, I meant that I might, at a future date, add other fields (like `underwear_type`) to compare against. Matches would then be 25%/50%/75%/100%.
Daniel Wright
+10  A: 
Alex Martelli
This is quite clever!
Artem Russakovskii
This is indeed a very good idea. It would be easy to add weighting to this too, by multiplying one of the results.
Jani Hartikainen
This is indeed a great solution. A couple of notes/questions: from what I can tell, SQL doesn't allow column aliases (i.e. match_score) in WHERE clauses. Also, I don't think SUM() behaves as your query would suggest (it doesn't accept multiple arguments); MySQL's documentation indicates SUM() is a GROUP BY aggregate function only. Removing the WHERE clause and replacing SUM() with addition operators made the query work like a charm.
Daniel Wright
@Hoobnium, good points: I was apparently making up my own too-sensible SQL dialect, on both the SUM and WHERE issues ;-). Thanks for double-checking!
Alex Martelli
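
For anyone landing here later: the fix described in the comments above (addition operators instead of SUM(), and no alias in WHERE) might look roughly like the sketch below. This is not Alex's original query, just one way to write it under a few assumptions: the columns are named as in the question, none of them is NULL, and the `match_score` / `match_percent` aliases are only illustrative. It relies on MySQL evaluating a comparison to 1 or 0, and on MySQL allowing a select alias in HAVING.

```sql
-- Sketch only: column names come from the question; aliases are illustrative.
-- In MySQL each comparison evaluates to 1 (true) or 0 (false), so adding
-- them counts how many criteria a row matches.
SELECT p.*,
       (favourite_colour = 'Red')
     + (country = 'United States')
     + (age_group = '18-25') AS match_score,
       -- divide by the number of criteria (3 here) to get a percentage
       ROUND(100 * ((favourite_colour = 'Red')
                  + (country = 'United States')
                  + (age_group = '18-25')) / 3) AS match_percent
FROM people AS p
-- WHERE cannot see select aliases; MySQL lets HAVING filter on them instead
HAVING match_score > 0
ORDER BY match_score DESC;
```

Adding another point of comparison means adding one more parenthesised comparison and bumping the divisor (4 fields gives 25/50/75/100). Weighting, as suggested above, would be a matter of multiplying an individual comparison, e.g. `2 * (country = 'United States')`, and adjusting the divisor to the sum of the weights. If a column can be NULL, a plain `=` comparison yields NULL and makes the whole score NULL; MySQL's NULL-safe `<=>` operator avoids that.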