views:

27

answers:

1

How would I design the tables for a matching system doing something like this:

We have X products which can have Y properties. The number of properties can vary from product to product and two products doesn't have to have any similar properties.

My number one goal is to take one product and find the most similar product.

Normally I would do something like this

create table products
(
   id serial not null primary key,
   name varchar(40)
)

create table properties
(
  id serial not null primary key,
  name varchar(40)
)

create table product_properties
(
  product_id int not null,
  property_id int not null
)

The only way I can think of is to loop through all products, fetch their properties in the loop and compare with the source product. But that doesn't seem very effective.

+2  A: 

Would something like this be any help?

SELECT *
    FROM products
    WHERE id <> <given product id>
    ORDER BY (SELECT COUNT(*)
                  FROM product_properties
                  WHERE product_properties.product_id = products.id AND
                        product_properties.property_id IN
                                (SELECT property_id
                                     FROM product_property
                                     WHERE product_id = <given product id>))
    LIMIT 1;

This is intended to get the product that has the most properties in common with your original product.

Brian Hooper
+1 - I was just about to post the same thing and then refreshed just to check if someone else had posted anything.
TheCloudlessSky
Oops; made an ups there; should have a clause to eliminate the original product from the list...
Brian Hooper
@TheCloudlessSky - perhaps you should put your answer up anyway. I think we may be a little too concerned with quick answers on this site, possibly at the expense of correctness, as the error I put in my query tends to show.
Brian Hooper
@Brian - Mine was almost exactly the same (with the WHERE id <> giveId) and the `IN` had the `product_id`'s hard-coded.
TheCloudlessSky