tags:

views:

197

answers:

3

Let's imagine that we are having "cars" table with such a simple structure...

car_id INT
color ENUM('black','white','blue')
weight ENUM('light','medium','heavy')
type ENUM('van','sedan','limo')

Fist, I'm selecting car (1, black, heavy, limo), then I'd like to get list of related cars sorted by number of matching columns (without any column weight). So, first I'm expecting to see (black, heavy, limo) cars, then I'm expecting to see cars with only 2 matching fields etc.

Is it possible to execute this kind of sorting using SQL?

Sorry for my English, but I really hope that my question is clear for you.

Thank you.

+2  A: 

Not overly efficient, but...

SELECT
  exact.car_id   AS e_car_id, exact.color   AS e_color, 
  exact.weight   AS e_weight, exact.type    AS e_type,
  related.car_id AS r_car_id, related.color AS r_color, 
  related.weight AS r_weight, related.type  AS r_type,
  CASE WHEN related.color = exact.color      THEN 1 ELSE 0 END
   + CASE WHEN related.weight = exact.weight THEN 1 ELSE 0 END
   + CASE WHEN related.type = exact.type     THEN 1 ELSE 0 END
  AS rank
FROM
  cars AS exact
  INNER JOIN cars AS related ON (
    related.car_id <> exact.car_id 
    AND CASE WHEN related.color = exact.color   THEN 1 ELSE 0 END
      + CASE WHEN related.weight = exact.weight THEN 1 ELSE 0 END
      + CASE WHEN related.type = exact.type     THEN 1 ELSE 0 END
    >= 1
  )
WHERE
  exact.car_id = 1 /* black, heavy, limo */
ORDER BY
  rank DESC

This would not run very fast on large data sets, since neither the JOIN nor the ORDER BY can make use of an index. Very probably a more optimal version exists.

Output on my test setup looks like this:

e_car_id  e_color  e_weight  e_type  r_car_id  r_color  r_weight  r_type  rank
1         black    heavy     limo    7         black    heavy     limo    3
1         black    heavy     limo    2         black    light     limo    2
1         black    heavy     limo    3         black    heavy     van     2
1         black    heavy     limo    4         black    medium    van     1
1         black    heavy     limo    5         blue     light     limo    1
Tomalak
Absolutely great, please gimme some time to understand this construction.
Kirzilla
+1  A: 

There are probably a few ways to optimise the sub-queries, but without using case statements or sub-optimal join clauses:

select
        *
    from
        (
            select
                    selection.CarId,
                    selection.Colour,
                    selection.Weight,
                    selection.Type,
                    3 as Relevance
                from
                    tblCars as selection
                where
                    selection.Colour = 'black' and selection.Weight = 'light' and selection.Type = 'van'
            union all
            select
                    cars.CarId,
                    cars.Colour,
                    cars.Weight,
                    cars.Type,
                    count(*) as Relevance
                from
                    tblCars as cars
                inner join
                    (
                        select
                                byColour.CarId
                            from
                                tblCars as cars
                            inner join
                                tblCars as byColour
                            on
                                cars.Colour = byColour.Colour
                            where
                                cars.Colour = 'black' and cars.Weight = 'light' and cars.Type = 'van'
                                and
                                byColour.CarId <> cars.CarId
                        union all
                        select
                                byWeight.CarId
                            from
                                tblCars as cars
                            inner join
                                tblCars as byWeight
                            on
                                cars.Weight = byWeight.Weight
                            where
                                cars.Colour = 'black' and cars.Weight = 'light' and cars.Type = 'van'
                                and
                                byWeight.CarId <> cars.CarId
                        union all
                        select
                                byType.CarId
                            from
                                tblCars as cars
                            inner join
                                tblCars as byType
                            on
                                cars.Type = byType.Type
                            where
                                cars.Colour = 'black' and cars.Weight = 'light' and cars.Type = 'van'
                                and
                                byType.CarId <> cars.CarId
                    ) as matches
                on
                    cars.CarId = matches.CarId
                group by
                    cars.CarId,
                    cars.Colour,
                    cars.Weight,
                    cars.Type
        ) as results
    order by
        Relevance desc

Output:

CarId   Colour  Weight  Type    Relevance
1       black   light   van     3
3       white   light   van     2
4       blue    light   van     2
5       black   medium  van     2
6       white   medium  van     1
7       blue    medium  van     1
8       black   heavy   limo    1
Joe Lloyd
Absolutely great, please gimme some time to understand this construction.
Kirzilla
Basically you get a list of cars matching by colour, a list of cars matching by weight, and a list of cars matching by type. Union these three lists and group by colour, weight and type. The count of records in each group gives you your relevance, as the more a car appears in the three lists, the more matching properties it will have. The rest is just for formatting purposes.
Joe Lloyd
Thank you for full explanation!
Kirzilla
A: 
mysql> select * from cars;
+--------+-------+--------+-------+
| car_id | color | weight | type  |
+--------+-------+--------+-------+
|      1 | black | light  | van   |
|      2 | black | light  | sedan |
|      3 | black | light  | limo  |
|      4 | black | medium | van   |
|      5 | black | medium | sedan |
|      6 | black | medium | limo  |
|      7 | black | heavy  | van   |
|      8 | black | heavy  | sedan |
|      9 | black | heavy  | limo  |
|     10 | white | light  | van   |
|     11 | white | light  | sedan |
|     12 | white | light  | limo  |
|     13 | white | medium | van   |
|     14 | white | medium | sedan |
|     15 | white | medium | limo  |
|     16 | white | heavy  | van   |
|     17 | white | heavy  | sedan |
|     18 | white | heavy  | limo  |
|     19 | blue  | light  | van   |
|     20 | blue  | light  | sedan |
|     21 | blue  | light  | limo  |
|     22 | blue  | medium | van   |
|     23 | blue  | medium | sedan |
|     24 | blue  | medium | limo  |
|     25 | blue  | heavy  | van   |
|     26 | blue  | heavy  | sedan |
|     27 | blue  | heavy  | limo  |
+--------+-------+--------+-------+
27 rows in set (0.00 sec)

select *,
(case
  when color = 'black' and weight = 'heavy' and type = 'limo'
    then 3
  when ( color = 'black' and type = 'limo') or 
      (color = 'black' and weight = 'heavy') or 
       (weight = 'heavy' and type = 'limo')
    then 2
  else 1
end) sort_order
from cars
where color = 'black' or weight = 'heavy' or type = 'limo'
order by sort_order desc;


+--------+-------+--------+-------+------------+
| car_id | color | weight | type  | sort_order |
+--------+-------+--------+-------+------------+
|      9 | black | heavy  | limo  |          3 |
|     27 | blue  | heavy  | limo  |          2 |
|     18 | white | heavy  | limo  |          2 |
|      8 | black | heavy  | sedan |          2 |
|      7 | black | heavy  | van   |          2 |
|      6 | black | medium | limo  |          2 |
|      3 | black | light  | limo  |          2 |
|     24 | blue  | medium | limo  |          1 |
|     25 | blue  | heavy  | van   |          1 |
|     21 | blue  | light  | limo  |          1 |
|     26 | blue  | heavy  | sedan |          1 |
|     17 | white | heavy  | sedan |          1 |
|     16 | white | heavy  | van   |          1 |
|     15 | white | medium | limo  |          1 |
|     12 | white | light  | limo  |          1 |
|      5 | black | medium | sedan |          1 |
|      4 | black | medium | van   |          1 |
|      2 | black | light  | sedan |          1 |
|      1 | black | light  | van   |          1 |
+--------+-------+--------+-------+------------+
19 rows in set (0.00 sec)
Ian