views:

121

answers:

4

I have two tables in MySQL 5.1.38.

products
+----+------------+-------+------------+
| id | name       | price | department |
+----+------------+-------+------------+
|  1 | Fire Truck | 15.00 | Toys       |
|  2 | Bike       | 75.00 | Toys       |
|  3 | T-Shirt    | 18.00 | Clothes    |
|  4 | Skirt      | 18.00 | Clothes    |
|  5 | Pants      | 22.00 | Clothes    |
+----+------------+-------+------------+

ratings
+------------+--------+
| product_id | rating |
+------------+--------+
|          1 |      5 |
|          2 |      5 |
|          2 |      3 |
|          2 |      5 |
|          3 |      5 |
|          4 |      5 |
|          5 |      4 |
+------------+--------+

My goal is to get the total price of all products which have a 5 star rating in each department. Something like this.

+------------+-------------+
| department | total_price |
+------------+-------------+
| Clothes    | 36.00       |  /* T-Shirt and Skirt */
| Toys       | 90.00       |  /* Fire Truck and Bike */
+------------+-------------+

I would like to do this without a subquery if I can. At first I tried a join with a sum().

select department, sum(price) from products
join ratings on product_id=products.id
where rating=5 group by department;
+------------+------------+
| department | sum(price) |
+------------+------------+
| Clothes    |      36.00 |
| Toys       |     165.00 |
+------------+------------+

As you can see the price for the Toys department is incorrect because there are two 5 star ratings for the Bike and therefore counting that price twice due to the join.

I then tried adding distinct to the sum.

select department, sum(distinct price) from products
join ratings on product_id=products.id where rating=5
group by department;
+------------+---------------------+
| department | sum(distinct price) |
+------------+---------------------+
| Clothes    |               18.00 |
| Toys       |               90.00 |
+------------+---------------------+

But then the clothes department is off because two products share the same price.

Currently my work-around involves taking something unique about the product (the id) and using that to make the price unique.

select department, sum(distinct price + id * 100000) - sum(id * 100000) as total_price
from products join ratings on product_id=products.id
where rating=5 group by department;
+------------+-------------+
| department | total_price |
+------------+-------------+
| Clothes    |       36.00 |
| Toys       |       90.00 |
+------------+-------------+

But this feels like such a silly hack. Is there a better way to do this without a subquery? Thanks!

A: 

You can do two queries. First query:

SELECT DISTINCT product_id FROM ratings WHERE rating = 5;

Then, take each of those ID's and manually put them in the second query:

SELECT   department, Sum(price) AS total_price
FROM     products
WHERE    product_id In (1,2,3,4)
GROUP BY department;

This is the work-around for not being able to use subqueries. Without them, there is no way to eliminate the duplicate records caused by the join.

Erick Robertson
A: 

I can't think of any way to do it without a subquery somewhere in the query. You could perhaps use a View to mask the use of a subquery.

Barring that, your best bet is probably to find the minimum data set needed to make the calculation and do that in the front end. Whether or not that's possible depends on your specific data - how many rows, etc.

The other option (actually, maybe this is the best one...) would be to get a new ORM or do without it altogether ;)

This view would allow you to bypass the subquery:

CREATE VIEW Distinct_Product_Ratings
AS
    SELECT DISTINCT
        product_id,
        rating
    FROM
        Ratings
Tom H.
+2  A: 

Use:

  SELECT p.department,
         SUM(p.price) AS total_price
    FROM PRODUCTS p
    JOIN (SELECT DISTINCT 
                 r.product_id,
                 r.rating
            FROM RATINGS r) x ON x.product_id = p.id
                             AND x.rating = 5
GROUP BY p.department

Technically, this does not use a subquery - it uses a derived table/inline view.

Marking this as community wiki cuz some monkey keeps downvoting me though it's 100% correct.

OMG Ponies
Spank the monkey
Anax
A: 

The primary reason you are having trouble finding a solution is that the schema as presented is fundamentally flawed. You shouldn't allow a table to have two rows that are complete duplicates of each other. Every table should have a means to uniquely identify each row even if it is the combination of all columns. Now, if we change the ratings table so that it has an AUTO_INCREMENT column called Id, the problem is easier:

Select products.department, Sum(price) As total_price
From products
    Left Join ratings As R1
        On R1.product_id = products.id
            And R1.rating = 5
    Left Join ratings As R2
        On R2.product_id = R1.product_id
            And R2.rating = R1.rating
            And R2.Id > R1.Id
Where R2.Id Is Null
Group By products.department
Thomas
Actually I do have an auto-incrementing id field in the real application which is much more complex. I tried to simplify everything as far as I could here, but seems I went too far by taking out ratings.id. Thanks for posting this option!
ryanb