tags:

views:

44

answers:

1
SELECT game_ratingstblx245v.game_id,avg( game_ratingstblx245v.rating ) 
         as avg_rating,
       count(DISTINCT game_ratingstblx245v.userid) 
          as count,
       game_data.name,
       game_data.id ,
       avg(game_ratings.critic_rating),count(DISTINCT game_ratings.critic) 
         as cr_count
FROM game_data 
LEFT JOIN game_ratingstblx245v ON game_ratingstblx245v.game_id = game_data.id 
LEFT JOIN game_ratings         ON game_ratings.game_id = game_data.id 
WHERE game_data.release_date < NOW() 
GROUP BY game_ratingstblx245v.game_id 
ORDER BY game_data.release_date DESC,
         game_data.name


I am currenty using this query to extract values from 3 tables
game_data - id(foreign key), name, release_date \games info
game_ratings - game_id(foreign key),critic , rating \critic rating
game_ratingstblx245v - game_id(foreign key), rating, userid \user rating


What I want to do with this query is select all id's from table game_data order by release_date descending, then check the avg rating from table game_ratings and game_ratingsblx245v corresponding to individual id's(if games have not been rated the result should return null from fields of the latter two tables)..Now the problem I am facing here is the result is not coming out as expected(some games which have not been rated are showing up while others are not), can you guys check my query and tell me where am i wrong if so...Thanks

+2  A: 

You shouldn't use the game_ratingstblx245v.game_id column in your GROUP BY, since it could be NULL when there are no ratings for a given game id. Use game_data.id instead.

Here's how I would write the query:

SELECT g.id, g.name,
       AVG( x.rating ) AS avg_user_rating,
       COUNT( DISTINCT x.userid ) AS user_count,
       AVG( r.critic_rating ) AS avg_critic_rating, 
       COUNT( DISTINCT r.critic ) AS critic_count
FROM game_data g
LEFT JOIN game_ratingstblx245v x ON (x.game_id = g.id)
LEFT JOIN game_ratings r         ON (r.game_id = g.id)
WHERE g.release_date < NOW() 
GROUP BY g.id 
ORDER BY g.release_date DESC, g.name;

Note that although this query produces a Cartesian product between x and r, it doesn't affect the calculation of the average ratings. Just be aware in the future that if you were doing SUM() or COUNT(), the calculations could be exaggerated by an unintended Cartesian product.

Bill Karwin
Thanks Bill for the extra info you're always the first one to help me out....On a side note I don't understand what do you mean by cartesian product?
halocursed
A Cartesian product is one where every row in one table is combined with every row in the second table. Say you have 10 matching rows in the first table, and 12 matching rows in the second table. If no condition restricts the matching conditions between these two, you'll end up with 120 rows in the result. What this means in a query like yours is that rows from each ratings table get repeated unintentionally. This would be a problem if you were doing COUNT() or SUM().
Bill Karwin
To see the Cartesian product in action, try your query without the GROUP BY, and show the `rating` columns instead of calculating the AVG(). You'll see a lot of repeated rows.
Bill Karwin
Yes you are right i was making this mistake earlier and the return values for count were giving me weird values which were much higher than real values but I fixed it later on....
halocursed
Thanks a lot Bill....
halocursed