I have two tables in my database, called ratings and movies.

Ratings:

| id | movie_id | rating |

Movies:

| id | title |

A typical movie record might be like this:

| 4 | Cloverfield (2008) |

and there may be several rating records for Cloverfield, like this:

| 21 | 4 | 3 | (rating number 21, on movie number 4, giving it a rating of 3)

| 22 | 4 | 2 | (rating number 22, on movie number 4, giving it a rating of 2)

| 23 | 4 | 5 | (rating number 23, on movie number 4, giving it a rating of 5)

The question:

How do I create a JOIN query for only selecting the rows in the movie table that have more than x number of ratings in the ratings table? For example, in the above example if Cloverfield only had one rating in the ratings table and x was 2, it would not be selected.

Thanks for any help or advice!

+3  A: 

You'll probably want to use MySQL's HAVING clause:

http://www.severnsolutions.co.uk/twblog/archive/2004/10/03/havingmysql

theraccoonbear
+6  A: 

Use the HAVING clause. Something along these lines:

SELECT movies.id, movies.title, COUNT(ratings.id) AS num_ratings 
  FROM movies 
  LEFT JOIN ratings ON ratings.movie_id=movies.id 
  GROUP BY movies.id 
  HAVING num_ratings > 5;
ceejayoz
Wouldn't the COUNT(ratings.id) select and count ALL of the rows in the ratings table? I just want the rows where rating.movie_id is the same as movie.id.
rmh
The count is restricted by the GROUP BY clause.
Ken
Ken is correct - the GROUP BY means it only counts within its group instead of across the whole table.
ceejayoz
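A small sketch of the point made in these comments, in case it helps to see it run. It uses Python's built-in sqlite3 as a stand-in for MySQL, and the second movie and its rating are made-up sample data (assumptions, not from the question). The first query shows that COUNT is computed per group; the second adds HAVING with x = 2.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies  (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE ratings (id INTEGER PRIMARY KEY, movie_id INTEGER, rating INTEGER);
    INSERT INTO movies  VALUES (4, 'Cloverfield (2008)'), (5, 'Example Movie');
    INSERT INTO ratings VALUES (21, 4, 3), (22, 4, 2), (23, 4, 5), (24, 5, 4);
""")

# One output row per movie: COUNT only sees the ratings joined to that movie.
per_movie = conn.execute("""
    SELECT movies.id, movies.title, COUNT(ratings.id) AS num_ratings
      FROM movies
      LEFT JOIN ratings ON ratings.movie_id = movies.id
     GROUP BY movies.id, movies.title
     ORDER BY movies.id
""").fetchall()

# HAVING then keeps or drops whole groups based on their count (here x = 2).
filtered = conn.execute("""
    SELECT movies.id, movies.title, COUNT(ratings.id) AS num_ratings
      FROM movies
      LEFT JOIN ratings ON ratings.movie_id = movies.id
     GROUP BY movies.id, movies.title
    HAVING COUNT(ratings.id) > 2
""").fetchall()

print(per_movie)
print(filtered)
```

With this data, Cloverfield gets a count of 3 and the made-up second movie a count of 1, so only Cloverfield survives the HAVING filter.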
Ahh, I see. Okay, good to know.
rmh
A: 

The above solutions are okay for the scenario you mentioned. My suggestion may be overkill for what you have in mind, but may be handy for other situations:

  1. Subquery only those from the ratings table having more than the number you need (again using the GROUP BY ... HAVING clause):

    select movie_id from ratings group by movie_id having count(*) > x

  2. Join that subquery with the movies table

    select movies.id from movies join (select movie_id from ratings group by movie_id having count(*) > x) as MoviesWRatings on movies.id = MoviesWRatings.movie_id

When you're doing more stuff to the subquery, this might be helpful. (Not sure if the syntax is right for MySQL, please fix if necessary.)
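The two steps above can be sketched as one runnable query. This is only an illustration under assumptions: Python's sqlite3 stands in for MySQL, the helper name `movies_with_more_than` and the second movie are hypothetical, and x is passed as a bound parameter.

```python
import sqlite3

def movies_with_more_than(conn, x):
    """Return (id, title) for movies that have more than x ratings."""
    sql = """
        SELECT movies.id, movies.title
          FROM movies
          JOIN (SELECT movie_id
                  FROM ratings
                 GROUP BY movie_id
                HAVING COUNT(*) > ?) AS MoviesWRatings
            ON movies.id = MoviesWRatings.movie_id
         ORDER BY movies.id
    """
    return conn.execute(sql, (x,)).fetchall()

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies  (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE ratings (id INTEGER PRIMARY KEY, movie_id INTEGER, rating INTEGER);
    INSERT INTO movies  VALUES (4, 'Cloverfield (2008)'), (5, 'Example Movie');
    INSERT INTO ratings VALUES (21, 4, 3), (22, 4, 2), (23, 4, 5), (24, 5, 4);
""")

print(movies_with_more_than(conn, 2))
```

Because the subquery already drops movies with too few ratings, the outer join needs no further filtering.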

IronGoofy
A: 
SELECT * FROM movies 
INNER JOIN
(SELECT movie_id, COUNT(*) as num_ratings from ratings GROUP BY movie_id) as movie_counts
ON movies.id = movie_counts.movie_id
WHERE num_ratings > 3;

That will only get you the movies with more than 3 ratings; getting the individual ratings along with them will take another join. The advantage of a subquery over HAVING is that you can aggregate the ratings at the same time, such as (SELECT movie_id, COUNT(*), AVG(rating) as average_movie_rating ...)

Edit: Oops, you can aggregate with the HAVING method too. :)
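A minimal sketch of that last point, computing COUNT and AVG in the same grouped query and filtering with HAVING. Python's sqlite3 is used as a stand-in for MySQL, and the rating for movie 5 is made-up sample data.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ratings (id INTEGER PRIMARY KEY, movie_id INTEGER, rating INTEGER);
    INSERT INTO ratings VALUES (21, 4, 3), (22, 4, 2), (23, 4, 5), (24, 5, 4);
""")

# Both aggregates are computed per movie_id; HAVING filters on the count.
stats = conn.execute("""
    SELECT movie_id,
           COUNT(*)    AS num_ratings,
           AVG(rating) AS average_movie_rating
      FROM ratings
     GROUP BY movie_id
    HAVING COUNT(*) > 2
""").fetchall()

for movie_id, num_ratings, average_movie_rating in stats:
    print(movie_id, num_ratings, round(average_movie_rating, 2))
```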

Jeff Mc
That's exactly what I need, thank you!
rmh
My understanding is that WHERE won't work on aggregate functions - that that's the entire reason for the HAVING clause. This code should generate a MySQL error.
ceejayoz
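One way to check the point raised in this comment is to run both forms. The sketch below uses Python's sqlite3 as a stand-in, so MySQL's exact error text will differ, and the second movie is made-up sample data. In the answer's query, num_ratings is an ordinary column of the derived table by the time the outer WHERE sees it, which is a different situation from calling COUNT(*) inside WHERE directly.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies  (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE ratings (id INTEGER PRIMARY KEY, movie_id INTEGER, rating INTEGER);
    INSERT INTO movies  VALUES (4, 'Cloverfield (2008)'), (5, 'Example Movie');
    INSERT INTO ratings VALUES (21, 4, 3), (22, 4, 2), (23, 4, 5), (24, 5, 4);
""")

# The outer WHERE filters on a column of the derived table, not on an aggregate call.
derived_ok = conn.execute("""
    SELECT movies.id, movie_counts.num_ratings
      FROM movies
     INNER JOIN (SELECT movie_id, COUNT(*) AS num_ratings
                   FROM ratings
                  GROUP BY movie_id) AS movie_counts
        ON movies.id = movie_counts.movie_id
     WHERE movie_counts.num_ratings > 2
""").fetchall()

# Calling the aggregate inside WHERE itself is what gets rejected.
try:
    conn.execute(
        "SELECT movie_id FROM ratings WHERE COUNT(*) > 2 GROUP BY movie_id"
    )
    aggregate_in_where_failed = False
except sqlite3.Error:
    aggregate_in_where_failed = True

print(derived_ok, aggregate_in_where_failed)
```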
+2  A: 

The JOIN method is somewhat stilted and confusing because that's not exactly what it was intended to do. The most direct (and in my opinion, easily human-parseable) method uses EXISTS:

SELECT whatever
  FROM movies m
 WHERE EXISTS( SELECT COUNT(*) 
                 FROM ratings
                WHERE movie_id  = m.id
               HAVING COUNT(*)  > x )

Read it out loud: SELECT something FROM movies WHERE there EXIST rows in ratings where the movie_id matches and there are more than x rows.
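A runnable sketch of the EXISTS approach, under a few assumptions: Python's sqlite3 stands in for MySQL, the table is the question's ratings table, and the second movie is made-up sample data. The subquery here adds a GROUP BY on movie_id, which is a small variation so that it also runs on older SQLite versions that reject HAVING without GROUP BY.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies  (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE ratings (id INTEGER PRIMARY KEY, movie_id INTEGER, rating INTEGER);
    INSERT INTO movies  VALUES (4, 'Cloverfield (2008)'), (5, 'Example Movie');
    INSERT INTO ratings VALUES (21, 4, 3), (22, 4, 2), (23, 4, 5), (24, 5, 4);
""")

# The correlated subquery yields a row only when this movie has more than x ratings.
sql = """
    SELECT m.id, m.title
      FROM movies m
     WHERE EXISTS (SELECT 1
                     FROM ratings r
                    WHERE r.movie_id = m.id
                    GROUP BY r.movie_id
                   HAVING COUNT(*) > ?)
     ORDER BY m.id
"""

x = 2
popular = conn.execute(sql, (x,)).fetchall()
print(popular)
```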

Matt Rogish