Hello,

I have a bunch of Users, each of whom has many Posts. Schema:

Users: id
Posts: user_id, rating

How do I find all Users who have at least one post with a rating above, say, 10?

I'm not sure if I should use a subquery for this, or if there's an easier way.

Thanks!

+1  A: 

Use an inner join:

SELECT * FROM users INNER JOIN posts p ON users.id = p.user_id WHERE p.rating > 10;
Jon Weers
That will return duplicates if users have more than one post with a rating over 10. Add `DISTINCT`, or see my query for an alternative.
OMG Ponies
If you just want a list of users, use the above but replace * with distinct users.id
Kendrick
+8  A: 

To find all users with at least one post with a rating above 10, use:

SELECT u.*
  FROM USERS u
 WHERE EXISTS(SELECT NULL
                FROM POSTS p
               WHERE p.user_id = u.id
                 AND p.rating > 10)

EXISTS doesn't care about the SELECT list within it. You could replace NULL with 1/0, which should result in a division-by-zero error, but it won't, because EXISTS is only concerned with the filtering in the WHERE clause.

The correlation (the WHERE p.user_id = u.id) is why this is called a correlated subquery. Because of it, a row from the USERS table is returned only when some post matches that user's id and also passes the rating comparison.

EXISTS is also faster, depending on the situation, because it returns true as soon as the criteria are met, so duplicates don't matter.
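
To make the behaviour concrete, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows and the helper name users_with_high_post are made up for illustration, and SQLite is only a stand-in for whatever database you are actually using. It runs the EXISTS query with two different SELECT lists inside the subquery to show that the list has no effect on which users come back.

```python
import sqlite3

# Hypothetical sample data: user 1 has two posts rated above 10,
# user 2 has none, user 3 has exactly one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE posts (user_id INTEGER, rating INTEGER);
    INSERT INTO users (id) VALUES (1), (2), (3);
    INSERT INTO posts (user_id, rating) VALUES (1, 15), (1, 20), (2, 5), (3, 11);
""")

def users_with_high_post(select_list):
    # select_list is spliced into the subquery only to compare spellings
    # such as SELECT NULL and SELECT 1; it is never user input here.
    sql = f"""
        SELECT u.id
          FROM users u
         WHERE EXISTS (SELECT {select_list}
                         FROM posts p
                        WHERE p.user_id = u.id
                          AND p.rating > 10)
         ORDER BY u.id
    """
    return [row[0] for row in conn.execute(sql)]

with_null = users_with_high_post("NULL")
with_one = users_with_high_post("1")
print(with_null)  # [1, 3]
print(with_one)   # [1, 3]
```

User 1 appears once even though two of their posts qualify, and user 2 is filtered out by the correlated WHERE clause.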

OMG Ponies
Can you explain this to me. The inner query is going to get any number of NULL rows. Then, EXISTS boils it down to true or false, so won't it just get all users, or none?
Jasie
@Jasie: EXISTS doesn't care about the SELECT list within it. You could replace NULL with 1/0, which should result in a division-by-zero error, but it won't, because EXISTS is only concerned with the filtering in the WHERE clause. The correlation (the `WHERE p.user_id = u.id`) is why this is called a correlated subquery. Because of it, a row from the USERS table is returned only when some post matches that user's id and also passes the rating comparison.
OMG Ponies
@Jasie: EXISTS is also faster, depending on the situation, because it returns true as soon as the criteria are met, so duplicates don't matter.
OMG Ponies
Oh, very cool. Thanks for the explanation!
Jasie
+2  A: 

You can join the tables to find the relevant users, and use DISTINCT so each user is in the result set at most once even if they have multiple posts with rating > 10:

select distinct u.id,u.username
from users u inner join posts p on u.id = p.user_id 
where p.rating > 10
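
A quick way to see the difference DISTINCT makes is the sketch below, which uses Python's built-in sqlite3 module. The sample rows and usernames are invented, and the username column is an assumption taken from the query above, since the question's schema only lists id.

```python
import sqlite3

# Hypothetical sample data: user 1 has two qualifying posts, user 2 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE posts (user_id INTEGER, rating INTEGER);
    INSERT INTO users (id, username) VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO posts (user_id, rating) VALUES (1, 15), (1, 20), (2, 5);
""")

# Same join both times; only the select list changes.
join_sql = """
    SELECT {select_list}
      FROM users u
     INNER JOIN posts p ON u.id = p.user_id
     WHERE p.rating > 10
"""

plain_rows = conn.execute(
    join_sql.format(select_list="u.id, u.username")).fetchall()
distinct_rows = conn.execute(
    join_sql.format(select_list="DISTINCT u.id, u.username")).fetchall()

print(plain_rows)     # [(1, 'alice'), (1, 'alice')]
print(distinct_rows)  # [(1, 'alice')]
```

Without DISTINCT the join yields one row per qualifying post, so alice shows up twice.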
Ike Walker
A: 
SELECT MAX(p.rating), u.id
  FROM users u
 INNER JOIN posts p ON u.id = p.user_id
 WHERE p.rating > 10
 GROUP BY u.id;

Additionally, this will tell you what their highest rating is.
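
As a rough check of what the grouped query returns, here is a small sketch with Python's built-in sqlite3 module. The sample rows are invented for illustration, and an ORDER BY is added only so the output order is predictable.

```python
import sqlite3

# Hypothetical sample data: user 1 has posts rated 15 and 20,
# user 2 only has a post rated 5, user 3 has one post rated 11.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE posts (user_id INTEGER, rating INTEGER);
    INSERT INTO users (id) VALUES (1), (2), (3);
    INSERT INTO posts (user_id, rating) VALUES (1, 15), (1, 20), (2, 5), (3, 11);
""")

# One row per user with at least one post rated above 10,
# along with that user's highest rating.
top_ratings = conn.execute("""
    SELECT u.id, MAX(p.rating)
      FROM users u
     INNER JOIN posts p ON u.id = p.user_id
     WHERE p.rating > 10
     GROUP BY u.id
     ORDER BY u.id
""").fetchall()

print(top_ratings)  # [(1, 20), (3, 11)]
```

The GROUP BY collapses each user's qualifying posts into a single row, so no DISTINCT is needed.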

Zak
Sorry - reformatted to stop scrolling. Easier to read, easier to vote on.
OMG Ponies
Thank you, looks better that way.
Zak
A: 

The correct answer to your question as stated is OMG Ponies's answer: WHERE EXISTS is more descriptive and almost always faster. But "SELECT NULL" looks really ugly and counterintuitive to me. I've seen SELECT * or SELECT 1 recommended as a best practice for this.

Another way, in case we're collecting answers:

SELECT u.id 
FROM users u 
     JOIN posts p on u.id = p.user_id
WHERE p.rating > 10
GROUP BY u.id
HAVING COUNT(*) >= 1

This could be useful if the number of matching posts you're testing for isn't always 1.
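
Here is a sketch of how the threshold could be made a parameter, using Python's built-in sqlite3 module. The sample rows and the function name users_with_min_posts are hypothetical. The comparison is written as >= so that a threshold of 1 means "at least one qualifying post", which is what the question asks for.

```python
import sqlite3

# Hypothetical sample data: user 1 has two posts rated above 10,
# user 2 has none, user 3 has exactly one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE posts (user_id INTEGER, rating INTEGER);
    INSERT INTO users (id) VALUES (1), (2), (3);
    INSERT INTO posts (user_id, rating) VALUES (1, 15), (1, 20), (2, 5), (3, 11);
""")

def users_with_min_posts(conn, min_posts):
    # Returns ids of users with at least min_posts posts rated above 10.
    sql = """
        SELECT u.id
          FROM users u
          JOIN posts p ON u.id = p.user_id
         WHERE p.rating > 10
         GROUP BY u.id
        HAVING COUNT(*) >= ?
         ORDER BY u.id
    """
    return [row[0] for row in conn.execute(sql, (min_posts,))]

at_least_one = users_with_min_posts(conn, 1)
at_least_two = users_with_min_posts(conn, 2)
print(at_least_one)  # [1, 3]
print(at_least_two)  # [1]
```

With a threshold of 1 the HAVING clause is redundant, because the join and WHERE already guarantee one match; it only starts to matter once the threshold is 2 or more.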

orbfish
A: 
select distinct u.id
from users u, posts p
where u.id = p.user_id and p.rating > 10
xagyg