The answers from @jjclarkson and @davethegr8 are close, but you can't put aggregate functions in the WHERE clause. The WHERE clause is evaluated for each row, before the rows are grouped. You need to evaluate the MAX() expression for each group, so you need to use a HAVING clause.
Try this:
SELECT UserID
FROM ArrivalTimes
GROUP BY UserID
HAVING MAX(ArrivalTime) <= '09:00:00';
@MBCook comments that HAVING can be slow. You're right, it might not be the absolute quickest way to produce the desired result. But the HAVING solution is the clearest. There are situations where performance has lower priority than clarity and maintainability.
I looked at the EXPLAIN output (on MySQL 5.1.30) for the HAVING solution: no indexes were used, and the extra notes said "Using temporary; Using filesort," which usually means performance will be poor.
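For anyone who wants to repeat this check, a minimal sketch: prefix the query with EXPLAIN and read the Extra column of the result. This assumes the ArrivalTimes table from the question; the plan you get will depend on your MySQL version, indexes, and data, so it may not match what I saw.

```sql
-- Show the execution plan instead of running the query.
-- The Extra column is where notes such as
-- "Using temporary; Using filesort" appear.
EXPLAIN
SELECT UserID
FROM ArrivalTimes
GROUP BY UserID
HAVING MAX(ArrivalTime) <= '09:00:00';
```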
Consider the following query:
SELECT DISTINCT a1.UserID
FROM ArrivalTimes a1
LEFT OUTER JOIN ArrivalTimes a2
  ON (a1.UserID = a2.UserID AND a2.ArrivalTime > '09:00:00')
WHERE a2.UserID IS NULL;
This generates an optimization plan that uses an index on UserID and says:
- a1: "Using index; Using temporary"
- a2: "Using where; Distinct"
Finally, the following query generates an optimization plan that appears to use indexes most effectively, with no temporary tables or filesort:
SELECT DISTINCT a1.UserID
FROM ArrivalTimes a1
WHERE NOT EXISTS (SELECT * FROM ArrivalTimes a2
                  WHERE a1.UserID = a2.UserID
                    AND a2.ArrivalTime > '09:00:00');
- a1: "Using where; Using index"
- a2: "Using where"
This appears most likely to have the best performance. Admittedly, I only have four rows in my test table, so this isn't a representative test.
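If you want to try this yourself, here is a rough sketch of a test setup. The column types, the index name idx_user_arrival, and the sample rows are my assumptions, not the schema from the question, and a composite index on (UserID, ArrivalTime) is only a guess at what might help the optimizer. Load a realistic number of rows before drawing any conclusions.

```sql
-- Hypothetical schema: types and index are assumptions for illustration.
CREATE TABLE ArrivalTimes (
  UserID      INT  NOT NULL,
  ArrivalTime TIME NOT NULL,
  INDEX idx_user_arrival (UserID, ArrivalTime)
);

-- Made-up sample data: user 1 always arrives by 09:00, user 2 is late once.
INSERT INTO ArrivalTimes (UserID, ArrivalTime) VALUES
  (1, '08:45:00'),
  (1, '08:55:00'),
  (2, '08:30:00'),
  (2, '09:15:00');

-- Compare the plan for each variant; shown here for NOT EXISTS.
EXPLAIN
SELECT DISTINCT a1.UserID
FROM ArrivalTimes a1
WHERE NOT EXISTS (SELECT * FROM ArrivalTimes a2
                  WHERE a1.UserID = a2.UserID
                    AND a2.ArrivalTime > '09:00:00');
```

With these sample rows, all three queries above should return only UserID 1, which makes it easy to confirm that the variants agree before comparing their plans.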