What I need done is simple... but it's 3am and I'm probably overlooking the obvious.

I'm coding a simple forum. One table stores the forum titles, descriptions, etc., while the other stores the posts. In the forum listing, which shows all forums, I want to grab the latest post in each forum and display its subject, poster, post ID, and date. Simple.

The only problem is, when I join to the posts table, it joins to the first record in the table, not the last, which would denote the last post in that forum.

Here is the simplified query that gets a list of forums + data for the "latest" post (which now functions as "first post").

SELECT forum_title, forum_id, post_subject, post_user, post_id, post_date FROM board_forums 
     LEFT JOIN board_posts 
     ON (forum_id = post_parentforum AND post_parentpost = 0) 
WHERE forum_status = 1
GROUP BY forum_id
ORDER BY forum_position

How can I fix this?

+4  A: 

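The problem you're hitting is the classic ambiguous GROUP BY issue. It shows up in MySQL because most other RDBMS brands (and standard SQL) won't allow your query at all. Your query fails the Single-Value Rule: the select list contains non-aggregated columns (post_subject, post_user, post_id, post_date) that aren't listed in the GROUP BY and can have more than one value per forum_id group, so MySQL is free to return values from any row in the group.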

Here's a solution demonstrating my favorite way of getting greatest row per group:

SELECT f.forum_title, f.forum_id, p1.post_subject, p1.post_user, 
  p1.post_id, p1.post_date 
FROM board_forums f
LEFT JOIN board_posts p1
  ON (f.forum_id = p1.post_parentforum AND p1.post_parentpost = 0)
LEFT JOIN board_posts p2
  ON (f.forum_id = p2.post_parentforum AND p2.post_parentpost = 0 
      AND p1.post_id < p2.post_id)
WHERE p2.post_id IS NULL AND f.forum_status = 1
ORDER BY f.forum_position;

If p2.post_id IS NULL, that means the join found no row in p2 with a greater post_id than the row in p1.

Ergo, p1 is the latest post (assuming post_id is auto-incrementing).
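
If you want to convince yourself that the self-join really keeps only the newest top-level post per forum, here is a minimal sketch you can run. It uses Python's sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained), the table and column names from the question, and sample rows invented for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE board_forums (
  forum_id INTEGER PRIMARY KEY, forum_title TEXT,
  forum_status INTEGER, forum_position INTEGER);
CREATE TABLE board_posts (
  post_id INTEGER PRIMARY KEY, post_parentforum INTEGER,
  post_parentpost INTEGER, post_subject TEXT,
  post_user TEXT, post_date TEXT);
-- Sample rows are made up for illustration.
INSERT INTO board_forums VALUES
  (1, 'General', 1, 1), (2, 'Help', 1, 2), (3, 'Empty', 1, 3);
INSERT INTO board_posts VALUES
  (10, 1, 0,  'Hello',     'ann', '2009-01-01'),
  (11, 1, 0,  'Rules',     'bob', '2009-01-02'),
  (12, 2, 0,  'Install',   'cat', '2009-01-03'),
  (13, 1, 10, 'Re: Hello', 'dan', '2009-01-04');
""")

# p1 is a candidate post; p2 looks for any later top-level post in the same
# forum. Rows where no such p2 exists (p2.post_id IS NULL) are the latest.
LATEST_POST_SQL = """
SELECT f.forum_title, f.forum_id, p1.post_subject, p1.post_user,
  p1.post_id, p1.post_date
FROM board_forums f
LEFT JOIN board_posts p1
  ON (f.forum_id = p1.post_parentforum AND p1.post_parentpost = 0)
LEFT JOIN board_posts p2
  ON (f.forum_id = p2.post_parentforum AND p2.post_parentpost = 0
      AND p1.post_id < p2.post_id)
WHERE p2.post_id IS NULL AND f.forum_status = 1
ORDER BY f.forum_position
"""

rows = conn.execute(LATEST_POST_SQL).fetchall()
for row in rows:
    print(row)
```

With this data, forum 1 comes back with post 11 rather than post 10, the reply (post 13) is ignored because its post_parentpost is not 0, and the forum with no posts still appears with NULLs in the post columns thanks to the LEFT JOINs.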


Re comment:

Slight problem with this. post_id with the highest ID is not necessarily the latest post.

No problem. Just use a column that is guaranteed to distinguish an earlier post from a later post. You mention post_date. In the case of ties, you'll have to break ties with another column (or columns) that will be sure to be in chronological order.

LEFT JOIN board_posts p2
  ON (f.forum_id = p2.post_parentforum AND p2.post_parentpost = 0 
    AND (p1.post_date < p2.post_date 
      OR (p1.post_date = p2.post_date AND p1.post_millisecond < p2.post_millisecond)))
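
Here is a small runnable sketch of the date-based version, again only as an illustration. It assumes Python's sqlite3 module in place of MySQL, dates stored as ISO-8601 text so that string comparison matches chronological order, and post_id as the tie-breaker instead of post_millisecond (swap in whatever column is reliably chronological in your schema). The sample rows are invented.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE board_forums (
  forum_id INTEGER PRIMARY KEY, forum_title TEXT,
  forum_status INTEGER, forum_position INTEGER);
CREATE TABLE board_posts (
  post_id INTEGER PRIMARY KEY, post_parentforum INTEGER,
  post_parentpost INTEGER, post_subject TEXT,
  post_user TEXT, post_date TEXT);
INSERT INTO board_forums VALUES (1, 'General', 1, 1), (2, 'Help', 1, 2);
-- Post 10 has the lower id but the later date (e.g. bumped by a reply).
-- Posts 12 and 13 share the same date, so they need a tie-breaker.
INSERT INTO board_posts VALUES
  (10, 1, 0, 'Old thread', 'ann', '2009-01-05 09:00:00'),
  (11, 1, 0, 'New thread', 'bob', '2009-01-02 12:00:00'),
  (12, 2, 0, 'A',          'cat', '2009-01-03 08:00:00'),
  (13, 2, 0, 'B',          'dan', '2009-01-03 08:00:00');
""")

# p2 matches any top-level post in the same forum that is later than p1:
# either a later date, or the same date with a higher post_id (tie-breaker).
LATEST_BY_DATE_SQL = """
SELECT f.forum_id, p1.post_id, p1.post_subject, p1.post_date
FROM board_forums f
LEFT JOIN board_posts p1
  ON (f.forum_id = p1.post_parentforum AND p1.post_parentpost = 0)
LEFT JOIN board_posts p2
  ON (f.forum_id = p2.post_parentforum AND p2.post_parentpost = 0
      AND (p1.post_date < p2.post_date
        OR (p1.post_date = p2.post_date AND p1.post_id < p2.post_id)))
WHERE p2.post_id IS NULL AND f.forum_status = 1
ORDER BY f.forum_position
"""

rows = conn.execute(LATEST_BY_DATE_SQL).fetchall()
for row in rows:
    print(row)
```

Forum 1 now reports post 10, even though post 11 has the higher ID, and forum 2 reports exactly one row. Without the tie-breaking branch, both posts 12 and 13 would survive the IS NULL filter and forum 2 would show up twice.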
Bill Karwin
I'm glad you mentioned the problem - I came back to this question a half-dozen times, but couldn't even figure out how the query was working as it was written there, so gave up. Not being a MySQL user, I didn't realize it would let you slide by without fixing that GROUP BY. Now I know! :)
TML
Yep. MySQL trusts that you know what you're doing, and that you're following the Single-Value Rule. SQLite also has the same behavior. If you write a query that has ambiguous columns (more than one value per group), the RDBMS picks a value arbitrarily. MySQL picks the value from the "first" row (in the order of physical storage of rows). Coincidentally, SQLite picks the value from the "last" row.
Bill Karwin
Slight problem with this. post_id with the highest ID is not necessarily the latest post. If a new thread is made, it will have the latest ID, and in this case, your example works. But once a reply is made on an older thread (with a lower ID), it no longer functions like it's supposed to, and still shows the thread with the highest ID. It needs to show it based on the post_date.
Yegor
Thank you Bill. You're a life saver!
Yegor