views:

146

answers:

2

I'm creating a small forum.

Attempting to run SElECT... JOIN... query too pick up information on the individual posts, plus the last reply (if any). As part of my desire to do everything the hard way, this covers five tables (only columns revelant to this issue are being stated)

commentInfo    referenceID | referenceType | authorID | create
postit         id | title
postitInfo     referencePostitID | create | authorID
user           id | username | permission
userInfo       referenceUserID | title

So, I run this query SELECT... JOIN... query to get the most recent topics and their last replies.

SELECT DISTINCT
    t1.id, t1.title, t2.create, t2.lastEdit, t2.authorID, t3.username, 
    t4.title AS userTitle, t3.permission, t5.create AS commentCreate, 
    t5.authorID AS commentAuthor, t6.username AS commentUsername, 
    t6.permission AS commentPermission
FROM rantPostit AS t1
LEFT JOIN (rantPostitInfo AS t2) 
    ON ( t1.id = t2.referencePostitID)
LEFT OUTER JOIN (rantUser as t3, rantUserInfo as t4) 
    ON (t2.authorId = t3.id AND t4.referenceUserId = t2.authorId)
LEFT OUTER JOIN (rantCommentInfo as t5, rantUser as t6) 
    ON (t5.referenceType = 8 AND t5.referenceID = t1.id AND t6.id = t5.authorID)
ORDER BY t2.create DESC, t5.create DESC

Now, this returns the topic posts. Say I have two of them, it returns both of them fine. Say I have eight replies to the first, it will return 9 entries (one each for the topic + reply, and the individual one with no replies). So, I guess my issue is this: I don't know what to do to limit the number of returns in the final LEFT OUTER JOIN clause to just the most recent, or simply strike the least recent ones out of the window.

(Yes, I realize the ORDER BY... clause is messed up, as it'll first order it by the post create date, then by the comment create date. Yes, I realize I could simplify all my problems by adding two fields into postitInfo, lastCommentCreate and lastCommentCreateID, and have it update each time a reply is made, but... I like the hard way.)

So what am I doing wrong?

Or is this such an inane problem that I should be taken 'round the woodshed and beat with a hammer?

A: 

It looks like the last LEFT JOIN is the only one that can return multiple rows. If that's true, you can just use LIMIT 5 to get the last five comments:

 ORDER BY t5.create DESC
 LIMIT 5

If not, a very simple solution would be to retrieve the comments with a separate query:

SELECT *
FROM rantCommentInfo t5
    ON t5.referenceType = 8 
    AND t5.referenceid = t1.id
LEFT OUTER JOIN rantUser t6
    ON t6.id = t5.authorID
ORDER BY CommentCreate
WHERE t5.referenceid = YourT1Id
LIMIT 5

Can't think of a way to do it in one query, without ROW_NUMBER, which MySQL does not support.

Andomar
I apologize, then, but this is to display an index of the topics with their last replies (so the default limit is Limit 0, 25). This idea has merit, and possibly the closest thing to a solution I have thus far found:Run a separate query, add those results into the array that contain the original information, then use some clever sorting tricks before they get displayed.
Nikolai Echternacht
+1  A: 

The splits between post and postInfo, and the user and userInfo tables, appear to be doing nothing much here except obfuscate things. To better see solutions, let's boil things down to their essence: a table Posts (with a primary key id, a creation date date, and other fields) and a table Comments (with a primary key id, a foreign key refId referencing Posts, a unique creation date date, and other fields); we want to see all posts, each with its most recent comment if any (the primary keys id of the table rows retrieved, and the other fields, can of course be contextually used in the SELECT to fetch and show more info yet, but that doesn't change the core structure, and simplifying things down to the core structure should help illustrate the solutions). I'm assuming the creation date of a comment is unique, otherwise "latest comment" can be ambiguous (of course, that ambiguity could be arbitrarily truncated in other ways, picking one item of the set of "latest comments" to a given post).

So, here's one approach:

SELECT Posts.id, Comments.id FROM Posts
LEFT OUTER JOIN Comments on (Posts.id = Comments.refId)
WHERE Comments.create IS NULL OR (
      Comments.create = (SELECT create FROM Comments
                         WHERE refID = Posts.id
                         ORDER BY create DESC
                         LIMIT 1)
) /* add ORDER BY &c to taste;-) */

the idea: for each post, we want "a null comment" (when there have been no comment to it) or else the comment whose create date is the highest among those referencing the post; here, the inner SELECT takes care of finding that "highest" create date. So, in the same spirit, the inner select might be SELECT MAX(create) FROM Comments WHERE refID = Posts.id which is probably preferable (as shorter and more direct, & maybe faster).

Alex Martelli
This will give you the top 1 comment, but he's looking for the top 5?
Andomar
No, I just need the top one. I'm using five tables to make it possible (though with his, I'll be testing with just 3, though his presents 2. Not too hard to add in an additional LEFT OUTER JOIN for the create date.
Nikolai Echternacht
@Nikolai Echternacht: Aah ok I misread the question then. This answer will work for the top 1 comment. I'll vote it up :P
Andomar
If you needed "top 5", then you'd use the "main form" I posted (but with LIMIT 5) and IN instead of = to the left of the nested SELECT; i.e., top 1 or top 5, the changes are minor.
Alex Martelli
@Alex Martelli: MySQL does not support LIMIT in an IN subquery (try it.) You can work around that with a sub-subquery, but then you are not allowed to refer back to the Posts table.
Andomar
@Nikolai Echternacht, I hope I explained profusely in the answer why I'm showing just 2 tables -- the split in your posts table doesn't do anything, and the joining with the users table (also needlessly split) to get info about authors of posts and/or comments can be easily glued back on to the "solution core" I posted.
Alex Martelli
@Andomar: It's alright; I might've been too dense, or too verbose, with the description of the problem.@Alex: I successfully minimzed the entire thing down to using two tables (the postitInfo and commentInfo, which contains the ids to both plus a unique create date), and so far so good. I will now be using it as a framework to reconstruct the elongated, five table post. The most I can say is, you sir, are a genius.
Nikolai Echternacht
@Andomar: ouch, you're right, and it keeps tripping me up (as I move among different flavors of SQL with different dialects, sigh).@Nikolai Echternacht, thanks for the kind words and glad to have been of assistance (though I still don't understand why you want so many tables in the first place?-)
Alex Martelli