tags:

views:

123

answers:

7

So I don't consider myself a novice at MySQL but this one has me stumped:

I have a message board and I want to pull a list of all the most recent posts grouped by the Thread ID.

Here's the table:

MB_Posts
-ID
-Thread_ID
-Created_On (timestamp)
-Creator_User (user_id)
-Subject
-Contents
-Edited (timestamp)
-Reported

I've tried many different things to keep it simple but I would like help from the community on this one.

Just to kick this out there...this one does not work as expected:

SELECT * FROM MB_Posts GROUP BY Thread_ID ORDER BY ID DESC

Desired results:

A set of the most recent posts, one per thread ID

A: 

When you use GROUP BY, you have to have an aggregate function in the SELECT part of the statement.

mkj
I've been able to user GROUP BY and return the results I need, but please explain how I would implement an aggregate function in the SELECT portion and return the desired result set.
jerebear
A: 

Ok, I have a suggestion. Let me know:

SELECT 
    MB_Posts.*, 
    (SELECT 
        GROUP_CONCAT(CreatedOn ORDER BY CreatedOn DESC) 
        FROM MB_Posts AS P 
        WHERE P.Thread_ID = MB_Posts.Thread_ID) 
    AS `sort`  
FROM MB_Posts 
ORDER BY `sort` DESC, CreatedOn DESC 
Jage
That's if I wanted to select the single most recent post from a single thread. That's covered. What I want is the most recent post for EVERY thread as a result from a single query.
jerebear
No worries. Thanks for the feedback.
jerebear
I edited my answer with something that is better, I think.
Jage
This returned all the results, sorry.
jerebear
That's because you need to add a limit, right after the order by clause.
Jage
A: 

Here is another way:

SELECT * FROM mb_posts ORDER BY ID DESC

threadsHistory = array();
while (row)
{
    if (row) NOT IN threadsHistory
    {
         add values to array (for example thread_ID and postID) 
         count++;
    }
    if count == 10 break;
}

this is not exactly a nice and clean way, but it should work with a little twisting

Spiros
Tried it. Tried many things...having a hard time with the actual syntax. I've tried JOINS, Nexted Queries and all sorts of aggregators. This is just a tricky problem.
jerebear
Indeed, I was just trying on one of my databases to find a solution but no success. You can always use a simple code loop if you don't mind the overhead.
Spiros
I edited the answer, this is a way that works for sure but it's not the nicest and cleanest way to go I guess.
Spiros
This one I couldn't get to work at all.
jerebear
A: 

GROUP BY expects to return one row per unique group by, e.g.

SELECT COUNT(*) FROM mb_post GROUP BY thread_id;

Will return the number of posts in each thread.

You will need to do an order by thread_id, id and then programmatically handle the "grouping" by that you want, e.g.

for (row in rows) {
    print("<tr><th>");
    if (row["thread_id"] != prev_id) {
      print(thread_id)
      prev_id = thread_id;
    }
    print("</th>")
vickirk
I understand. But GROUP BY will return the FIRST instance of an element according to the group by and will ignore an ORDER BY statement at the end. That approach seemed the simplest but it won't work.
jerebear
Yes, this is why you will need to change the SELECT query. SELECT * FROM mb_post ORDER BY Created_On DESC will select all posts sorted by date of creation in a desc order, then you can apply the above code.
Spiros
Damn... no it wont work... you will also need to change the check for already displayed threads. Maybe by adding a value to an array each time you encounter a new one, and when the array reaches the size of X you can stop the loop.
Spiros
@Spiros: I forgot about the "recent posts" part, if the op uses the query with a subquery as suggested by Kevin and then applies the above logic he should get what he wants easily
vickirk
A: 

Give this a whirl and see if it fits what you want. This should show you the most recent 100 posts and the results will come back sorted by the thread that they belong to.

SELECT MB_Posts.* 
  FROM MB_Posts 
  INNER JOIN (
    SELECT ID FROM MB_Posts ORDER BY Created_On DESC LIMIT 0, 100
  ) as MostRecentPosts ON MB_Posts.ID = MostRecentPosts.ID
  ORDER BY Thread_ID, Created_On DESC;

Based on your revision "A set of the most recent posts, one per thread ID", the table created from the subquery can be created using a little concat and compare magic. This will depend on what your column types are but should work for most. The idea of both of these solutions is to use a subquery to handle your aggregation process and determine what the ID's are of the desired records, and then to join the table against the results of that subquery to retrieve the rest of the data for the row.

SELECT MB_Posts.* 
  FROM MB_Posts 
  INNER JOIN (
    SELECT TRIM(LEADING '0' FROM
      SUBSTRING(MAX(CONCAT(UNIX_TIMESTAMP(Created_On), LPAD(ID,15,'0'))), -15)) AS ID
    FROM MB_Posts 
    GROUP BY Thread_ID
  ) as MostRecentPostPerThread ON MB_Posts.ID = MostRecentPostPerThread.ID
  ORDER BY Thread_ID, Created_On DESC;
Kevin
Tried it. It's returning every post in the table. :(
jerebear
What version of mysql are you using maybe? I've just retested this locally to verify that it correctly returns the most recent 100 entries.
Kevin
Based on your revised requirements, had you removed the limit 0,100 in the first answer? That would definitely have given you back every post. I've added a solution based on your revised requirements that allows for grouping by one field, sorting by another, and grabbing the associated id by piggy backing the id on to the sort field and then stripping it back off after the group sort (max) operation.
Kevin
+3  A: 

GROUP BY overrides ORDER BY (with no exceptions).

SELECT * 
FROM MB_Posts 
WHERE ID IN (SELECT MAX(ID) FROM MB_Posts GROUP BY Thread_ID)

should give you what you need.

For details read about hidden columns in group by (in your original query all your columns except Thread_ID are hidden)

Unreason
Hot damn! That worked! Thank you Unreason.
jerebear
A: 

Hmm... Does Created_On or Edited determine "most recent?"

Presuming Created_On is what's wanted:

SELECT Thread_ID, MAX(Created_On) FROM MB_Posts GROUP BY Thread_ID ;

gives the latest timestamp per Thread_ID. Semijoin these to MB_Posts with something like

SELECT * FROM MB_Posts JOIN ( SELECT Thread_ID, MAX(Created_On) AS Created_On FROM MB_Posts GROUP BY Thread_ID ) LATEST USING (Thread_ID, Created_On) ;

JayDee
This one also worked! Thanks JayDee
jerebear