tags:

views:

107

answers:

2

I am trying to create SQL for retrieveing a list of latests posts for the forum thread. I have the following code:

SELECT
 item_discuss_thread_id
 , item_discuss_post_title
 , COUNT(item_discuss_thread_id) AS nb_posts
FROM
 item_discuss_posts
GROUP BY
 item_discuss_thread_id

Obviously this will group without the respect of if the post is latest or not. item_discuss_post_title will just get the first row in the group.

I wonder if there's some way around this? If not, what is the best way to solve the problem... only subqueries?

Thanks, Pavel

UPDATE: Please note that I need all threads, LIMIT 1 is not solving the problem. Also ORDER BY is not an option as GROUP BY will select the first record from group anyway. This is not such a simple question as it can seem to be.

UPDATE:

I really want to try to avoid using subqueries or if doing so - use it the optimal may. What I came with currently is something like this:

SELECT
  ordered_by_date.item_discuss_thread_id
  , item_discuss_post_title
  , COUNT(item_discuss_thread_id) AS nb_posts
FROM
  (
   SELECT
     item_discuss_thread_id
     , item_discuss_post_title
   FROM
      item_discuss_posts
   ORDER BY
      item_discuss_post_datetime DESC
  ) AS ordered_by_date
GROUP BY
   item_discuss_thread_id

EXPLAIN EXTENDED gives the following result:

id, select_type, table, type, possible_keys, key, key_len, ref, rows, Extra
1, PRIMARY, <derived2>, ALL, \N, \N, \N, \N, 20, Using temporary; Using filesort
2, DERIVED, item_discuss_posts, index, \N, item_discuss_post_datetime, 8, \N, 20,
+1  A: 

Try this.

SELECT
  *
FROM
  (SELECT item_discuss_thread_id, item_discuss_post_title, COUNT(item_discuss_thread_id) AS nb_posts
   FROM item_discuss_posts
   ORDER BY __datecolumn__)
  AS ordered_by_date
GROUP BY
  ordered_by_date.item_discuss_thread_id

Replace __datecolumn__ with the column that stores posting time.

Dan Loewenherz
Thank you. This works, though I believe this is rather slow, I'm searching for the optimal solition. I'd like to hear more ideas from people over here - if there are none, I think I will stick with something like this one.
Pavel Dubinin
Can you append the query with `EXPLAIN EXTENDED` and paste the results either in another comment or in the question prompt? You might want to create indices if you haven't done so.
Dan Loewenherz
The problem is not with indices, but with a query itself. MySQL will need to go through all posts to retrieve ordered by date first and then go though all of it again to group it.Also, as per SQL standard all columns that are not aggregates need to be included in the group by statement or otherwise the result is unpredictable. This SQL request works on the current MySQL version but noone guarantees that it will work further on newer versions.
Pavel Dubinin
+1  A: 

Ok, I came with solution myself. I used a dependent subquery to solve. This is what I have in the result:

     SELECT
       item_discuss_threads.item_discuss_thread_id
       , item_discuss_threads.item_discuss_thread_datetime
    , item_discuss_threads.item_discuss_thread_title
       , latest_posts.item_discuss_post_title
       , latest_posts.item_discuss_post_datetime
       , COUNT(item_discuss_posts.item_discuss_post_id) AS nb_posts
      FROM
       item_discuss_threads
      INNER JOIN item_discuss_posts
       ON item_discuss_threads.item_discuss_thread_id=item_discuss_posts.item_discuss_thread_id
      INNER JOIN item_discuss_posts AS latest_posts
       ON latest_posts.item_discuss_thread_id=item_discuss_threads.item_discuss_thread_id
      WHERE
       (
          SELECT
              item_discuss_post_id
          FROM
              item_discuss_posts AS p
          WHERE
              p.item_discuss_thread_id=item_discuss_posts.item_discuss_thread_id
          ORDER BY
              item_discuss_post_datetime DESC
          LIMIT
             1
       )=latest_posts.item_discuss_post_id
    GROUP BY
       item_discuss_threads.item_discuss_thread_id
  ORDER BY
   latest_posts.item_discuss_post_datetime DESC
Pavel Dubinin