views:

146

answers:

2

Hello,

Lets say I have 2 tables: blog_posts and categories. Each blog post belongs to only ONE category, so there is basically a foreign key between the 2 tables here.

I would like to retrieve the 2 lasts posts from each category, is it possible to achieve this in a single request? GROUP BY would group everything and leave me with only one row in each category. But I want 2 of them.

It would be easy to perform 1 + N query (N = number of category). First retrieve the categories. And then retrieve 2 posts from each category.

I believe it would also be quite easy to perform M queries (M = number of posts I want from each category). First query selects the first post for each category (with a group by). Second query retrieves the second post for each category. etc.

I'm just wondering if someone has a better solution for this. I don't really mind doing 1+N queries for that, but for curiosity and general SQL knowledge, it would be appreciated!

Thanks in advance to whom can help me with this.

+1  A: 
SELECT  p.*
FROM    (
        SELECT  id,
                COALESCE(
                (
                SELECT  datetime
                FROM    posts pi
                WHERE   pi.category = c.id
                ORDER BY
                        pi.category DESC, pi.datetime DESC, pi.id DESC
                LIMIT 1, 1
                ), '1900-01-01') AS post_datetime,
                COALESCE(
                (
                SELECT  id
                FROM    posts pi
                WHERE   pi.category = c.id
                ORDER BY
                        pi.category DESC, pi.datetime DESC, pi.id DESC
                LIMIT 1, 1
                ), 0) AS post_id
        FROM    category c
        ) q
JOIN    posts p
ON      p.category <= q.id
        AND p.category >= q.id
        AND p.datetime >= q.post_datetime
        AND (p.datetime, p.id) >= (q.post_datetime, q.post_id)

Make an index on posts (category, datetime, id) for this to be fast.

Note the p.category <= c.id AND p.category >= c.id hack: this makes MySQL to use Range checked for each record which is more index efficient.

See this article in my blog for a similar problem:

Quassnoi
Awesome, would this work even for categories where I have only 1 post? Wouldn't the limit excludes those posts?
Savageman
`@Savageman`: updated to fix it.
Quassnoi
This just get much complicated! I choosed the solution from John Feminella, which is much easier to understand and extend (if my M change).
Savageman
John's solution is nice and quite efficient if you have many categories. However, if you have few categories but many post in each category, this solutions is more efficient. If you have less than `1,000` posts, you won't even notice the difference.
Quassnoi
Ok, I'll dig into it for my personal knowledge when I'll have more time. Right now, I just see some strange magic, the second Coalesce looks like "(SELECT datetime) as post_id" and I'm a bit lost.
Savageman
`@Savageman`: it's was a very powerful magic before I changed it to `SELECT id` :)
Quassnoi
+1  A: 

Check out this MySQL article on how to work with the top N things in arbitrarily complex groupings; it's good stuff. You can try this:

SET @counter = 0;
SET @category = '';

SELECT
  *
FROM
(
  SELECT
    @counter := IF(posts.category = @category, @counter + 1, 0) AS counter,
    @category := posts.category,
    posts.*
    FROM
      (
      SELECT
        *
        FROM test
        ORDER BY category, date DESC
      ) posts
) posts
HAVING counter < 2
John Feminella
Well, that's another very good answer. Both of you deserves a up-vote. Thank you.Still thinking about who gonna win the accepted one.
Savageman
Works perfectly and looks easy to understand, thank you!
Savageman