tags:

views:

41

answers:

1

Not sure Looking for some guidance here...

I have the following tables:

site_text
id   site_id    text    date_added
1    1          ...     2010-02-02 10:01:00
2    1          ...     2010-02-02 10:01:20
3    2          ...     2010-02-02 10:01:00
4    1          ...     2010-02-02 10:01:40
5    2          ...     2010-02-02 10:02:00

sites
id   site_name
1    www.a.com
2    www.b.com
....

I'm trying to select the last 2 rows (ordered by date desc) in site_text for each site. Can you do this with one query? Something like, but have it only find 2 rows of each site?

SELECT * 
  FROM site_text st 
  JOIN sites s ON st.site_id = s.id 
 WHERE st.date_added > '2010-02-01 23:32:04' 
ORDER BY s.id, st.date_added DESC

EDIT: In the end, I would be looking for rows 4,2,5,3.

+2  A: 

There is a clever solution here with CTE's, and RANK() OVER DATE_ADDED in Pg 8.4 but because you're using mysql, the only thing I can think of of involves a very computationally complex query:

-- Select the top date for each site_id
SELECT DISTINCT id FROM site_text
    GROUP BY site_id, date_added
    ORDER BY site_id, date_added
UNION
-- Select the top date for each site_id
SELECT DISTINCT id FROM site_text
    GROUP BY site_id, date_added
        -- Excluding the top date.
    WHERE id NOT IN (
        SELECT DISTINCT id FROM site_text
        GROUP BY site_id, date_added
        ORDER BY site_id, date_added
    )
    ORDER BY site_id, date_added
;

Moral of the story, use Pg. There might be a more mysql-specific way to do this that yields a more acceptable performance profile.

Evan Carroll
Wow, that is expensive. Thanks for the tips. I have been meaning to check out PG for a while. Maybe this is the reason!
frio80
Boy are you in for a treat: it's like coming out of one tunnel of suck.
Evan Carroll
With a window function, just for reference, I *think* it would look like this `SELECT * FROM ( SELECT text, date_added, rank() OVER ( PARTITION BY site_id ORDER BY date_added desc ) AS t WHERE t.rank IN (1,2)`, I would expect it to run >100 times faster.
Evan Carroll