tags:

views:

714

answers:

4

Putting this as simply as I can, I have the following table structure:

Date | Type | Title

Say Type is a value in the range 1-10, I have 1,000s of records in the table, and I want the 5 most recent records of unique type. So the result would be something like:

2009-06-04 14:32:00 | 4 | Zeppo

2009-06-04 14:31:00 | 2 | Groucho

2009-06-04 14:30:00 | 8 | Harpo

2009-06-04 14:29:00 | 5 | Gummo

2009-06-04 14:28:00 | 3 | Chico

Seems like I either want DISTINCT to only apply to the Type column, or I want a GROUP BY which will apply after an ORDER BY clause.

All in MySQL 4.

+1  A: 

Didn't test in for MySQL 4, but in MySQL 5 this can be easily done.

You'll need to have some kind of a PRIMARY KEY in your table for this to work.

SELECT  l.*
FROM    (
        SELECT  type,
                COALESCE(
                (
                SELECT  id
                FROM    mytable li
                WHERE   li.type= dlo.type
                ORDER BY
                        li.type DESC, li.date DESC, li.id DESC
                LIMIT 4, 1
                ), CAST(0xFFFFFFFF AS DECIMAL)) AS mid
                COALESCE(
                (
                SELECT  date
                FROM    mytable li
                WHERE   li.type= dlo.type
                ORDER BY
                        li.type DESC, li.date DESC, li.id DESC
                LIMIT 4, 1
                ), '9999-31-12') AS mdate
        FROM    (
                SELECT  DISTINCT type
                FROM    t_mytable dl
                ) dlo
        ) lo, t_mytable l
WHERE   l.type >= lo.type
        AND l.type  <= lo.type
        AND (l.date, l.id) >= (lo.mdate, lo.mid)

See this entry in my blog for more detail on how it works:

If you cannot add a PRIMARY KEY to implement this solution, you may try using less efficient one using system variables:

SELECT  l.*
FROM    (
        SELECT  @lim := 5,
                @cg := -1
        ) vars,
        mytable l
WHERE   CASE WHEN @cg <> type THEN @r := @lim ELSE 1 END > 0
        AND (@r := @r - 1) >= 0
        AND (@cg := type) IS NOT NULL
ORDER BY
        type DESC, date DESC

It's described here:

Update:

If you don't want to select 5 records for each type (which would give 5 x number of types records in the resultset), but instead want to select 5 latest records with distinct type (which would give 5 records in the resultset), use this query:

SELECT  date, type, title
FROM    mytable m
WHERE   NOT EXISTS
        (
        SELECT  1
        FROM    mytable mi
        WHERE   mi.date > m.date
                AND mi.type = m.type
        )
ORDER BY
        date DESC
LIMIT 5

If you have lots of types, this will be more efficient that using GROUP BY, provided you have an index on date.

Quassnoi
Great articles. I do think you can just tell MySQL which index to use however instead of your trick at the bottom of Advanced Row Sampling:http://dev.mysql.com/doc/refman/5.1/en/index-hints.html
Adam Nelson
@Adam: the index will be used, but only for coarse filtering on the first column along with USING WHERE on the second one. Trust me, every row of that query is on purpose :)
Quassnoi
+1  A: 

Assuming query performance is somewhat important and you're stuck with MySQL 4 I would go with something like so:

SELECT Date, Type, Title
FROM (
    SELECT Date, Type, Title
    FROM Table
    WHERE Type = 1
    ORDER BY Date
    LIMIT 1

    UNION ALL

    SELECT Date, Type, Title
    FROM Table
    WHERE Type = 2
    ORDER BY Date
    LIMIT 1

    ...
) x
ORDER BY Date
LIMIT 5

It's not pretty, but it should get the job done and quickly. If you're regularly adding new values to the types column, this may not be for you as it would require modifying the query regularly.

I think derived tables were added in 4.1, so you'll need to be at least that high in the 4 series. If you were on version 5 or a different database, there would be nicer ways to approach this.

EvilRyry
It IS 4.1, so I might look into derived tables.Unfortunately, the type value will be changing frequently enough to make this approach a bit of a pain maintenance-wise, but it's an interesting approach - thanks.
Bobby Jack
+1  A: 

This appears to be a good case for the union statement and nested selects.

Something along the lines of:

select t.date, t.type, t.title from (Select top 5 table.Date, table.Type, table.Title from table where type = 1 order by table.date desc) as t

union

select t2.date, t2.type, t2.title from (Select top 5 table.Date, table.Type, table.Title from table where type = 2 order by table.date desc) as t2

[...]

select t10.date, t10.type, t10.title from (Select top 5 table.Date, table.Type, table.Title from table where type = 10 order by table.date desc) as t10

order by type, date

Unfortunately, I only have sql server 2008 to test this on here, so ymmv; but this is pretty basic stuff and it ought to work as both nested selects and unions are supported in mysql 4.0 according to the reference guide. (I've added the as clause to the nested from statement based on the mysql documentation.)

Cynthia
+3  A: 

Am I missing something? The easy solution seems to be:

SELECT  MAX(date) AS max_date, type, title
FROM    table
GROUP BY
        type  
ORDER BY
        max_date DESC
LIMIT 5

And it should be extremely fast.

Adam Nelson
@Adam: I think your solution it what author really _means_, but it does not return "5 most recent records of each type" as he _says_.
Quassnoi
@Adam: but I think it's a +1 anyway.
Quassnoi