Hi,

Given the following table format...

id | date       | value
___|____________|______
11 | 2010-01-01 | 50
11 | 2010-01-02 | 100
12 | 2010-01-01 | 150
12 | 2010-01-02 | 200

... I need to select the id that corresponds to the maximum value on a day that I specify. The only way I've figured out how to do this so far is using a sub-query as follows:

SELECT id
    FROM table
    WHERE date =  '2010-01-01'
        AND value = ( 
            SELECT MAX(value)
                FROM table
                WHERE date = '2010-01-01'
                GROUP BY date
        )

On a table with ~70,000 records, with a primary key over id and date, this takes ~0.25 seconds to execute, which seems a long time to me. Is there a faster way for me to achieve the same result?

Thanks.

+1  A: 

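MySQL has grown pretty lenient about what you can get back from a GROUP BY query, meaning the selected columns don't have to be just aggregates or GROUP BY columns. So you can select the PRIMARY KEY, but be aware that MySQL is free to take a non-aggregated column from any row in the group, so the id is not guaranteed to come from the row holding the maximum value. See what the following gives you.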

SELECT id, MAX(value) FROM table WHERE date = '2010-01-01' GROUP BY date;
Jason McCreary
Thanks for the help. Unfortunately this doesn't execute as `value` is used in the `HAVING` clause but isn't being `SELECT`ed. And if I add `value` into the `SELECT`, I get an empty set back.
edanfalls
Try the updated version.
Jason McCreary
+2  A: 
SELECT TOP 1 id FROM table WHERE date = '2010-01-01' ORDER BY value DESC
Adam Robinson
Thanks. A quick search showed me that MySQL uses `LIMIT` instead of `TOP`, which works - my fault for not specifying MySQL in the question though. This is certainly a lot cleaner but it still takes ~0.20s to execute. So I'm thinking this is maybe more a table/indexing issue.
edanfalls
@edan: Sorry, yes, `LIMIT`. This is the fastest solution that I'm aware of, so all that's left is index tuning.
Adam Robinson
This is not good if you're trying to find the complete set of IDs that have equal values for that day. Example: you want to award the salesmen with the highest sales for that day. If two tie, you won't know, and you could have a lawsuit on your hands.
vol7ron
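To illustrate the tie concern raised in the comment above, here is a minimal sketch in MySQL syntax. The table name `daily_values` and the sample rows are hypothetical (the thread's `table` is a placeholder, and `TABLE` is a reserved word in MySQL). The first query is the `LIMIT` form of the `TOP 1` answer.

```sql
-- Hypothetical table and data, chosen so that two ids tie on 2010-01-01.
CREATE TABLE daily_values (
    id    INT  NOT NULL,
    date  DATE NOT NULL,
    value INT  NOT NULL,
    PRIMARY KEY (id, date)
);

INSERT INTO daily_values (id, date, value) VALUES
    (11, '2010-01-01', 150),
    (11, '2010-01-02', 100),
    (12, '2010-01-01', 150),
    (12, '2010-01-02', 200);

-- LIMIT form of the TOP 1 query: returns a single id even though two ids tie.
-- Which of the tied ids comes back is not guaranteed without a tie-breaker
-- in the ORDER BY.
SELECT id
FROM   daily_values
WHERE  date = '2010-01-01'
ORDER  BY value DESC
LIMIT  1;

-- Subquery form: returns every id that holds the maximum value for the day
-- (both 11 and 12 with the sample rows above).
SELECT id
FROM   daily_values
WHERE  date  = '2010-01-01'
  AND  value = (SELECT MAX(value)
                FROM   daily_values
                WHERE  date = '2010-01-01');
```

If only one winner is ever needed, the `LIMIT 1` form is fine; if ties matter, the subquery (or the equivalent join) is the safer shape.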
+1  A: 

Have you tried using a JOIN?

SELECT 
     A.id 
FROM 
     table A
     JOIN (SELECT MAX(value) as m_value
        FROM table 
        WHERE date = '2010-01-01') AS B ON A.value = B.m_value
WHERE 
     A.date =  '2010-01-01'
potatopeelings
Just to add: this will be a good option if you have multiple ids with the MAX value. If you are sure you'll have only one id with the MAX value, it will be better to use TOP (as in Adam Robinson's answer).
potatopeelings
Thanks, I hadn't thought to try a join as I thought it wasn't worth it for a single execution, but this also takes ~0.25s, which is quicker than I was expecting :) There will be multiple ids with the same MAX(value) as well, but I haven't decided whether this is important yet. So I may need to use this method :)
edanfalls
This should have the same effect as value = (select max(value) ...) -- the query planner will treat this join the same way as that subquery.
vol7ron
@edanfalls - just edited to remove the GROUP BY (it didn't hurt, but it's not required). @vol7ron - I think it would depend on the MySQL version (the new version converts subqueries to joins as if they are not correlated - http://datacharmer.blogspot.com/2008/09/drizzling-mysql.html - so the subquery will be the same as a join).
potatopeelings
**@potatopeelings:** I think `the new version converts subqueries to joins as if they are not correlated` is what I was saying. For this query, an uncorrelated join and an uncorrelated subquery will have the same query plan, and that plan is really as fast as it's going to get.
vol7ron
+1  A: 

Collection of Answers


Type the date once:

SELECT id
FROM   table
WHERE  (date,value) IN ( select date, max(value) 
                         from   table
                         where  date = '2010-01-01'
                         group by date
                       )


Date twice, no Group By (preferred method):

SELECT id
FROM   table
WHERE  date  = '2010-01-01'
  AND  value = ( select max(value)
                  from   table
                  where  date = '2010-01-01'
                )


Thoughts


  • The second should be the fastest.
  • For this problem, a query that joins against the MAX in the FROM clause is equivalent to one that uses a subquery in the WHERE clause.
  • Any query that uses LIMIT/TOP 1 may not return the full result set, which could negatively impact your application, depending on your requirements - you may want all IDs.
  • Other ways to improve speed:
    1. Create a stored procedure.
    2. Create an index on (date,value). ID shouldn't need an index in this case.
vol7ron
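The index suggestion in the last bullet could look something like the following in MySQL. The table name `daily_values` and the index name `idx_date_value` are hypothetical, and the table definition only mirrors the shape given in the question. Whether the optimizer actually picks the index, and how much time it saves, should be checked with `EXPLAIN` on the real table.

```sql
-- Hypothetical table shaped like the one in the question.
CREATE TABLE IF NOT EXISTS daily_values (
    id    INT  NOT NULL,
    date  DATE NOT NULL,
    value INT  NOT NULL,
    PRIMARY KEY (id, date)
);

-- Composite index: date first (the equality filter), then value
-- (the column that MAX() and the outer comparison read).
CREATE INDEX idx_date_value ON daily_values (date, value);

-- Inspect the plan for the preferred query to see whether the index is chosen.
EXPLAIN
SELECT id
FROM   daily_values
WHERE  date  = '2010-01-01'
  AND  value = (SELECT MAX(value)
                FROM   daily_values
                WHERE  date = '2010-01-01');
```

With the primary key on (id, date), a filter on date alone does not match the key's leading column, which is one plausible explanation for the ~0.2s timings reported in the thread. An index that starts with date gives both the outer filter and the MAX lookup a matching prefix.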