I couldn't frame the question's title properly. Suppose a table of weekly movie earnings as below:

MovieName  <Varchar(450)>
MovieGross <Decimal(18)>
WeekofYear <Integer>
Year       <Integer>

How do I get the name of the top-grossing movie for each week of this year? If I do:

    select MovieName, Max(MovieGross), WeekofYear
    from earnings where year = 2010 group by WeekofYear;

then obviously the query won't run, while

    select Max(MovieName), Max(MovieGross), WeekofYear
    from earnings where year = 2010 group by WeekofYear;

would just return the alphabetically last movie name for each week, not the name of the top grosser. Is using group_concat() and then substring_index() the only option here?

    select 
       substring_index(group_concat(MovieName order by MovieGross desc),',',1),
       Max(MovieGross) , WeekofYear from earnings where year = 2010
    group by WeekofYear ;

Seems clumsy. Is there any better way of achieving this?

A: 

You need to determine the max weekly gross and then select the movie name based on that criterion. Something like this:

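SELECT e.MovieName, m.Gross, m.WeekofYear
FROM earnings e JOIN
  (SELECT MAX(MovieGross) AS Gross, WeekofYear
    FROM earnings WHERE `year` = 2010 GROUP BY WeekofYear) m
ON e.MovieGross = m.Gross AND e.WeekofYear = m.WeekofYear
-- filter the outer rows too, so a row from another year with the same week and gross cannot match
WHERE e.`year` = 2010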
dnagirl
+2  A: 

It's the ever-recurring max-per-group problem. You solve it by selecting the defining properties of your group and then joining your "real" data against that.

select 
  e.MovieName, 
  e.MovieGross,
  e.WeekofYear 
from 
  earnings e
  inner join (
      select Max(MovieGross) MovieGross, Year, WeekofYear
        from earnings
    group by Year, WeekofYear
  ) max on max.Year       = e.Year 
       and max.WeekofYear = e.WeekofYear 
       and max.MovieGross = e.MovieGross
where
  e.year = 2010

The defining properties of your group are Year, WeekofYear and MAX(MovieGross). The subquery returns exactly one row of these values per group.

An INNER JOIN against your data table eliminates all rows that do not match the defining properties of a group. It also lets through every row that does match, so you could end up with two movies that made the same amount of money in a particular week. Group the "outer" query again to eliminate the duplicate rows in favor of a single movie.
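
As a minimal sketch of that last step: the sample rows below are made up for illustration, the derived table is aliased `m`, and using MIN(MovieName) as the tie-breaker is an assumption (any deterministic rule would do). Grouping the outer query collapses tied movies into one row per week:

```sql
-- Hypothetical sample data; the table layout follows the question.
CREATE TABLE earnings (
  MovieName  VARCHAR(450),
  MovieGross DECIMAL(18),
  WeekofYear INTEGER,
  Year       INTEGER
);

INSERT INTO earnings (MovieName, MovieGross, WeekofYear, Year) VALUES
  ('Alpha', 100, 1, 2010),
  ('Beta',  100, 1, 2010),  -- ties with Alpha in week 1
  ('Gamma',  50, 1, 2010),
  ('Delta',  80, 2, 2010);

-- Group the outer query so that each week yields exactly one row.
-- MIN(MovieName) is an arbitrary but deterministic tie-breaker (assumption).
SELECT
  MIN(e.MovieName) AS MovieName,
  e.MovieGross,
  e.WeekofYear
FROM earnings e
INNER JOIN (
  SELECT MAX(MovieGross) AS MovieGross, Year, WeekofYear
  FROM earnings
  GROUP BY Year, WeekofYear
) m ON m.Year       = e.Year
   AND m.WeekofYear = e.WeekofYear
   AND m.MovieGross = e.MovieGross
WHERE e.Year = 2010
GROUP BY e.WeekofYear, e.MovieGross
ORDER BY e.WeekofYear;
```

With these rows, week 1 comes back as Alpha (Beta is dropped by the tie-breaker) and week 2 as Delta.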

Tomalak
A: 

Ok, trying again with the having clause, I cannot help myself.

Hopefully this will help you get started. Join each row to all rows from the same week, then use HAVING to keep only the rows whose gross equals the maximum for that week.

SELECT e.MovieName, e.MovieGross AS max_gross, e.WeekofYear
FROM earnings e
JOIN earnings w ON w.year = e.year AND w.WeekofYear = e.WeekofYear
WHERE e.year = 2010
GROUP BY e.MovieName, e.MovieGross, e.WeekofYear
HAVING e.MovieGross = MAX(w.MovieGross)
ORDER BY e.WeekofYear

This should return the top grossing movie for each week. This should also return multiple entries for a week in the event of a tie.

Jacob

TheJacobTaylor
This query is consistently slower than the group_concat / substring_index approach (1.1 sec vs 1.5 sec) on a table with ~250,000 rows matched by the where clause, but it does look a ton neater.
Gala101
Thank you for testing it. Do you happen to have the explain plan for the query and a show create table for the table? Depending on the indexes, I would expect it to be really fast.
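
For reference, a minimal sketch of how that information could be gathered. The index is only an assumption about what might help here: the name idx_year_week_gross is hypothetical, and whether the optimizer actually uses it depends on the data and the MySQL version.

```sql
-- Table layout as given in the question.
CREATE TABLE earnings (
  MovieName  VARCHAR(450),
  MovieGross DECIMAL(18),
  WeekofYear INTEGER,
  Year       INTEGER
);

-- Assumed composite index: filter column first, then the grouping column,
-- then the aggregated column. The name is hypothetical.
CREATE INDEX idx_year_week_gross ON earnings (Year, WeekofYear, MovieGross);

-- Show the execution plan for the per-week maximum.
EXPLAIN
SELECT MAX(MovieGross), WeekofYear
FROM earnings
WHERE Year = 2010
GROUP BY WeekofYear;

-- Show the table definition, including its indexes.
SHOW CREATE TABLE earnings;
```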
TheJacobTaylor
A: 

Hello,

This is a pretty fast query that does the job:

SELECT e.WeekofYear AS WeekofYear
, MAX(e.MovieGross) AS MovieGross
, (SELECT MovieName FROM earnings
   -- restrict the lookup to the same year, otherwise another year's movie could be returned
   WHERE year = 2010 AND WeekofYear = e.WeekofYear
   ORDER BY MovieGross DESC LIMIT 1
) AS MovieName
FROM earnings AS e
WHERE e.year = 2010
GROUP BY e.WeekofYear
ORDER BY e.WeekofYear;

Happy to help you :)

P.S. and thanks for ratings ;)

Sergej Jevsejev