views:

24

answers:

1

I have a jobs table, and am trying to get a count of jobs for different time frames. My current query looks like this:

SELECT COUNT(*) AS 'count', 
       WEEK(j.created_at) AS 'week',
       MONTH(j.created_at) AS 'month', 
       YEAR(j.created_at) AS 'year', 
       DATE_FORMAT(j.created_at, '%y') AS 'short_year'
FROM jobs j WHERE j.state <> 'draft' 
            AND created_at > '2010-06-21'
            AND created_at < '2010-08-01'
GROUP BY WEEK(j.created_at)
ORDER BY WEEK(j.created_at)

To change my timeframe, I simply change the GROUP BY from WEEK to MONTH, and I get counts by month instead of week.

The problem is that I am not getting empty rows for weeks with 0 jobs. My result set from the query above is:

count week  month  year short_year
    3   25      6  2010         10
    2   26      6  2010         10
    2   27      7  2010         10 
    1   28      7  2010         10
    3   30      7  2010         10

You'll notice that there is no data for week 29, which should be a row with count(0). Is there any way to get that 0 count row, while maintaining the flexibility of changing my grouping between WEEK and MONTH?

+1  A: 

Create an auxiliary calendar table containing every date and outer join onto that (the below may need tweaking if created_at has a time component)

SELECT COUNT(*) AS 'count', 
       WEEK(c.date) AS 'week',
       MONTH(c.date) AS 'month', 
       YEAR(c.date) AS 'year', 
       DATE_FORMAT(c.date, '%y') AS 'short_year'
FROM calendar c
LEFT OUTER JOIN jobs j ON j.created_at = c.date
            AND j.state <> 'draft' 
WHERE c.date > '2010-06-21'
            AND c.date < '2010-08-01'
GROUP BY WEEK(c.date)
ORDER BY WEEK(c.date)
Martin Smith
We don't have a calendar table with data like that, and creating one would require significant changes to the data structure of the app that I don't have time to implement. If I change all your `c.date` to `c.created_at`, and use `jobs` instead of `calendar`, it returns the same dataset as my original query.
Mike Trpcic
You need to **create** the auxiliary calendar table first.
Martin Smith
See my modified comment above. That would rerquire a significant amount of changes to the Rails application serving this data, and I don't have time to implement those changes.
Mike Trpcic
Well there's nothing in `SQL` that will help you add in the missing rows if they are not in the data.
Martin Smith
I may have misunderstood your answer. Your calendar table has a record for every day, not just the days that jobs were created, correct?
Mike Trpcic
@Mike - Yes. One record for each day which might feasibly ever be included in the query.
Martin Smith
I've implemented the calendar, but there is still a problem. I'm now getting a result that says there are 7 results for the week of 29, when the count was previously 0. I'm using your exact query from above.
Mike Trpcic
@Mike - Ah use `COUNT(j.created_at)` instead of `*`
Martin Smith
Doing that returns a count of 0 for ALL rows. Oh the humanity.
Mike Trpcic
Does `created_at` have a time component meaning that the exact equality join won't work?
Martin Smith
Indeed it does. Is there a MySQL function that can fix this issue?
Mike Trpcic
Ahh, the DATE() function! Thanks for all the help and time, it appears to be working now!
Mike Trpcic
You might find it improves performance to use `j.created_at >= c.date and j.created_at < DATE_ADD(c.date, INTERVAL 1 DAY);` Alternatively it might not!
Martin Smith