



I have a table with two columns, date and score. It has at most 30 entries, one for each of the last 30 days.

date      score
1.8.2010  19
2.8.2010  21
4.8.2010  14
7.8.2010  10
10.8.2010 14

My problem is that some dates are missing - I want to see:

date      score
1.8.2010  19
2.8.2010  21
3.8.2010  0
4.8.2010  14
5.8.2010  0
6.8.2010  0
7.8.2010  10

What I need from a single query is: 19,21,0,14,0,0,10,0,0,14... That is, the missing dates are filled with 0.

I know how to fetch all the values and, in a server-side language, iterate through the dates and fill in the blanks. But is it possible to do this in MySQL, so that I sort the result by date and get the missing pieces filled in?



EDIT: This table has another column named UserID. I have 30,000 users, and some of them have scores in this table. Every day I delete the rows older than 30 days, because I only need the last 30 days of scores for each user. The reason is that I am making a graph of user activity over the last 30 days, and to plot the chart I need the 30 values separated by commas. So I want to ask for the activity of, say, USERID=10203 and have the query return 30 scores, one for each of the last 30 days. I hope this is clearer now.

+2  A: 

MySQL doesn't have recursive functionality (recursive CTEs only arrived in MySQL 8.0), so you're left with using the NUMBERS table trick -

  1. Create a table that only holds incrementing numbers - easy to do using an auto_increment:

    DROP TABLE IF EXISTS `example`.`numbers`;
    CREATE TABLE  `example`.`numbers` (
      `id` int(10) unsigned NOT NULL auto_increment,
       PRIMARY KEY  (`id`)
    );

  2. Populate the table using:

    INSERT INTO `example`.`numbers` (`id`) VALUES (NULL);

    ...for as many values as you need.

  3. Use DATE_ADD to construct a list of dates, increasing the days based on the value. Replace "2010-06-06" and "2010-06-14" with your respective start and end dates (but use the same format, YYYY-MM-DD) -

    SELECT x.*
      FROM (SELECT DATE_ADD('2010-06-06', INTERVAL n.id - 1 DAY) AS ts
              FROM numbers n
             WHERE DATE_ADD('2010-06-06', INTERVAL n.id - 1 DAY) <= '2010-06-14') x

  4. LEFT JOIN onto your table of data based on the time portion:

       -- x.ts is kept as a date (not a formatted string) so it compares correctly in the join
       SELECT x.ts AS timestamp,
              COALESCE(y.score, 0) AS cnt
         FROM (SELECT DATE_ADD('2010-06-06', INTERVAL n.id - 1 DAY) AS ts
                 FROM numbers n
                WHERE DATE_ADD('2010-06-06', INTERVAL n.id - 1 DAY) <= '2010-06-14') x
    LEFT JOIN your_table y ON STR_TO_DATE(y.date, '%d.%m.%Y') = x.ts  -- your_table: placeholder for your data table

If you want to maintain the date format, use the DATE_FORMAT function:

    DATE_FORMAT(x.ts, '%d.%m.%Y') AS timestamp
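
For the per-user case in the question's EDIT, the same trick can be pointed at the last 30 days and collapsed into one comma-separated string. This is only a sketch under assumptions that are not in the original: the data table is called `scores` (a hypothetical name), its `date` column is a real DATE (wrap it in STR_TO_DATE if it is stored as text), there is at most one row per user per day, and `numbers` holds at least the ids 1 to 30.

```sql
-- Hypothetical names: `scores` is the data table, `date` is assumed to be a DATE column.
-- Builds the last 30 days (today back to 29 days ago) from the numbers table,
-- then returns one string of 30 scores, oldest day first.
SELECT GROUP_CONCAT(COALESCE(y.score, 0) ORDER BY x.ts SEPARATOR ',') AS last_30_days
  FROM (SELECT DATE_SUB(CURDATE(), INTERVAL n.id - 1 DAY) AS ts
          FROM numbers n
         WHERE n.id <= 30) x
  LEFT JOIN scores y ON y.`date` = x.ts
                    AND y.UserID = 10203;  -- user filter belongs in ON, not WHERE
```

The user filter sits in the ON clause on purpose: moving it to WHERE would discard the days on which that user has no row, which are exactly the days that should come back as 0.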
OMG Ponies
Thank you. Is this a fast operation, or would you advise against this approach in favor of doing the calculation server side?
@Jerry2: My preference is to do as much data processing in the database, short of really involved presentation stuff. I don't envy doing this in application code, just as long as it's *one trip to the database*...
OMG Ponies
+1  A: 

You can accomplish this by using a Calendar Table. That's a table which you create once and fill with a date range (e.g. one row for each day from 2000 to 2050; the range depends on your data). Then you can outer join the calendar table to your table. Where a date is missing from your table, you return 0 for the score.

You can take a look here for examples: Why should I consider using an auxiliary calendar table?
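
A rough sketch of what that could look like in MySQL. The table and column names (`calendar`, `cal_date`, `scores`) are made up for illustration, `scores.date` is assumed to be a DATE column, and the fill step borrows a pre-populated `numbers` table (ids starting at 1) purely as one convenient way to generate the rows:

```sql
-- Hypothetical calendar table: one row per day, created once.
CREATE TABLE calendar (
  cal_date DATE NOT NULL,
  PRIMARY KEY (cal_date)
);

-- One possible way to fill it: 365 days starting at 2010-01-01,
-- assuming a numbers table with ids 1..365 already exists.
INSERT INTO calendar (cal_date)
SELECT DATE_ADD('2010-01-01', INTERVAL n.id - 1 DAY)
  FROM numbers n
 WHERE n.id <= 365;

-- Outer join from the calendar so every day appears; missing scores become 0.
SELECT c.cal_date,
       COALESCE(s.score, 0) AS score
  FROM calendar c
  LEFT JOIN scores s ON s.`date` = c.cal_date
 WHERE c.cal_date BETWEEN '2010-08-01' AND '2010-08-10'
 ORDER BY c.cal_date;
```

The date range goes in the WHERE clause because it restricts the calendar side; any condition on the `scores` side would have to go in the ON clause to keep the empty days.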

True, but a numbers table is more flexible - see my answer for an example. IE: what if now you need sequential numbers too? Do you want a table per data type?
OMG Ponies
Needing sequential numbers would be another use case ;-) If you have to target different DBMS (i.e. Oracle, MySQL, SQL-Server) your approach would need a slightly modified statement, and I suspect that the DATE_ADD approach is slower than a calendar table (but I think that's not relevant here)
On SQL Server 2005, a recursive CTE was *barely* faster than the NUMBERS table trick
OMG Ponies