views:

67

answers:

2

I have a table with 2 columns, date and score. It has at most 30 entries, for each of the last 30 days one.

date      score
-----------------
1.8.2010  19
2.8.2010  21
4.8.2010  14
7.8.2010  10
10.8.2010 14

My problem is that some dates are missing - I want to see:

date      score
-----------------
1.8.2010  19
2.8.2010  21
3.8.2010  0
4.8.2010  14
5.8.2010  0
6.8.2010  0
7.8.2010  10
...

What I need from the single query is to get: 19,21,9,14,0,0,10,0,0,14... That means that the missing dates are filled with 0.

I know how to get all the values and in server side language iterating through dates and missing the blanks. But is this possible to do in mysql, so that I sort the result by date and get the missing pieces...

Thanx

Jerry

EDIT: In this table there is another column named UserID, so I have 30.000 users and some of them have the score in this table. I delete the dates every day if date < 30 days ago because I need last 30 days score for each user. The reason is I am making a graph of the user activity over the last 30 days and to plot a chart I need the 30 values separated by comma. So I can say in query get me the USERID=10203 activity and the query would get me the 30 scores, one for each of the last 30 days. I hope I am more clear now.

+2  A: 

MySQL doesn't have recursive functionality, so you're left with using the NUMBERS table trick -

  1. Create a table that only holds incrementing numbers - easy to do using an auto_increment:

    DROP TABLE IF EXISTS `example`.`numbers`;
    CREATE TABLE  `example`.`numbers` (
      `id` int(10) unsigned NOT NULL auto_increment,
       PRIMARY KEY  (`id`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
    
  2. Populate the table using:

    INSERT INTO NUMBERS
      (id)
    VALUES
      (NULL)
    

    ...for as many values as you need.

  3. Use DATE_ADD to construct a list of dates, increasing the days based on the NUMBERS.id value. Replace "2010-06-06" and "2010-06-14" with your respective start and end dates (but use the same format, YYYY-MM-DD) -

    SELECT x.*
      FROM (SELECT DATE_ADD('2010-06-06', INTERVAL n.id - 1 DAY)
              FROM numbers n
             WHERE DATE_ADD('2010-06-06', INTERVAL n.id -1 DAY) <= '2010-06-14 ) x
    
  4. LEFT JOIN onto your table of data based on the time portion:

       SELECT x.ts AS timestamp,
              COALESCE(y.score, 0) AS cnt
         FROM (SELECT DATE_FORMAT(DATE_ADD('2010-06-06', INTERVAL n.id - 1 DAY), '%m/%d/%Y') AS ts
                 FROM numbers n
                WHERE DATE_ADD('2010-06-06', INTERVAL n.id - 1 DAY) <= '2010-06-14') x
    LEFT JOIN TABLE y ON STR_TO_DATE(y.date, '%d.%m.%Y') = x.ts
    

If you want to maintain the date format, use the DATE_FORMAT function:

DATE_FORMAT(x.ts, '%d.%m.%Y') AS timestamp
OMG Ponies
Thank you. Is this a fast operation of would you advise against using such approach and going server side calculation?
Jerry2
@Jerry2: My preference is to do as much data processing in the database, short of really involved presentation stuff. I don't envy doing this in application code, just as long as it's *one trip to the database*...
OMG Ponies
+1  A: 

You can accomplish this by using a Calendar Table. That's a table which you create once and fill with a date range (e.g. one dataset for each day 2000-2050; that depends on your data). Then you can make an outer join of your table against the calendar table. If a date is missing in your table, you return 0 for the score.

You can take a look here for examples: Why should I consider using an auxiliary calendar table?

Soundlink
True, but a numbers table is more flexible - see my answer for an example. IE: what if now you need sequential numbers too? Do you want a table per data type?
OMG Ponies
Needing sequential numbers would be another use case ;-) If you have to target different DBMS (i.e. Oracle, MySQL, SQL-Server) your approach would need a slightly modified statement, and I suspect that the DATE_ADD approach is slower than a calendar table (but I think that's not relevant here)
Soundlink
On [SQL Server 2005, a recursive CTE was *barely* faster than the NUMBERS table trick](http://stackoverflow.com/questions/1478951/tsql-generate-a-resultset-of-incrementing-dates/1479028#1479028)
OMG Ponies