views:

5819

answers:

7

Using standard mysql functions is there a way to write a query that will return a list of days between two dates.

eg given 2009-01-01 and 2009-01-13 it would return a one column table with the values:

  • 2009-01-01
  • 2009-01-02
  • 2009-01-03
  • 2009-01-04
  • 2009-01-05
  • 2009-01-06
  • 2009-01-07
  • 2009-01-08
  • 2009-01-09
  • 2009-01-10
  • 2009-01-11
  • 2009-01-12
  • 2009-01-13

Edit: It appears I have not been clear. I want to GENERATE this list. I have values stored in the database (by datetime) but want them to be aggregated on a left outer join to a list of dates as above (I am expecting null from the right side of some of this join for some days and will handle this).

A: 

Create a stored procedure which takes two parameters a_begin and a_end. Create a temporary table within it called t, declare a variable d, assign a_begin to d, and run a WHILE loop INSERTing d into t and calling ADDDATE function to increment the value d. Finally SELECT * FROM t.

eed3si9n
Actually I believe a permanent table would be better for speed of execution (has minimal storage requirements). It's faster to pre-calculate the dates into a table and extract when needed, than to create ranges every time you need them.
paxdiablo
@Pax, it depends if the gained speed improvement is worth the table maintenance. For example, if the report generation takes 3 seconds and running WHILE loop to generate dates takes 0.001 seconds, I'd argue that the performance gain is ignorable. Knuth: Premature optimization is root of all evil.
eed3si9n
@eed3si9n, premature optimization, yes, not all optimization :-) If your example were correct then yes, I'd agree with you, but it should be measured for certainty. Table maint is gone if you generate it 1000 years out (a one time cost) as I suggested but those seconds will add up.
paxdiablo
A: 

You can use MySQL's user variables like this:

SET @num = -1;
SELECT DATE_ADD( '2009-01-01', interval @num := @num+1 day) AS date_sequence, 
your_table.* FROM your_table
WHERE your_table.other_column IS NOT NULL
HAVING DATE_ADD('2009-01-01', interval @num day) <= '2009-01-13'

@num is -1 because you add to it the first time you use it. Also, you can't use "HAVING date_sequence" because that makes the user variable increment twice for each row.

Andrew Vit
+4  A: 

We had a similar problem with BIRT reports in that we wanted to report on those days that had no data. Since there were no entries for those dates, the easiest solution for us was to create a simple table that stored all dates and use that to get ranges or join to get zero values for that date.

We have a job that runs every month to ensure that the table is populated 5 years out into the future. The table is created thus:

create table all_dates (
    dt date primary key
);

No doubt there are magical tricky ways to do this with different DBMS' but we always opt for the simplest solution. The storage requirements for the table are minimal and it makes the queries so much simpler and portable. This sort of solution is almost always better from a performance point-of-view since it doesn't require per-row calculations on the data.

The other option (and we've used this before) is to ensure there's an entry in the table for every date. We swept the table periodically and added zero entries for dates and/or times that didn't exist. This may not be an option in your case, it depends on the data stored.

If you really think it's a hassle to keep the all_dates table populated, a stored procedure is the way to go which will return a dataset containing those dates. This will almost certainly be slower since you have to calculate the range every time it's called rather than just pulling pre-calculated data from a table.

But, to be honest, you could populate the table out for 1000 years without any serious data storage problems - 365,000 16-byte (for example) dates plus an index duplicating the date plus 20% overhead for safety, I'd roughly estimate at about 14M [365,000 * 16 * 2 * 1.2 = 14,016,000 bytes]), a minuscule table in the scheme of things.

paxdiablo
A: 

Borrowing an idea from this answer, you can set up a table with 0 through 9 and use that to generate your list of dates.

CREATE TABLE num (i int);
INSERT INTO num (i) VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9);

select adddate('2009-01-01', numlist.id) as `date` from
(SELECT n1.i + n10.i*10 + n100.i*100 AS id
   FROM num n1 cross join num as n10 cross join num as n100) as numlist
where adddate('2009-01-01', numlist.id) <= '2009-01-13';

This will allow you to generate a list of up to 1000 dates. If you need to go larger, you can add another cross join to the inner query.

Logan5
+2  A: 

I would use this stored procedure to generate the intervals you need into the temp table named time_intervals, then JOIN and aggregate your data table with the temp time_intervals table.

The procedure can generate intervals of all the different types you see specified in it:

call make_intervals('2009-01-01 00:00:00','2009-01-10 00:00:00',1,'DAY')
.
select * from time_intervals  
.
interval_start      interval_end        
------------------- ------------------- 
2009-01-01 00:00:00 2009-01-01 23:59:59 
2009-01-02 00:00:00 2009-01-02 23:59:59 
2009-01-03 00:00:00 2009-01-03 23:59:59 
2009-01-04 00:00:00 2009-01-04 23:59:59 
2009-01-05 00:00:00 2009-01-05 23:59:59 
2009-01-06 00:00:00 2009-01-06 23:59:59 
2009-01-07 00:00:00 2009-01-07 23:59:59 
2009-01-08 00:00:00 2009-01-08 23:59:59 
2009-01-09 00:00:00 2009-01-09 23:59:59 
.
call make_intervals('2009-01-01 00:00:00','2009-01-01 02:00:00',10,'MINUTE')
. 
select * from time_intervals
.  
interval_start      interval_end        
------------------- ------------------- 
2009-01-01 00:00:00 2009-01-01 00:09:59 
2009-01-01 00:10:00 2009-01-01 00:19:59 
2009-01-01 00:20:00 2009-01-01 00:29:59 
2009-01-01 00:30:00 2009-01-01 00:39:59 
2009-01-01 00:40:00 2009-01-01 00:49:59 
2009-01-01 00:50:00 2009-01-01 00:59:59 
2009-01-01 01:00:00 2009-01-01 01:09:59 
2009-01-01 01:10:00 2009-01-01 01:19:59 
2009-01-01 01:20:00 2009-01-01 01:29:59 
2009-01-01 01:30:00 2009-01-01 01:39:59 
2009-01-01 01:40:00 2009-01-01 01:49:59 
2009-01-01 01:50:00 2009-01-01 01:59:59 
.
I specified an interval_start and interval_end so you can aggregate the 
data timestamps with a "between interval_start and interval_end" type of JOIN.
.
Code for the proc:
.
-- drop procedure make_intervals
.
CREATE PROCEDURE make_intervals(startdate timestamp, enddate timestamp, intval integer, unitval varchar(10))
BEGIN
-- *************************************************************************
-- Procedure: make_intervals()
--    Author: Ron Savage
--      Date: 02/03/2009
--
-- Description:
-- This procedure creates a temporary table named time_intervals with the
-- interval_start and interval_end fields specifed from the startdate and
-- enddate arguments, at intervals of intval (unitval) size.
-- *************************************************************************
   declare thisDate timestamp;
   declare nextDate timestamp;
   set thisDate = startdate;

   -- *************************************************************************
   -- Drop / create the temp table
   -- *************************************************************************
   drop temporary table if exists time_intervals;
   create temporary table if not exists time_intervals
      (
      interval_start timestamp,
      interval_end timestamp
      );

   -- *************************************************************************
   -- Loop through the startdate adding each intval interval until enddate
   -- *************************************************************************
   repeat
      select
         case unitval
            when 'MICROSECOND' then timestampadd(MICROSECOND, intval, thisDate)
            when 'SECOND'      then timestampadd(SECOND, intval, thisDate)
            when 'MINUTE'      then timestampadd(MINUTE, intval, thisDate)
            when 'HOUR'        then timestampadd(HOUR, intval, thisDate)
            when 'DAY'         then timestampadd(DAY, intval, thisDate)
            when 'WEEK'        then timestampadd(WEEK, intval, thisDate)
            when 'MONTH'       then timestampadd(MONTH, intval, thisDate)
            when 'QUARTER'     then timestampadd(QUARTER, intval, thisDate)
            when 'YEAR'        then timestampadd(YEAR, intval, thisDate)
         end into nextDate;

      insert into time_intervals select thisDate, timestampadd(MICROSECOND, -1, nextDate);
      set thisDate = nextDate;
   until thisDate >= enddate
   end repeat;

 END;

Similar example data scenario at the bottom of this post, where I built a similar function for SQL Server.

Ron

Ron Savage
A: 

Typically one would use an auxiliary numbers table you usually keep around for just this purpose with some variation on this:

SELECT *
FROM (
    SELECT DATEADD(d, number - 1, '2009-01-01') AS dt
    FROM Numbers
    WHERE number BETWEEN 1 AND DATEDIFF(d, '2009-01-01', '2009-01-13') + 1
) AS DateRange
LEFT JOIN YourStuff
    ON DateRange.dt = YourStuff.DateColumn

I've seen variations with table-valued functions, etc.

You can also keep a permanent list of dates. We have that in our data warehouse as well as a list of times of day.

Cade Roux
A: 

I would use something similar to this:

DECLARE @DATEFROM AS DATETIME
DECLARE @DATETO AS DATETIME
DECLARE @HOLDER AS TABLE(DATE DATETIME)

SET @DATEFROM = '2010-08-10'
SET @DATETO = '2010-09-11'

INTO
    @HOLDER
        (DATE)
VALUES
    (@DATEFROM)

WHILE @DATEFROM < @DATETO
BEGIN

    SELECT @DATEFROM = DATEADD(D, 1, @DATEFROM)
    INSERT 
    INTO
        @HOLDER
            (DATE)
    VALUES
        (@DATEFROM)
END

SELECT 
    DATE
FROM
    @HOLDER

Then the @HOLDER Variable table holds all the dates incremented by day between those two dates, ready to join at your hearts content.

Blue