views:

195

answers:

2

I have a table with sales per store as follows:

SQL> select * from sales;

        ID ID_STORE DATE       TOTAL
---------- -------- ---------- -------------------------------
         1        1 2010-01-01    500.00
         2        1 2010-01-02    185.00
         3        1 2010-01-03    135.00
         4        1 2009-01-01    165.00
         5        1 2009-01-02    175.00
         6        5 2010-01-01    130.00
         7        5 2010-01-02    135.00
         8        5 2010-01-03    130.00
         9        6 2010-01-01    100.00
         10       6 2010-01-02     12.00
         11       6 2010-01-03     85.00
         12       6 2009-01-01    135.00
         13       6 2009-01-02    400.00
         14       6 2009-01-07     21.00
         15       6 2009-01-08     45.00
         16       8 2009-01-09    123.00
         17       8 2009-01-10    581.00

17 rows selected.

What I need to do is to compare two date ranges within that table. Lets say I need to know the differences in sales between 01 Jan 2009 to 10 Jan 2009 AGAINST 01 Jan 2010 to 10 Jan 2010.

I'd like to build a query that returns something like this:

ID_STORE_A DATE_A     TOTAL_A   ID_STORE_B DATE_B     TOTAL_B
---------- ---------- --------- ---------- ---------- -------------------
         1 2010-01-01    500.00          1 2009-01-01    165.00
         1 2010-01-02    185.00          1 2009-01-02    175.00
         1 2010-01-03    135.00          1 NULL          NULL

         5 2010-01-01    130.00          5 NULL          NULL
         5 2010-01-02    135.00          5 NULL          NULL
         5 2010-01-03    130.00          5 NULL          NULL

         6 2010-01-01    100.00          6 2009-01-01    135.00
         6 2010-01-02     12.00          6 2009-01-02    400.00
         6 2010-01-03     85.00          6 NULL          NULL
         6 NULL          NULL            6 2009-01-07     21.00
         6 NULL          NULL            6 2009-01-08     45.00
         6 NULL          NULL            8 2009-01-09    123.00
         6 NULL          NULL            8 2009-01-10    581.00

So, even if there are no sales in one range or another, it should just fill the empty space with NULL.

So far, I've come up with this quick query, but I the "dates" from sales to sales2 sometimes are different in each row:

SELECT sales.*, sales2.*
  FROM sales
  LEFT JOIN sales AS sales2 
    ON (sales.id_store=sales2.id_store)
 WHERE sales.date >= '2010-01-01' 
   AND sales.date <= '2010-01-10' 
   AND sales2.date >= '2009-01-01' 
   AND sales2.date <= '2009-01-10' 
ORDER BY sales.id_store ASC, sales.date ASC, sales2.date ASC

What am I missing?

+1  A: 

I think the problem is in your join condition. I haven't tested it but I think you could try sth like

... ON (    sales.id_store = sales2.id_store 
        AND sales.date = ADDDATE(sales2.date, INTERVAL 1 YEAR) 
        ...
       )
PierrOz
+2  A: 

Using IBM Informix Dynamic Server 11.50.FC6, I can use this SQL sequence to get the result you require:

Setup

CREATE TABLE sales
(
    id       INTEGER NOT NULL,
    id_store INTEGER NOT NULL,
    date     DATE NOT NULL,
    total    DECIMAL(10,2) NOT NULL
);

INSERT INTO sales VALUES( 1, 1, '2010-01-01', 500.00);
INSERT INTO sales VALUES( 2, 1, '2010-01-02', 185.00);
INSERT INTO sales VALUES( 3, 1, '2010-01-03', 135.00);
INSERT INTO sales VALUES( 4, 1, '2009-01-01', 165.00);
INSERT INTO sales VALUES( 5, 1, '2009-01-02', 175.00);
INSERT INTO sales VALUES( 6, 5, '2010-01-01', 130.00);
INSERT INTO sales VALUES( 7, 5, '2010-01-02', 135.00);
INSERT INTO sales VALUES( 8, 5, '2010-01-03', 130.00);
INSERT INTO sales VALUES( 9, 6, '2010-01-01', 100.00);
INSERT INTO sales VALUES(10, 6, '2010-01-02',  12.00);
INSERT INTO sales VALUES(11, 6, '2010-01-03',  85.00);
INSERT INTO sales VALUES(12, 6, '2009-01-01', 135.00);
INSERT INTO sales VALUES(13, 6, '2009-01-02', 400.00);
INSERT INTO sales VALUES(14, 6, '2009-01-07',  21.00);
INSERT INTO sales VALUES(15, 6, '2009-01-08',  45.00);
INSERT INTO sales VALUES(16, 8, '2009-01-09', 123.00);
INSERT INTO sales VALUES(17, 8, '2009-01-10', 581.00);

Query

SELECT *
  FROM (SELECT s1.id AS s1id,
               NVL(s1.id_store, s2.id_store) AS s1store,
               NVL(s1.date, MDY(MONTH(s2.date), DAY(s2.date),
                                YEAR(s2.date)+1)) AS s1date,
               s1.total AS s1total,
               s2.id AS s2id,
               NVL(s2.id_store, s1.id_store) AS s2store,
               NVL(s2.date, MDY(MONTH(s1.date), DAY(s1.date),
                                YEAR(s1.date)-1)) AS s2date,
               s2.total AS s2total
          FROM sales AS s1 FULL JOIN sales AS s2
            ON s1.id_store = s2.id_store
           AND s1.date BETWEEN '2010-01-01' AND '2010-01-10'
           AND s2.date BETWEEN '2009-01-01' AND '2009-01-10'
           AND DAY(s1.date)   = DAY(s2.date)
           AND MONTH(s1.date) = MONTH(s2.date)
       ) AS s3
 WHERE s1_date BETWEEN '2010-01-01' AND '2010-01-10'
   AND s2_date BETWEEN '2009-01-01' AND '2009-01-10'
 ORDER BY s1_id_store ASC, s1_date ASC;

Result

s1id s1store  s1date     s1total  s2id s2store  s2date     s2total
 1       1    2010-01-01  500.00   4       1    2009-01-01  165.00
 2       1    2010-01-02  185.00   5       1    2009-01-02  175.00
 3       1    2010-01-03  135.00           1    2009-01-03             
 6       5    2010-01-01  130.00           5    2009-01-01             
 7       5    2010-01-02  135.00           5    2009-01-02             
 8       5    2010-01-03  130.00           5    2009-01-03             
 9       6    2010-01-01  100.00  12       6    2009-01-01  135.00
10       6    2010-01-02   12.00  13       6    2009-01-02  400.00
11       6    2010-01-03   85.00           6    2009-01-03             
         6    2010-01-07          14       6    2009-01-07   21.00
         6    2010-01-08          15       6    2009-01-08   45.00
         8    2010-01-09          16       8    2009-01-09  123.00
         8    2010-01-10          17       8    2009-01-10  581.00

Explanation

It took a fair amount of experimentation to get this 'right'. Informix has a DATE constructor function MDY() which takes three integer arguments: the month, day and year (the name is mnemonic). It also has three analysis functions: DAY(), MONTH() and YEAR() which return the day, month and year of the date argument. The inner query with the FULL JOIN gives you the results with nulls on both left and right sides. The 5-part criterion in the ON clause seems to be necessary; otherwise, the criteria in the outer query has to be more complex and confusing - if it can be made to work at all. Then the criteria in the outer selection ensure that the right data is chosen. One advantage of the NVL() expressions in the inner query is that the store ID columns are both the same and not null and neither date column is null, so the order by clause can be simpler - on store ID and either date column.

In Informix, it would also be possible to to rework the date expressions as:

NVL(s1.date, s2.date + 1 UNITS YEAR)
NVL(s2.date, s1.date - 1 UNITS YEAR)

There are actually multiple type conversions going on behind the scenes with that notation, but it gives you the same result and the extra calculation is probably not all that significant.

There is also a glitch in waiting in Informix; you cannot add or subtract 1 year to or from any February 29th - because there is no 29th February in the following or previous year. You would need to be careful with your data; if you're not, you could end up comparing the data for 2008-02-29 with 2009-02-28 (as well as comparing the data for 2008-02-28 with 2009-02-28). There is a process called 'double entry bookkeeping', but this isn't what is meant by it, and your computations could be confused if '2008-02-29 plus 1 year' is 2009-02-28. Informix generates an error; that isn't very much more helpful. You might code a stored procedure, probably, to return NULL for 2008-02-29 plus 1 year since there isn't any date to compare its sales with.

You should be able to adapt the date arithmetic to MySQL fairly easily; the rest of the code does not need to to change.

Jonathan Leffler