tags:

views:

122

answers:

5
+1  Q: 

SQL join puzzler

Given

pricetable  
sym(k)   pricedate(k) price
msft     1/2/2009     33
msft     1/3/2009     34
msft     1/4/2009     35
ibm   1/2/2009       66
ibm   1/3/2009       65
ibm   1/4/2009       64

and

datestable
pricedate(k)
1/1/2009
1/2/2009
1/3/2009
1/4/2009

I would like to get this result set

sym pricedate price
msft 1/1/2009 null
msft 1/2/2009 33
msft 1/3/2009 34
msft 1/4/2009 35
ibm  1/1/2009  null
ibm  1/2/2009  66
ibm  1/3/2009  65
ibm  1/4/2009  64

This is a one-time dump, so I don't mind creating intermediate temporary tables if that'll get it done.

FWIW, I am not just dumping this problem on SO, I have tried a bunch of things, but can't get it to work.

TIA

+2  A: 
 SELECT syms.sym, pt.pricedate, syms.price FROM
 (SELECT DISTINCT sym, datestable.priceDate FROM priceTable, datestable) syms 
     LEFT JOIN priceTable pt on pt.priceDate = syms.priceDate

should do it.

At least it works under SQL Server.

Dana
',' join notation? ugh.
Joel Coehoorn
Ehh, it's how I think of cross joins in sql server.
Dana
+3  A: 
SELECT b.sym, b.pricedate, p.price 
FROM 
(
    SELECT DISTINCT sym, pricedate
    FROM pricetable CROSS JOIN datestable
) b  /* base table */
LEFT JOIN pricetable p ON p.sym = b.sym AND p.pricedate = b.pricedate
ORDER BY b.pricedate, b.sym
Joel Coehoorn
Yep, that did it!
Funny - the ORDER BY seems backward from the results shown in the question, and it also seems to need b.sym DESC! :)
Jonathan Leffler
+1  A: 

Untested...key observation, you seem to need a row for each stock for each possible day.

SELECT k1.sym, k1,pricedate, k1.price
  FROM DatesTable JOIN (SELECT DISTINCT(sym) FROM PriceTable) AS k1 ON 1 = 1
       LEFT JOIN Pricetable AS pt
                 ON (k1.pricedate = pt.pricedate AND k1.sym = pt.sym)
 ORDER BY k1.sym DESC, k1.pricedate;

An alternative to the 'JOIN ... ON 1 = 1' is 'CROSS JOIN':

SELECT k1.sym, k1,pricedate, k1.price
  FROM DatesTable CROSS JOIN (SELECT DISTINCT(sym) FROM PriceTable) AS k1
       LEFT JOIN Pricetable AS pt
                 ON (k1.pricedate = pt.pricedate AND k1.sym = pt.sym)
 ORDER BY k1.sym DESC, k1.pricedate;
Jonathan Leffler
A: 

Have you tried

select a.sym, a.pricedate, p2.price
(select distinct p.sym, d.pricedate
from pricetable p
cross join
datestable d )a
left join pricetable p2
on a.sym = p2.sym and a.pricedate = p2.pricedate

or

select a.sym, a.pricedate, p2.price (select p.sym, d.pricedate from (select distinct sym from pricetable) p cross join datestable d )a left join pricetable p2 on a.sym = p2.sym and a.pricedate = p2.pricedate

the cross join should get a complete list of syms and dates then by left joining tothe price table again you should get the rest of the column data for those records which havea price.

HLGEM
We have to hope the optimizer pushes the SELECT DISTINCT down to PriceTable before doing the CROSS JOIN to avoid a large intermediate table.
Jonathan Leffler
Well could do that too.
HLGEM
+1  A: 

This seems to work:

DECLARE @pricetable TABLE (sym VARCHAR(10), pricedate DATETIME, price INT)
INSERT INTO @pricetable (sym, pricedate, price) VALUES('msft', '1/2/9', 33)
INSERT INTO @pricetable (sym, pricedate, price) VALUES('msft', '1/3/9', 34)
INSERT INTO @pricetable (sym, pricedate, price) VALUES('msft', '1/4/9', 35)
INSERT INTO @pricetable (sym, pricedate, price) VALUES('ibm', '1/2/9', 66)
INSERT INTO @pricetable (sym, pricedate, price) VALUES('ibm', '1/3/9', 65)
INSERT INTO @pricetable (sym, pricedate, price) VALUES('ibm', '1/4/9', 64)

DECLARE @datestable TABLE (pricedate DATETIME)
INSERT INTO @datestable (pricedate) VALUES('1/1/9')
INSERT INTO @datestable (pricedate) VALUES('1/2/9')
INSERT INTO @datestable (pricedate) VALUES('1/3/9')
INSERT INTO @datestable (pricedate) VALUES('1/4/9')


SELECT AllDates.sym, AllDates.pricedate, outerp.price
    FROM 
     (
      SELECT 
       d.pricedate, 
       p.sym
      FROM 
       @datestable AS d 
       CROSS JOIN @pricetable AS p
      GROUP BY 
       d.pricedate, 
       p.sym
     ) AllDates
     LEFT OUTER JOIN @pricetable AS outerp ON
      outerp.pricedate = AllDates.pricedate
      AND outerp.sym = AllDates.sym
ORDER BY AllDates.sym, AllDates.pricedate

Results:

sym        pricedate               price
---------- ----------------------- -----------
ibm        2009-01-01 00:00:00.000 NULL
ibm        2009-01-02 00:00:00.000 66
ibm        2009-01-03 00:00:00.000 65
ibm        2009-01-04 00:00:00.000 64
msft       2009-01-01 00:00:00.000 NULL
msft       2009-01-02 00:00:00.000 33
msft       2009-01-03 00:00:00.000 34
msft       2009-01-04 00:00:00.000 35
recursive