I have a list of values and dates for stock ticker symbols and want to calculate quarterly returns in SQL.
CREATE TABLE `symbol_details` (
`symbol_header_id` INT(11) DEFAULT NULL,
`DATE` DATETIME DEFAULT NULL,
`NAV` DOUBLE DEFAULT NULL,
`ADJ_NAV` DOUBLE DEFAULT NULL)
For fixed quarter start and end dates that works fine:
set @quarterstart='2008-12-31';
set @quarterend='2009-3-31';
select sha, (100*(aend-abegin)/abegin) as q1_returns from
(select symbol_header_id as sha, ADJ_NAV as abegin from symbol_details
where date=@quarterstart) as a,
(select symbol_header_id as she, ADJ_NAV as aend from symbol_details
where date=@quarterend) as b
where sha=she;
This calculates all quarterly returns for all symbols. Sometimes that end of the quarter is a non-trading day, or the stock ceased operation, so I want to get the closest start and end dates to the quarter start and end dates.
The solution would be to somehow get only the one start and one end of quarter value for each symbol_header_id by some GROUP BY statement, like (1)
SET @quarterstart = '2009-03-01';
SET @quarterend = '2009-4-31';
SELECT symbol_header_id, DATE, ADJ_NAV AS aend FROM symbol_details
WHERE
DATE BETWEEN @quarterstart AND @quarterend
AND symbol_header_id BETWEEN 18540 AND 18550
GROUP BY symbol_header_id asc;
This gives the ADJ_NAV value for the date closest to the start of the quarter for each of the symbol_header_id values.
So finally, (2)
SET @quarterstart = '2008-12-31';
SET @quarterend = '2009-3-31';
SELECT sh1, a.date, b.date, aend, abegin,
(100*(aend-abegin)/abegin) AS quarter_returns FROM
(SELECT symbol_header_id sh1, DATE, ADJ_NAV AS abegin FROM symbol_details
WHERE
DATE BETWEEN @quarterstart AND @quarterend
GROUP BY symbol_header_id DESC) a,
(SELECT symbol_header_id sh2, DATE, ADJ_NAV AS aend FROM symbol_details
WHERE
DATE BETWEEN @quarterstart AND @quarterend
GROUP BY symbol_header_id ASC) b
WHERE sh1 = sh2;
should calculate the quarterly returns for each symbols.
Unfortunately this does not work. For some reason when I restrict the IDs like in (1) the correct start and end dates are used, but when the "AND symbol_header_id BETWEEN 18540 AND 18550" statements are removed, the same start and end dates come up. Why????
The answer with unrolled JOINs is:
SET @quarterstart = '2008-12-31';
SET @quarterend = '2009-3-31';
SELECT tq.sym AS sym,
(100*(alast.adj_nav - afirst.adj_nav)/afirst.adj_nav) AS quarterly_returns
FROM
-- First, determine first traded days ("ftd") and last traded days
-- ("ltd") in this quarter per symbol
(SELECT symbol_header_id AS sym,
MIN(DATE) AS ftd,
MAX(DATE) AS ltd
FROM symbol_details
WHERE DATE BETWEEN @quarterstart AND @quarterend
GROUP BY 1) tq
JOIN symbol_details afirst
-- Second, determine adjusted NAV for "ftd" per symbol (see WHERE)
ON afirst.DATE BETWEEN @quarterstart AND @quarterend
AND afirst.symbol_header_id = tq.sym
JOIN
-- Finally, determine adjusted NAV for "ltd" per symbol (see WHERE)
symbol_details alast
ON alast.DATE BETWEEN @quarterstart AND @quarterend
AND alast.symbol_header_id = tq.sym
WHERE
afirst.date = tq.ftd
AND
alast.date = tq.ltd;