I think this is a pretty common issue, but I don't know what the process is called, so I'll describe it with an example. The concept is that I want to join a sparse dataset to a complete series, such as the days of the week, months of the year, or any ordered set (for example, for ranking). Empty positions in the sparse data will show as NULL alongside the complete series.
Let's say I run the following query in SQL Server to find out monthly sales.
SELECT
YEAR([timestamp]),
MONTH([timestamp]),
COUNT(*)
FROM table1
WHERE YEAR([timestamp]) = YEAR(GETDATE())
GROUP BY
YEAR([timestamp]),
MONTH([timestamp])
ORDER BY
YEAR([timestamp]) DESC,
MONTH([timestamp]) DESC;
If, however, sales only occurred in May and August of this year, for example, then the return result would look like this:
2010 August 1234
2010 May 5678
I want my return result set to look like this:
2010 January
2010 February
2010 March
2010 April
2010 May 1234
2010 June
2010 July
2010 August 5678
2010 September
2010 October
2010 November
2010 December
The only way I know to do this is this:
SELECT
YEAR(GETDATE()),
month_index.month_name,
sales_data.sales
FROM (
SELECT 'January' as month_name, 1 as month_number
UNION
SELECT 'February', 2
UNION
SELECT 'March', 3
UNION
SELECT 'April', 4
UNION
SELECT 'May', 5
UNION
SELECT 'June', 6
UNION
SELECT 'July', 7
UNION
SELECT 'August', 8
UNION
SELECT 'September', 9
UNION
SELECT 'October', 10
UNION
SELECT 'November', 11
UNION
SELECT 'December', 12
) as month_index
LEFT JOIN (
SELECT
YEAR([timestamp]) AS year_name,
MONTH([timestamp]) AS month_name,
COUNT(*) AS sales
FROM table1
WHERE YEAR([timestamp]) = GETDATE()
GROUP BY
YEAR([timestamp]),
MONTH([timestamp])
) AS sales_data
ON month_index.month_name = sales_data.month_name
ORDER BY
month_index.month_number DESC;
Is there a better way to create complete date and alphanumeric series to join data onto? And what is this called?
Thanks!