Similar to this question, I need to group a large number of records into 1-hour "buckets". For example, let's say I've got a typical ORDER table with a datetime attached to each order. And I want to see the total number of orders per hour. So I'm using SQL roughly like this:

SELECT datepart(hh, order_date), COUNT(order_id)
FROM ORDERS
GROUP BY datepart(hh, order_date)

The problem is that if there are no orders in a given 1-hour "bucket", no row is emitted into the result set. I'd like the result set to have a row for each of the 24 hours, but if no orders were made during a particular hour, just record the number of orders as 0.

Is there any way to do this in a single query?

See also Getting Hourly Statistics Using SQL.

+3  A: 

You need to have a pre-populated table (or a function returning a table result set) to join with, that contains all the 1-hour slots you want in your result.

Then you do an OUTER JOIN with that, and you should get them all.

Something like this:

-- COUNT(order_id) ignores the NULLs produced by the outer join, so empty hours come out as 0
SELECT SLOT_HOUR, COUNT(order_id)
FROM
    ONEHOURSLOTS
    LEFT JOIN ORDERS ON DATEPART(hh, order_date) = SLOT_HOUR
GROUP BY SLOT_HOUR
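
A minimal runnable sketch of the same outer-join idea, using Python's sqlite3 module instead of SQL Server so it can be tried without a server. SQLite has no datepart, so strftime('%H', ...) stands in for it here; the function name orders_per_hour and the sample dates are made up for illustration.

```python
import sqlite3

def orders_per_hour(order_dates):
    """Return (hour, order_count) for all 24 hours, including empty ones."""
    conn = sqlite3.connect(":memory:")
    # The pre-populated slot table: one row per hour, 0 through 23.
    conn.execute("CREATE TABLE ONEHOURSLOTS (SLOT_HOUR INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO ONEHOURSLOTS VALUES (?)",
                     [(h,) for h in range(24)])
    # Hypothetical ORDERS table with only the two columns the query needs.
    conn.execute(
        "CREATE TABLE ORDERS (order_id INTEGER PRIMARY KEY, order_date TEXT)")
    conn.executemany("INSERT INTO ORDERS (order_date) VALUES (?)",
                     [(d,) for d in order_dates])
    rows = conn.execute("""
        SELECT SLOT_HOUR, COUNT(order_id)
        FROM ONEHOURSLOTS
        LEFT JOIN ORDERS
            ON CAST(strftime('%H', order_date) AS INTEGER) = SLOT_HOUR
        GROUP BY SLOT_HOUR
        ORDER BY SLOT_HOUR
    """).fetchall()
    conn.close()
    return rows

sample = ["2009-07-01 09:15:00", "2009-07-01 09:45:00", "2009-07-02 17:05:00"]
result = orders_per_hour(sample)
```

COUNT(order_id) rather than COUNT(*) matters here: COUNT(*) would count the NULL-padded row that the outer join emits for an empty hour and report 1 instead of 0.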
Lasse V. Karlsen
I believe you could also replace `ONEHOURSLOTS` with something like `(select 0 as SLOT_HOUR union select 1 as SLOT_HOUR union .. )` - only if you were dead set on keeping it in a single query though - yuck :P
Blorgbeard
You could do that even more easily using generate_series() (assuming you're on postgres, which the tags say you are, but the example query wouldn't work there..)
Magnus Hagander
+1  A: 

Create a table of hours, either persisted or even synthesized 'on the fly':

SELECT h.hour, COALESCE(s.order_count, 0) as order_count
FROM (
   -- datepart(hh, ...) returns 0 through 23, so the slots start at 0
   SELECT 0 as hour
   UNION ALL SELECT 1
   UNION ALL SELECT 2
   ...
   UNION ALL SELECT 23) as h
LEFT OUTER JOIN  (
   SELECT datepart(hh, order_date) as hour, COUNT(order_id) as order_count
      FROM ORDERS
      GROUP BY datepart(hh, order_date) ) as s
  ON h.hour = s.hour;
Remus Rusanu
+2  A: 

Some of the previous answers recommend using a table of hours and populating it using a UNION query; this can be better done with a Common Table Expression:

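; WITH [Hours] ([Hour]) AS
(
-- ROW_NUMBER() starts at 1; subtract 1 to match datepart(hh, ...), which returns 0 through 23
SELECT TOP 24 ROW_NUMBER() OVER (ORDER BY [object_id]) - 1 AS [Hour]
FROM sys.objects
ORDER BY [object_id]
)
-- COALESCE turns the NULL for an hour with no orders into 0
SELECT h.[Hour], COALESCE(o.[OrderCount], 0) AS [OrderCount]
FROM [Hours] h
LEFT OUTER JOIN (
   SELECT datepart(hh, order_date) as [Hour], COUNT(order_id) as [OrderCount]
      FROM Orders
      GROUP BY datepart(hh, order_date)
) o
ON h.[Hour] = o.[Hour]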
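
If relying on sys.objects feels fragile, a recursive CTE can generate the hours without reading any table at all. The following is only a sketch of that variant, run through Python's sqlite3 module so it is self-contained: SQLite spells it WITH RECURSIVE and uses strftime('%H', ...) where SQL Server would use datepart, and the names HOURLY_SQL, hourly_counts and the sample rows are made up for illustration.

```python
import sqlite3

HOURLY_SQL = """
WITH RECURSIVE hours(hour) AS (
    SELECT 0                                      -- first hour of the day
    UNION ALL
    SELECT hour + 1 FROM hours WHERE hour < 23    -- stop after hour 23
)
SELECT h.hour, COALESCE(o.order_count, 0)
FROM hours AS h
LEFT OUTER JOIN (
    SELECT CAST(strftime('%H', order_date) AS INTEGER) AS hour,
           COUNT(order_id) AS order_count
    FROM Orders
    GROUP BY CAST(strftime('%H', order_date) AS INTEGER)
) AS o ON h.hour = o.hour
ORDER BY h.hour
"""

def hourly_counts(conn):
    """Run the zero-filled hourly query on an open connection."""
    return conn.execute(HOURLY_SQL).fetchall()

# Hypothetical sample data: two orders at 23:xx, one at 00:xx.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, order_date TEXT)")
conn.executemany(
    "INSERT INTO Orders (order_date) VALUES (?)",
    [("2009-07-01 23:10:00",), ("2009-07-02 23:50:00",), ("2009-07-02 00:05:00",)],
)
counts = hourly_counts(conn)
conn.close()
```

Because the orders are aggregated in the derived table before the join, the outer query needs COALESCE to turn the missing match into 0.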
Ken Keenan
+1 for the sys.objects trick. One day they'll ship a version that shows only user objects with access in sys.objects and move system objects into sys.all_objects, just to ruin your scripts lol
Remus Rusanu
Hopefully, that'll be the version that has a numbers table (http://sqlserver2000.databases.aspfaq.com/why-should-i-consider-using-an-auxiliary-numbers-table.html) in it, so you won't need hacks like this
Ken Keenan