views:

890

answers:

6

I've written a query that groups the number of rows per hour, based on a given date range.

SELECT CONVERT(VARCHAR(8),TransactionTime,101) + ' ' + CONVERT(VARCHAR(2),TransactionTime,108) as TDate, 
    COUNT(TransactionID) AS TotalHourlyTransactions
    FROM MyTransactions WITH (NOLOCK)
    WHERE TransactionTime BETWEEN CAST(@StartDate AS SMALLDATETIME) AND CAST(@EndDate AS SMALLDATETIME)
    AND TerminalId = @TerminalID
    GROUP BY CONVERT(VARCHAR(8),TransactionTime,101) + ' ' + CONVERT(VARCHAR(2),TransactionTime,108)
    ORDER BY TDate ASC

Which displays something like this:

02/11/20 07 4
02/11/20 10 1
02/11/20 12 4
02/11/20 13 1
02/11/20 14 2
02/11/20 16 3

Giving the number of transactions and the given hour of the day.

How can I display all hours of the day - from 0 to 23, and show 0 for those which have no values?

Thanks.

UPDATE

Using the tvf below works for me for one day, however I'm not sure how to make it work for a date range.

Using the temp table of 24 hours:

 -- temp table to store hours of the day    
 DECLARE @tmp_Hours TABLE ( WhichHour SMALLINT )

 DECLARE @counter SMALLINT
 SET @counter = -1
 WHILE @counter < 23 
    BEGIN
        SET @counter = @counter + 1
      --print 
        INSERT  INTO @tmp_Hours
                ( WhichHour )
        VALUES  ( @counter )
    END 

    SELECT MIN(CONVERT(VARCHAR(10),[dbo].[TerminalTransactions].[TransactionTime],101)) AS TDate, [@tmp_Hours].[WhichHour], CONVERT(VARCHAR(2),[dbo].[TerminalTransactions].[TransactionTime],108) AS TheHour,
        COUNT([dbo].[TerminalTransactions].[TransactionId]) AS TotalTransactions, 
        ISNULL(SUM([dbo].[TerminalTransactions].[TransactionAmount]), 0) AS TransactionSum
    FROM [dbo].[TerminalTransactions] RIGHT JOIN @tmp_Hours ON [@tmp_Hours].[WhichHour] = CONVERT(VARCHAR(2),[dbo].[TerminalTransactions].[TransactionTime],108) 
    GROUP BY [@tmp_Hours].[WhichHour], CONVERT(VARCHAR(2),[dbo].[TerminalTransactions].[TransactionTime],108),  COALESCE([dbo].[TerminalTransactions].[TransactionAmount], 0)

Gives me a result of:

TDate      WhichHour TheHour TotalTransactions TransactionSum
---------- --------- ------- ----------------- ---------------------
02/16/2010 0         00      4                 40.00
NULL       1         NULL    0                 0.00
02/14/2010 2         02      1                 10.00
NULL       3         NULL    0                 0.00
02/14/2010 4         04      28                280.00
02/14/2010 5         05      11                110.00
NULL       6         NULL    0                 0.00
02/11/2010 7         07      4                 40.00
NULL       8         NULL    0                 0.00
02/24/2010 9         09      2                 20.00

So how can I get this to group properly?

The other issue is that for some days there will be no transactions, and these days also need to appear.

Thanks.

A: 

group by datepart('hour', thetime). to show those hours with no values you'd have to left join a table of times against the grouping (coalesce(transaction.amount, 0))

A: 

I've run into a version of this problem before. The suggestion that worked the best was to setup a table (temporary, or not) with the hours of the day, then do an outer join to that table and group by datepart('h', timeOfRecord).

I don't remember why, but probably due to lack of flexibility because of the need for the other table, I ended up using a method where I group by whatever datepart I want and order by the datetime, then loop through and fill any spaces that are skipped with a 0. This approach worked well for me because I'm not reliant on the database to do all my work for me, and it's also MUCH easier to write an automated test for it.

RKitson
+3  A: 

You have just discovered the value of the NUMBERS table. You need to create a table with a single column containing the numbers 0 to 23 in it. Then you join again this table using an OUTER join to ensure you always get 24 rows returned.

jmucchiello
zackly what I was going to type :)
Mark Schultheiss
+2  A: 

You do this by building first the 23 hours table, the doing an outer join against the transactions table. I use, for same purposes, a table valued function:

create function tvfGetDay24Hours(@date datetime)
returns table
as return (
select dateadd(hour, number, cast(floor(cast(@date as float)) as datetime)) as StartHour
  , dateadd(hour, number+1, cast(floor(cast(@date as float)) as datetime)) as EndHour
from master.dbo.spt_values
where number < 24 and type = 'p');

Then I can use the TVF in queries that need to get 'per-hour' basis data, even for missing intervals in the data:

select h.StartHour, t.TotalHourlyTransactions
from tvfGetDay24Hours(@StartDate) as h
outer apply (
  SELECT 
    COUNT(TransactionID) AS TotalHourlyTransactions
    FROM MyTransactions 
    WHERE TransactionTime BETWEEN h.StartHour and h.EndHour
    AND TerminalId = @TerminalID) as t
order by h.StartHour

Updated

Example of a TVF that returns 24hours between any arbitrary dates:

create function tvfGetAnyDayHours(@dateFrom datetime, @dateTo datetime)
returns table
as return (
select dateadd(hour, number, cast(floor(cast(@dateFrom as float)) as datetime)) as StartHour
  , dateadd(hour, number+1, cast(floor(cast(@dateFrom as float)) as datetime)) as EndHour
from master.dbo.spt_values
where type = 'p'
and number < datediff(hour,@dateFrom, @dateTo) + 24);

Note that since master.dbo.spt_values contains only 2048 numbers, the function will not work between dates further apart than 2048 hours.

Remus Rusanu
+1, I was thinking a CTE, but this table function would integrate into queries much better!
KM
Thanks to all for the suggestions. Just before I left last night I was thinking of the temp table with values 0-23, as Mark suggested above. Taking this one step further, how would you do this for days and years?
ElHaix
Remus, this last query will return the values for one day. However, how would you make use of the tvf for a date range?
ElHaix
You can have a tvf that accept two dates and return every hour between these two dates, so you can use it for arbitrary ranges. Keep in mind though that a very large date range (years) may impact performance, since the OUTER APPLY operator will run the inner query *for every hour* in the range.
Remus Rusanu
Remus, I see, so roughly 85 days is would be the limit. What I'd like to do is be able to reuse this function inside a date range set. Since the set could pass each individual day in the range, what would be the script for this?
ElHaix
A: 

So going back to using Remus' original function, I've re-used it in a recursive call and storing the results in a temp table:

DECLARE @count INT
DECLARE @NumDays INT
DECLARE @StartDate DATETIME
DECLARE @EndDate DATETIME
DECLARE @CurrentDay DATE

    DECLARE @tmp_Transactions TABLE 
    (
        StartHour DATETIME,
        TotalHourlyTransactions INT
    )   

SET @StartDate = '2000/02/10'
SET @EndDate = '2010/02/13'
SET @count = 0
SET @NumDays = DateDiff(Day, @StartDate, @EndDate)
WHILE @count < @NumDays 
    BEGIN
        SET @CurrentDay = DateAdd(Day, @count, @StartDate)
        INSERT INTO @tmp_Transactions (StartHour, TotalHourlyTransactions)
            SELECT  h.StartHour ,
                    t.TotalHourlyTransactions
            FROM    tvfGetDay24Hours(@CurrentDay) AS h
                    OUTER APPLY ( SELECT    COUNT(TransactionID) AS TotalHourlyTransactions
                                  FROM      [dbo].[TerminalTransactions]
                                  WHERE     TransactionTime BETWEEN h.StartHour AND h.EndHour
                                            AND TerminalId = 4
                                ) AS t
            ORDER BY h.StartHour
        SET @count = @Count + 1
    END 

SELECT *
FROM @tmp_Transactions
ElHaix
A: 

Remus R, I applied your function to the AddressID column of the [AdventureWorks].[Person].[Address] table to get a larger range, but the 00:00:00 hour isn't included. Is there a reason for this?

Ackrite55
Please don't post your *comment* as an *answer*
Joe Philllips