views:

181

answers:

5

We're extracting booking information for our tennis courts from our SQL database into a simple results table to help us build a picture of court usage. It's pretty straight-forward except when it comes to bookings that last more than an hour.

At present, each booking results in a row in our results table. Each row contains a start-time, duration and court number. We'd like to map this table directly into a spreadsheet or pivot table so we can see how many hours our courts are booked and which hours of the day.

Currently, our SQL query looks something like:

INSERT INTO Results (year, month, day, hour, duration, court)
SELECT DATEPART (yy, b.StartDateTime),
       DATEPART (mm, b.StartDateTime),
       DATEPART (dd, b.StartDateTime),
       DATEPART (hh, b.StartDateTime),
       a.Duration,
       a.Court
FROM Bookings b
INNER JOIN Activities a
ON b.ActivityID = a.ID

Our problem is that bookings for 2, 3 or more hours duration only have one row in the results table, ie. for the first hour of the booking. This is because the length of the booking is captured in the duration data. We could do some post-processing on the data to achieve our ends, but it would be easier if this was possible in our SQL query.

Can this query be modified in some way, such that depending on the duration (which can be 1, 2, 3, ... hours) the appropriate number of rows are inserted into the results table, each of duration 1. Thus a 3-hour booking starting at 9am, would results in three rows in the results table, one at 9am, one at 10am and one at 11am, each of duration 1-hour.

So instead of the following row in the results table:

Year, Month, Day, Hour, Duration, Court
2009,    08,  25,   09,        3,     1

we get the following rows:

Year, Month, Day, Hour, Duration, Court
2009,    08,  25,   09,        1,     1
2009,    08,  25,   10,        1,     1
2009,    08,  25,   11,        1,     1

This would make mapping the results table into a spreadsheet much easier.

UPDATE 2009-08-25: Of course, as the first couple of answers show, it doesn't have to be a single query. A set is fine.

UPDATE 2009-08-26: Have been side-tracked and haven't had a chance to try out the proposed solutions yet. Hope to do so soon and will select an answer based on the results.

UPDATE 2009-08-27: Finally got a chance to try out the solutions. The table of integers and joining to produce a solution was an eye-opener. Especially the use of cross-joins to create such a table. This is probably the cleaner, SQL way of doing things.

However, in the end, I went with Aaron's solution involving the flag and the simple algorithm. I did enhance it by wrapping his algorithm in a while loop to keep iterating until no durations > 1 were left. This was quick and easy to implement. It also highlighted that we did have some 10 hour bookings, so I didn't need to hard-code a limit here.

I should note that I incorporated Jeff's idea of max duration into the while loop counter, rather than my original idea of count the items with duration > 1. Slightly less code.

A: 

It's not trivial. First, you need another column "Flag" which is 0:

INSERT INTO Results (year, month, day, hour, duration, court, Flag)
SELECT DATEPART (yy, b.StartDateTime),
       DATEPART (mm, b.StartDateTime),
       DATEPART (dd, b.StartDateTime),
       DATEPART (hh, b.StartDateTime),
       a.Duration,
       a.Court,
       0
FROM Bookings b
INNER JOIN Activities a
ON b.ActivityID = a.ID

You need to run these queries several times:

-- Copy all rows with duration > 1 and set the flag to 1
insert into results(year, month, day, hour, duration, court, Flag)
select year, month, day, hour+1, duration-1, court, 1
from result
where duration > 1
;
-- Set the duration of all copied rows to 1
update result
set duration = 1
where flag = 0 and duration > 1
;
-- Prepare the copies for the next round
update result
set flag = 0
where flag = 1

This will create an additional entry for each duration > 1. My guess is that you can't allocate a court for more than 8 hours, so you just need to run these three 8 times to fix all of them.

Aaron Digulla
I think I follow this. But doesn't your second query need to reduce duration by 1, not set it to exactly 1?
dave
@dave: No, it takes all processes queries out of the picture by setting the duration to 1. All results with "flag = 1" will contain the new "duration-1". That's why I need the flag (to know which duration to "delete")
Aaron Digulla
@Aaron, I get it now. Once I started an example in my head it was obvious. I had assumed the "iteration" would be based on the "parent" row, ie. the original row with the duration > 1. But your solution creates a child row to iterate off. Which is quite clever. Perhaps I'll wrap the whole thing in a while loop, that checks to see if there's any rows with duration > 1 left. That way, I don't have to worry about the maximum number of hours that can be booked or wasted iterations.
dave
+1  A: 

Edited to correct the missing hour calculation

Create a single column temporary table with n rows for integer n - (I've assumed that the maximum booking time is 8 hours).

create table #t
(id int
,addHour int
)

insert #t
select 1,0
union all select 2,0
union all select 2,1
union all select 3,0
union all select 3,1
union all select 3,2
union all select 4,0
union all select 4,1
union all select 4,2
union all select 4,3
union all select 5,0
union all select 5,1
union all select 5,2
union all select 5,3
union all select 5,4
union all select 6,0
union all select 6,1
union all select 6,2
union all select 6,3
union all select 6,4
union all select 6,5
union all select 7,0
union all select 7,1
union all select 7,2
union all select 7,3
union all select 7,4
union all select 7,5
union all select 7,6
union all select 8,0
union all select 8,1
union all select 8,2
union all select 8,3
union all select 8,4
union all select 8,5
union all select 8,6
union all select 8,7

You can validate that the temporary table has the correct number of rows with the following query:

select id, count(1)  
from #t
group by id
order by id

Amend your query to include a join to the temporary table:

INSERT INTO Results (year, month, day, hour, duration, court)
SELECT DATEPART (yy, b.StartDateTime),
       DATEPART (mm, b.StartDateTime),
       DATEPART (dd, b.StartDateTime),
       DATEPART (hh, b.StartDateTime) + addHour,
       1 AS Duration,
       a.Court 
FROM Bookings b
INNER JOIN Activities a
ON b.ActivityID = a.ID
INNER JOIN #t AS t
ON t.id = a.Duration

EDIT - a clarification on how this works

When joining tables, a row is produced in the output for each combination of joined rows in the source tables which meet the join criteria.

I'm using the temporary table to "multiply" the original result set from Bookings and Activities by the number of hours that the booking lasts by joining on Duration. This only works if bookings are made in whole numbers of hours.

If you want to see this more clearly, add a second column to #t which uniquely identifies each row and include it in the output result set:

create table #t
(id int
,unique_id int identity
)

INSERT #t (id)
select 1
union all select 2
... etc

SELECT DATEPART (yy, b.StartDateTime),
       DATEPART (mm, b.StartDateTime),
       DATEPART (dd, b.StartDateTime),
       DATEPART (hh, b.StartDateTime) + addHour,
       1 AS Duration,
       a.Court,
       t.unique_id
FROM Bookings b
INNER JOIN Activities a
ON b.ActivityID = a.ID
INNER JOIN #t AS t
ON t.id = a.Duration

This should clarify that each row in the result set is produced from a single valid combination of Bookings, Activities and #t.

Ed Harper
Can you explain a little how this works? I understand the creation of the temporary table, but not how the join of #t produces the desired result.
dave
This doesn't give the correct hour in the results.
Aaron Digulla
@Aaron - good point. I've amended the query to handle this.
Ed Harper
A: 

You might consider putting an INSTEAD OF INSERT trigger on the table "Results" that would insert multiple rows for each more-than-one-hour booking. This does add complexity, but it might be a reasonable approach, since this doesn't sound like a high-volume OLTP system.

Steve Kass
+1  A: 

A slight modification to your original suffices, if you introduce an integers table (or VIEW) to serve as a series generator:

INSERT INTO Results (year, month, day, hour, duration, court)
SELECT DATEPART (yy, b.StartDateTime),
       DATEPART (mm, b.StartDateTime),
       DATEPART (dd, b.StartDateTime),
       DATEPART (hh, b.StartDateTime) + (a.Duration - i.iii - 1)
       1,
       a.Court
FROM Bookings b
INNER JOIN Activities a
  ON b.ActivityID = a.ID
INNER JOIN Integers999 i       -- N.B.: Integers999 (iii INT), all numbers 0 .. 999
  ON a.Duration > i.iii;       -- So, a true Duration of 1 appears once, of 2 twice ...
pilcrow
You forgot to fix the start hour.
Aaron Digulla
@Aaron, so I did, thank you. Fixed.
pilcrow
A: 

I haven't had a chance to debug this, but something like this should do it for you:

DECLARE @maxDuration    INTEGER
DECLARE @curDuration    INTEGER

SELECT @MaxDuration = SELECT MAX(Duration) FROM Activities
SET @curDuration = 1

WHILE @curDuration <= @MaxDuration
BEGIN
    INSERT INTO Results (year, month, day, hour, duration, court)
    SELECT DATEPART (yy, b.StartDateTime),
        DATEPART (mm, b.StartDateTime),
           DATEPART (dd, b.StartDateTime),
           DATEPART (hh, b.StartDateTime) + @curDuration - 1,
           a.Duration,
           a.Court
    FROM Bookings b
    INNER JOIN Activities a
    ON b.ActivityID = a.ID
    WHERE a.Duration <= @MaxDuration

    SET @curDuration = @curDuration + 1
END
Jeff Hornby