views:

115

answers:

3

I'm looking to add some new functionality to an existing application (database is Microsoft SQL 2005). Basically, I'm trying to calculate how many minutes (or seconds) a particular department was "unmanned" for a particular set of date ranges. I'm looking to query the dataset ideally with one statement. I have a routine that loops through the record set, parses it and spits out an answer, but it's pretty ugly. Does anyone have any suggestions on how I can optimise it for readability, using pure SQL - or even any pointers/articles on what I should be looking at, my Googlefu is failing me.

I guess in some ways this is almost like a "free time" search of a calendar, but aggregated.

Here is a mock sample data set to give you an idea of what I'm working with (effectively colleagues clock in, then clock out). I'm using rounding to minutes below for the sake of simplicity, but I'd likely be calculating in seconds.

------------------------------------------------------------------------
| Colleague Id | Department Id   | Date In          | Date Out         |
------------------------------------------------------------------------
| 1            | 1               | 04/01/2010 08:45 | 04/01/2010 11:45 |
| 2            | 1               | 04/01/2010 09:00 | 04/01/2010 12:15 |
| 3            | 1               | 04/01/2010 10:00 | 04/01/2010 12:00 |
| 4            | 1               | 04/01/2010 12:30 | 04/01/2010 17:00 |
| 1            | 1               | 04/01/2010 12:45 | 04/01/2010 17:15 |
| 3            | 1               | 04/01/2010 13:00 | 04/01/2010 17:25 |
| 5            | 2               | ...              | ...              |
------------------------------------------------------------------------

So for example, if I queried the above table for Department Id = 1, between 04/01/2010 08:30:00 and 04/01/2010 17:30:00, I would expect a result of 35 minutes (or 2100 seconds) of "unmanned time" (this is sum of the time at the start, middle and end of the range that is unmanned).

+1  A: 

It's a range intersection problem: You're looking at a number range:

4/01/2010 08:30:00 - 04/01/2010 17:30:00  

this range can be represented as numbers - microseconds, or seconds from beginning of the day, for example:

[1000000, 3000000]

and you want to find the parts of it that do not collide with any of:

[1200000, 1250000]
[1250000, 1490000]
[1500000, 1950000]
...

When translated to number format, it would really look like this range intersection algorithm, and it can be implemented in virtually any language.

Edit:

There's a very interesting discussion about date ranges with great illustrations and explanations here.

Adam Matan
Thanks Adam for linking those items for further reading, very interesting!
benno
+3  A: 

I have a table Integers already created, which I use for stuff like this.

Given that, you want:

drop table foo 
go

create table foo (
   c_id int not null,
   d_id int not null,
   datein datetime not null,
   dateout datetime not null
)
go


insert into foo values (1, 1, '04/01/2010 08:45', '04/01/2010 11:45')
insert into foo values (2, 1, '04/01/2010 09:00', '04/01/2010 12:15')
insert into foo values (3, 1, '04/01/2010 10:00', '04/01/2010 12:00')
insert into foo values (4, 1, '04/01/2010 12:30', '04/01/2010 17:00')
insert into foo values (1, 1, '04/01/2010 12:45', '04/01/2010 17:15')
insert into foo values (3, 1, '04/01/2010 13:00', '04/01/2010 17:25')
go


drop procedure unmanned
go

create procedure unmanned
   @d_id int,
   @start datetime,
   @end datetime

as

select distinct dateadd(ss,i_int,@start)
 from Integers 
      left join foo on dateadd(ss,i_int,@start) >= datein and dateadd(ss,i_int,@start) < dateout


where i_int between 0 and 60*60*24
and dateadd(ss,i_int,@start) >= @start and dateadd(ss,i_int,@start)< @end
and datein is null
order by 1

go

exec unmanned 1, '4/1/10 8:30', '4/1/10 17:30'
Eric H.
A: 

I'd recommend using Eric H's approach. With that disclaimer out of the way, this is kinda nasty but it does offer a means of doing the same thing, if you don't have access to a numbers table for one reason or another. I'm sure it can be improved, I just felt like trying it out w/o using the numbers table:

Declare @Start DateTime, @End DateTime

Select @Start = '04/01/2010 09:30'
    , @End = '04/01/2010 17:30'

--Table Creation Stuff
Declare @y Table (ColleagueId Int, DepartmentId Int, DateIn DateTime, DateOut DateTime)

Insert @y
Select 1, 1, '04/01/2010 08:45' , '04/01/2010 11:45'
Union All Select 2 , 1, '04/01/2010 09:00' , '04/01/2010 12:15'
Union All Select 3 , 1, '04/01/2010 10:00' , '04/01/2010 12:00'
Union All Select 4 , 1, '04/01/2010 12:30' , '04/01/2010 17:00' 
Union All Select 1 , 1, '04/01/2010 12:45' , '04/01/2010 17:15' 
Union All Select 3 , 1, '04/01/2010 13:00' , '04/01/2010 17:25'
---------

Select DateDiff(minute, @Start, @End)  -- TotalTime
     - Sum(DateDiff(minute, 
        Case When DateIn < @Start Then @Start Else DateIn End, 
        Case When DateOut > @End Then @End Else DateOut End)) --StaffedTime
     as UnmannedTime
From
(
    Select Min(din) DateIn, dout DateOut
    From
    (
        Select Min(y.DateIn) din, Max(y2.DateOut) dout
        From @y y
        Inner Join @y y2 on y.DateOut >= y2.DateIn
        --you probably want to close the other end of these filters, but leave some room
        --(to handle the guy who started @ 7:45, etc...)
        Where y.DateIn < @End 
            and y2.DateOut > @Start              
        Group By y.DateIn
    ) x 
    Group By dout
) q

edit added the case statements above to handle calculation of StaffedTime when a particular period starts prior to @Start (or ends after @End)

AlexCuse
Thanks AlexCuse, I've adopted Eric H's approach, however yours provided an interesting implementation.
benno
glad you were able to use the numbers table approach, I'm often surprised to hear people aren't able to (because they're working with a vendor database, or maybe they are afraid to ask the DBA to add one ;) )
AlexCuse
Depending on how often you use the stored proc, you can always create the numbers table on the fly using a cartesian join.......
Eric H.
Indeed, I've yet to see a *really* fast method of generating large numbers tables on the fly though (there's a recursive CTE out there somewhere, but I can't find it through search at the moment). You shouldn't need one very large at all for this, but you never know. It'd be interesting to see how this query performs vs. the on-the-fly numbers table approach.
AlexCuse