views:

2518

answers:

3

I am running into a road block on a larger problem.

As part of a large query I need to solve a "night watchman" problem. I have a table with schedule shifts as such:

ID | Start          | End
1  | 2009-1-1 06:00 | 2009-1-1 14:00
2  | 2009-1-1 10:00 | 2009-1-1 18:00
3  | 2009-2-1 20:00 | 2009-2-2 04:00
4  | 2009-2-2 06:00 | 2009-2-2 14:00

As part of a query, I need to determine if there is at least 1 watchman in a room at all times for a given time range.

So if I specified the range 2009-1-1 06:00 to 2009-1-1 12:00 , the result is true, because shifts 1 and 2 merge to cover this time period - in fact any number of shifts could be chained to keep the watch up. However if I checked 2009-2-1 22:00 to 2009-1-2 10:00, the result is false because there is a break between 4 and 6am the following morning.

I would like to implement this either in LINQ, or as a user defined function in SQL Server (2005), as in both cases this is just a part of the logic of a larger query that must be run to identify elements that need attention. The real dataset involves about a hundred shift records intersecting any given time period, but not always covering the whole range.

The closest I've found is How to group ranged values using SQL Server for number ranges, however it depends on each range ending just before the next range starts. If I could construct the same unified view of the watches, only taking overlapping watches into consideration, then it would be trivial to check if a specific time was covered. A unified view would look like this:

Start          | End
2009-1-1 06:00 | 2009-1-1 18:00
2009-2-1 20:00 | 2009-2-2 04:00
2009-2-2 06:00 | 2009-2-2 14:00

Note: This whole thing would be relatively easy to implement by just pulling all the data and running some manual loop on it, however that is the current system, and its rather slow because of the number of shifts and the number of time ranges that must be checked.

A: 

An unguarded interval obviously starts either at the end of a watched period or at the beginning of the whole time range that you are checking. So you need a query that selects all elements from this set that don't have an overlapping shift. The query would look like:

select 1 
from shifts s1 where not exists
    (select 1 from shifts s2
     where s2.start<=s1.end and s2.end > s1.end
    )
    and s1.end>=start_of_range and s1.end<  end_of_range
union
select 1 
where not exists
    (select 1 from shifts s2 
      where s2.start<=start_of_range and s2.end > start_of_range
    )

If this is non-empty, then you have an unguarded interval. I suspect it will run in quadratic time, so it might be slower than "sort, fetch and loop".

Rafał Dowgird
I'm having some trouble making this work. I assume that you intend for this to somewhere filter for the end time as well, however even adding that, or just ensuring that only valid shifts are passed (by making shifts a view), the results seem unrelated to the range provided - the same results are given regardless of if the range is inside or outside the shifts.
David
Right, the top subquery totally ignored the range - added check
Rafał Dowgird
+2  A: 
Sung Meister
what about when, for example, we have shifts: 12-2 and 4-6 and we check 1-6. that should fail, since 2-4 is empty.
nlucaroni
Code has been updated.
Sung Meister
Very nice. Ironically, because this was part of a larger, more complex problem, I found a faster way of doing the whole thing by creating a type of lookup table powered by triggers, moving the bulk of the CPU cost to when the records are created (which is much less frequently then when they need to be read and analysed). However I have to deal with these kinds of overlapping ranges a lot around here, so this is going to be very useful soon enough.
David
@David I wonder if you could actually create an *indexed view* for "FlattenedDateRange" CTE in my code instead of using a trigger. Anyways, great to hear that you were able to solve the problem through different means other than the selected answer.
Sung Meister
A: 

One way is to create a temp table with a row for each time value requiring to be checked (which is a function of the resolution of your shifts).

If it were minutes it would have 60 * 24 = 1440 rows for a day; about 10K rows for a week.

Then the SQL is relatively simple:

SELECT COUNT(1)
FROM #minutes m
LEFT JOIN shifts s ON m.checktime BETWEEN s.start_time AND s.end_time
HAVING COUNT(1) = 0

This has the benefit of also being able to show how many shifts are covering the same time.

The execution time should be negligible given the scales you've described.

le dorfier