views:

2443

answers:

6

I have the following set of dates (dd/MM/yyyy) matching events in my database:

eventId     startDate     endDate
1           02/05/2009    10/05/2009
2           08/05/2009    12/05/2009
3           10/05/2009    12/05/2009
4           21/05/2009    21/05/2009
5           25/05/2009    NULL
6           01/06/2009    03/06/2009

The events have a start and end date (times don't matter) and a NULL endDate means that the event is still in progress.

What I would like to determine is the ranges of dates between two arbitrary dates where there was a) no event and b) the events were overlapping.

So for an input date range of 01/04/2009 - 30/06/2009 I would expect to have the following results:

no event: 01/04/2009 - 01/05/2009
overlap : 08/05/2009 - 10/05/2009
overlap : 10/05/2009 - 12/05/2009
no event: 13/05/2009 - 20/05/2009
no event: 22/05/2009 - 24/05/2009
overlap : 01/06/2009 - 03/06/2009

Note that the two adjacent overlap ranges would be acceptable as one result.

Can anyone please help me with a SQL algorithm to generate this result set?

EDIT: The target platform database is SQL Server 2005. The dates are recorded as 10/05/2009 00:00:00, meaning that the event ended some time between 10/5/2009 00:00:00 and 10/5/2009 23:59:59. The same is true for the start dates. The input date range therefore also can be read as 01/04/2009 00:00:00 - 30/06/2009 23:59:59.

+1  A: 

I can't add a comment yet (insufficient reputation), but could you confirm whether the range is effectively from StartDate = 02/05/2009 00:00 to 10/05/2009 23:59:59 or if the end date would be 10/5/2009 00:00:00

i.e. is that end date inclusive or not - from the data it looks like the former

Joel Mansford
The end date will be recorded as 10/05/2009 00:00:00, meaning that the event ended some time between 10/5/2009 00:00:00 and 10/5/2009 23:59:59. The same is true for the start dates.
Liam
A: 

Without really understanding what problem you're trying to solve, here's my solution to some problem off the top of my head:

  1. Create table function (UDF) "all-dates" that would return all dates in a year.
  2. Convert your events to separate dates (one event row would become as many rows as there are days in it) by inner-joining events to the all-dates where the date is between event's start and end dates... Retain original eventId.
  3. Do an outer join of event-dates with all-dates (again) to find the gaps or misses.
  4. Join event-dates with themselves on where dates are same but eventId is not to find overlaps.
zvolkov
+1  A: 

I think you're going to want some variation on this article.

Eric H.
+2  A: 

It's a little variation of the function to flatten intersecting timespans in SQL Server:

It's one of the rare cases when cursor-based approach in SQL Server is faster the a set-based one:


CREATE FUNCTION mytable(@p_from DATETIME, @p_till DATETIME)
RETURNS @t TABLE
        (
        q_type VARCHAR(20) NOT NULL,
        q_start DATETIME NOT NULL,
        q_end DATETIME NOT NULL
        )
AS
BEGIN
        DECLARE @qs DATETIME
        DECLARE @qe DATETIME
        DECLARE @ms DATETIME
        DECLARE @me DATETIME
        DECLARE cr_span CURSOR FAST_FORWARD
        FOR
        SELECT  startDate, endDate
        FROM    mytable
        WHERE   startDate BETWEEN @p_from AND @p_till
        ORDER BY
                startDate 
        OPEN    cr_span
        FETCH   NEXT
        FROM    cr_span
        INTO    @qs, @qe
        SET @ms = @qs
        SET @me = @qe
        WHILE @@FETCH_STATUS = 0
        BEGIN
                FETCH   NEXT
                FROM    cr_span
                INTO    @qs, @qe
                IF @qs > @me
                BEGIN
                        INSERT
                        INTO    @t
                        VALUES ('overlap', @ms, @me)
                        INSERT
                        INTO    @t
                        VALUES ('gap', @me, @qs)
                        SET @ms = @qs
                END
                SET @me = CASE WHEN @qe > @me THEN @qe ELSE @me END
        END
        IF @ms IS NOT NULL
        BEGIN
                INSERT
                INTO    @t
                VALUES  (@ms, @me)
        END
        CLOSE   cr_span
        RETURN
END
GO

This function compresses each contiguous set of intersecting ranges into one range, and returns both the range and the following gap.

Quassnoi
A: 

Is the reply by Quassnoi a solution to the problem? if so what is the name of the data table in the function.

A: 

my poor, with postgresql, you could do that simply :

(start1, end1) OVERLAPS (start2, end2) (start1, length1) OVERLAPS (start2, length2)

This expression yields true when two time periods (defined by their endpoints) overlap, false when they do not overlap. The endpoints can be specified as pairs of dates, times, or time stamps; or as a date, time, or time stamp followed by an interval.

SELECT (DATE '2001-02-16', DATE '2001-12-21') OVERLAPS (DATE '2001-10-30', DATE '2002-10-30'); Result: true SELECT (DATE '2001-02-16', INTERVAL '100 days') OVERLAPS (DATE '2001-10-30', DATE '2002-10-30'); Result: false

but under sql server, i don't know... sorry