views:

207

answers:

3

In MSSQL I have a set of tasks which have start and end times.

What I wish to do is collapse sequential tasks together.
So my definition of sequential is TaskEndDate equals start of next TaskStartDate, there is no gap in time.

In the below dataset, 21:00 to 21:40 is one sequence, then 22:00 to 22:20 & 23:20 to 00:00.

TaskStartDate             TaskEndDate
2008-09-01 21:00:00.000 2008-09-01 21:05:00.000
2008-09-01 21:05:00.000 2008-09-01 21:10:00.000
2008-09-01 21:10:00.000 2008-09-01 21:15:00.000
2008-09-01 21:15:00.000 2008-09-01 21:20:00.000
2008-09-01 21:20:00.000 2008-09-01 21:25:00.000
2008-09-01 21:25:00.000 2008-09-01 21:30:00.000
2008-09-01 21:30:00.000 2008-09-01 21:35:00.000
2008-09-01 21:35:00.000 2008-09-01 21:40:00.000
2008-09-01 22:00:00.000 2008-09-01 22:05:00.000
2008-09-01 22:05:00.000 2008-09-01 22:10:00.000
2008-09-01 22:10:00.000 2008-09-01 22:15:00.000
2008-09-01 22:15:00.000 2008-09-01 22:20:00.000
2008-09-01 23:20:00.000 2008-09-01 23:25:00.000
2008-09-01 23:25:00.000 2008-09-01 23:30:00.000
2008-09-01 23:30:00.000 2008-09-01 23:35:00.000
2008-09-01 23:35:00.000 2008-09-01 23:40:00.000
2008-09-01 23:40:00.000 2008-09-01 23:45:00.000
2008-09-01 23:45:00.000 2008-09-01 23:50:00.000
2008-09-01 23:50:00.000 2008-09-01 23:55:00.000
2008-09-01 23:55:00.000 2008-09-02 00:00:00.000

Feel free to use CTE's or other MSSQL specific features.

+3  A: 

Assuming no overlaps of duplicates, this should do it:

;WITH cteStart As (
    SELECT TaskStartDate,
        ROW_NUMBER() OVER(ORDER BY TaskStartDate) as N
    FROM YourTable y
    WHERE TaskStartDate NOT IN(SELECT TaskEndDate FROM YourTable y1)
), cteEnd As (
    SELECT TaskEndDate,
        ROW_NUMBER() OVER(ORDER BY TaskEndDate) as N
    FROM YourTable y
    WHERE TaskEndDate NOT IN(SELECT TaskStartDate FROM YourTable y1)
)
SELECT TaskStartDate, TaskEndDate
FROM cteStart as s
    JOIN cteEnd as e ON e.N = s.N

edit: changed 2nd "TaskStartDate" to "TaskEndDate" on the last Select.

RBarryYoung
Bottom select should read:SELECT TaskStartDate, TaskEndDateFROM cteStart as s JOIN cteEnd as e ON e.N = s.N
Jafin
Right, good catch. Will correct.
RBarryYoung
A: 

Hi Jafin

It aint pretty... but here's some SQL that seems to work. Simply replace [Tasks] with your table name.

SET NOCOUNT ON

DECLARE @date DATETIME
DECLARE @continueLoop INT
DECLARE @continueInnerLoop INT

DECLARE @tmp TABLE (
    [Start] [DateTime] NOT NULL , 
    [End] [DateTime] NOT NULL
)

SET @continueLoop = 1

WHILE @continueLoop <> 0 BEGIN
    INSERT INTO @tmp
    SELECT TOP 1 [TaskStartDate], [TaskEndDate]
    FROM [dbo].[Tasks]
    WHERE [TaskStartDate] > ISNULL((SELECT TOP 1 [End] FROM @tmp ORDER BY [End] DESC), '19000101')
    SET @continueInnerLoop = @@ROWCOUNT

    WHILE @continueInnerLoop <> 0 BEGIN
     UPDATE @tmp
     SET [End] = Tasks.[TaskEndDate]
     FROM @tmp, [dbo].[Tasks] 
     WHERE [End] = Tasks.[TaskStartDate]
    SET @continueInnerLoop = @@ROWCOUNT
    END

    SELECT @continueLoop = COUNT(*)
    FROM [dbo].[Tasks]
    WHERE [TaskStartDate] > ISNULL((SELECT TOP 1 [End] FROM @tmp ORDER BY [End] DESC), '19000101')
END

SELECT * FROM @tmp

The result produces the following results

[TaskStartDate], [TaskEndDate]
2008-09-01 21:00:00.000, 2008-09-01 21:40:00.000
2008-09-01 22:00:00.000, 2008-09-01 22:20:00.000
2008-09-01 23:20:00.000, 2008-09-02 00:00:00.000
Kane
A: 

Alternatively your can use this solution:

http://sqlblog.com/blogs/alexander_kuznetsov/archive/2009/03/08/storing-intervals-of-time-with-no-overlaps.aspx

and your query will be trivial

AlexKuznetsov
Great article Alex
Kane