views:

51

answers:

4

I know there are so many overlapping query questions already been answered but none of them were able to resolve the issue I'm having:

We need to find out the total number of hours between the lowest start_date and highest end_date, considering the overlapping range as well.

Start_Date      End_Date
3/5/2010 11:27  3/5/2010 13:04 -  Need to include
3/5/2010 11:29  3/5/2010 11:55 -  Can exclude ( overlapping)
3/5/2010 13:13  3/5/2010 13:37 -  Need to include
3/5/2010 13:13  3/5/2010 13:37 -  Duplicate
3/5/2010 14:55  3/5/2010 15:22 -  Need to include
3/5/2010 14:55  3/5/2010 15:22 -  Duplicate
3/5/2010 15:15  3/5/2010 17:45 -  Overlapping with above since it starts at 15.15, 7 minutes before the previous end_date

thanks in advance John D

enter code here
+1  A: 
SELECT DATEDIFF(hour, MIN(Start_Date), MAX(End_Date)) AS [Diff]
  FROM [table]

MSDN datediff page: http://msdn.microsoft.com/en-us/library/aa258269%28SQL.80%29.aspx

zerkms
thanks...but we need to consider the overlapping range as well. The above code will not yield the correct result.
John D
Then I cannot get what you actually mean :-S
zerkms
First row difference 97 minutes
John D
"First row difference" ? There's only one row in the result. And it's in hours, as originally asked, not minutes like you're now suggesting. Basically it calculates `DATEDIFF(hour, 11:27, 17:45)` and therefore overlapping ranges are counted only once.
MSalters
A: 

Being inspired by this answer about flattening date ranges, I came up with this:

-- Required function to flatten the time ranges
-- Assumes the end date is after the start.
CREATE FUNCTION Ranges(@startOfRange DATETIME, @endOfRange DATETIME)
RETURNS @rangesTable TABLE (
    [start] DATETIME NOT NULL,
    [end] DATETIME NOT NULL
    )
AS
BEGIN
    DECLARE @startTime DATETIME
    DECLARE @endTime DATETIME
    DECLARE @m_start DATETIME
    DECLARE @m_end DATETIME

    DECLARE cursorSpans CURSOR FAST_FORWARD FOR
        SELECT DISTINCT [Start_Time], [End_Time]
        FROM [YOUR_TABLE]
        ORDER BY [Start_Time], [End_Time]

    OPEN cursorSpans

    FETCH NEXT FROM cursorSpans INTO @startTime, @endTime

    SET @m_start = @startTime
    SET @m_end = @endTime

    WHILE @@FETCH_STATUS = 0
    BEGIN
        FETCH NEXT FROM cursorSpans INTO @startTime, @endTime

        IF @startTime > @m_end
        BEGIN
            -- Only insert a new record if the current date range does not overlap the previous one.
            INSERT INTO @rangesTable
            VALUES(@m_start, @m_end)

            SET @m_start = @startTime
        END
        SET @m_end = CASE WHEN @endTime > @m_end THEN @endTime ELSE @m_end END
    END

    -- Handle the final date range
    IF @m_start IS NOT NULL
    BEGIN
        INSERT INTO @rangesTable
        VALUES(@m_start, @m_end)
    END

    CLOSE cursorSpans
    DEALLOCATE cursorSpans

    RETURN
END
GO

-- Get the total minute difference between the start and end of each range, then total them up.
SELECT SUM(DATEDIFF(MINUTE, [Start], [End]))
FROM Ranges('2010-03-05 11:00:00', '2010-03-05 18:00:00')
GO
Agent_9191
+1  A: 

Well that was an interesting and fun problem to work on. +1

So lets start up with a setup:

CREATE TABLE d (start_dt datetime, end_dt datetime)
INSERT INTO d VALUES ('3/5/2010 11:27', '3/5/2010 13:04')
INSERT INTO d VALUES ('3/5/2010 11:29', '3/5/2010 11:55')
INSERT INTO d VALUES ('3/5/2010 13:13', '3/5/2010 13:37')
INSERT INTO d VALUES ('3/5/2010 13:13', '3/5/2010 13:37')
INSERT INTO d VALUES ('3/5/2010 14:55', '3/5/2010 15:22')
INSERT INTO d VALUES ('3/5/2010 14:55', '3/5/2010 15:22')
INSERT INTO d VALUES ('3/5/2010 15:15', '3/5/2010 17:45')

Here is the code to actually calculate it, all via TSQL:

-- Create a cursor that we will use to loop over all the items in the base table
DECLARE cur CURSOR FOR SELECT start_dt, end_dt FROM d
-- Temp variables to store each rows data
DECLARE @start_dt datetime, @end_dt datetime
-- Temp table to hold the "adjusted" rows
DECLARE @d TABLE (id int identity, start_dt datetime, end_dt datetime)
OPEN cur
FETCH NEXT FROM cur INTO @start_dt, @end_dt
WHILE @@FETCH_STATUS = 0
BEGIN
    -- Start by deleting any rows contained entirely within the current rows timeframe
    DELETE FROM @d WHERE start_dt BETWEEN @start_dt AND @end_dt AND end_dt BETWEEN @start_dt     AND @end_dt

    DECLARE @id_start_in int = -1, @id_end_in int = -1
    SELECT @id_start_in = id FROM @d WHERE @start_dt BETWEEN start_dt AND end_dt
    SELECT @id_end_in = id FROM @d WHERE @end_dt BETWEEN start_dt AND end_dt

    -- If our start and end dates are not contained in any other set, add it as a new row
    IF (@id_start_in = -1 AND @id_end_in = -1)
        INSERT INTO @d (start_dt, end_dt) VALUES (@start_dt, @end_dt)

    -- If our start date and end dates are both contained in the same row, ignore because we are overlapping that row

    -- If our start date and end dates are in two different rows, we combine those two
    -- For example if there are 3 rows, 1-3, 2-5, 4-6, we actually have full coverage from 1-6
    IF (@id_start_in != @id_end_in AND @id_start_in != -1 AND @id_end_in != -1)
    BEGIN
        -- Expand the start row to end at the same time the row our end time is in
        UPDATE @d SET end_dt = (SELECT end_dt FROM @d WHERE id = @id_end_in) WHERE id = @id_start_in
        -- Delete the row our end time is in
        DELETE FROM @d WHERE id = @id_end_in
    END

    -- If our start date is contained in a row but our end date isnt, extend the existing row
    -- to end at our end date
    IF (@id_start_in != -1 AND @id_end_in = -1)
        UPDATE @d SET end_dt = @end_dt WHERE id = @id_start_in

    -- If our end date is contained in a row but our start date isnt, extend the existing row
    -- to start at our start time
    IF (@id_start_in = -1 AND @id_end_in != -1)
        UPDATE @d SET start_dt = @start_dt WHERE id = @id_end_in

    FETCH NEXT FROM cur INTO @start_dt, @end_dt
END
CLOSE cur
DEALLOCATE cur

-- Show the end table
SELECT start_dt, end_dt, DATEDIFF(MINUTE, start_dt, end_dt) FROM @d
-- Sum up to get the minutes and calculate the hours
SELECT SUM(DATEDIFF(MINUTE, start_dt, end_dt)) AS MINUTES, CAST(SUM(DATEDIFF(MINUTE, start_dt, end_dt)) AS DECIMAL) / 60 AS HOURS FROM @d
Jeff Wight
A: 

Using the table that Jeff Wright provided:

CREATE TABLE d (start_dt datetime, end_dt datetime)
INSERT INTO d VALUES ('3/5/2010 11:27', '3/5/2010 13:04')
INSERT INTO d VALUES ('3/5/2010 11:29', '3/5/2010 11:55')
INSERT INTO d VALUES ('3/5/2010 13:13', '3/5/2010 13:37')
INSERT INTO d VALUES ('3/5/2010 13:13', '3/5/2010 13:37')
INSERT INTO d VALUES ('3/5/2010 14:55', '3/5/2010 15:22')
INSERT INTO d VALUES ('3/5/2010 14:55', '3/5/2010 15:22')
INSERT INTO d VALUES ('3/5/2010 15:15', '3/5/2010 17:45')

It's time to have fun with Common Table Expressions:

with ranges as (
    select d.start_dt,d.end_dt from d
    union all
    select r1.start_dt,r2.end_dt from d r1 inner join ranges r2 on r1.start_dt < r2.start_dt and r1.end_dt >= r2.start_dt
    union all
    select r1.start_dt,r2.end_dt from ranges r1 inner join d r2 on r2.end_dt > r1.end_dt and r2.start_dt <= r1.end_dt
), maxrange as (
    select start_dt,MAX(end_dt) as end_dt from ranges group by start_dt
), minmaxrange as (
    select MIN(start_dt) as start_dt,end_dt from maxrange group by end_dt
)
select SUM(DATEDIFF(MINUTE,start_dt,end_dt)) from minmaxrange

The ranges CTE finds all of the overlapping periods. The maxrange CTE then finds the latest end_dt for a particular start_dt, and the minmaxrange finds the earliest start_dt for a particular end_dt. Together, these two eliminate overlaps and duplicates. Finally, we ask for the sum of the differences, in minutes. Result is 291 minutes.

Damien_The_Unbeliever