I need to collapse multiple ranges of sequential numbers (1 or more) to sets of their minimum and maximum values. I have unique integers (no duplicates) stored in a table column.

The obvious way (to me) to solve this problem is to use a cursor (see my algorithm below) and iterate through every integer. However, that seems inefficient, so I am wondering whether there is a more efficient algorithm. Perhaps there is a way using recursive common table expressions. I have more than 32767 integers, though, so I assume any recursive solution would need OPTION (MAXRECURSION 0), which removes the limit on recursion depth.
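For reference, a recursive CTE for this problem might look like the sketch below. It is only an illustration on a small set of sample values: the CTE name Islands and the column curInt are made up here. Each range starts at a value with no predecessor and is extended one value per recursion level, so the depth reached depends on the length of the longest run rather than on the total number of rows. The default limit is 100 levels, and OPTION (MAXRECURSION 0) removes it.

```sql
WITH tblRanges AS
(
    SELECT 1 AS id  UNION
    SELECT 2 AS id  UNION
    SELECT 3 AS id  UNION
    SELECT 9 AS id  UNION
    SELECT 10 AS id UNION
    SELECT 11 AS id UNION
    SELECT 13 AS id UNION
    SELECT 15 AS id UNION
    SELECT 16 AS id
),
Islands AS
(
    -- anchor: values with no predecessor start a new range
    SELECT a.id AS minInt, a.id AS curInt
    FROM tblRanges AS a
    WHERE NOT EXISTS (SELECT * FROM tblRanges AS b WHERE b.id = a.id - 1)
    UNION ALL
    -- recursive step: extend each range by the next consecutive value
    SELECT i.minInt, t.id
    FROM Islands AS i
    JOIN tblRanges AS t ON t.id = i.curInt + 1
)
SELECT minInt, MAX(curInt) AS maxInt
FROM Islands
GROUP BY minInt
ORDER BY minInt
OPTION (MAXRECURSION 0); -- no limit, in case a run is longer than 100 values
```

On the sample values this returns 1-3, 9-11, 13-13 and 15-16. It is unlikely to be the fastest approach on large data, since every value in a run costs one join per recursion level.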

Following is a simplified test case for my existing algorithm using a cursor. It outputs the minimum and maximum for each range of sequential numbers (e.g. 1-3, 9-11, 13-13, 15-16).

I am using MS SQL Server 2008. Please note comments begin with two dashes (--).

declare @minInt int, @maxInt int
declare @nextInt int, @prevInt int
--need a temporary table to store the ranges that were found
declare @rangeTable table (minInt int, maxInt int)
declare mycursor cursor for
select * from
(
    select 1 as id  union
    select 2 as id  union
    select 3 as id  union
    select 9 as id  union
    select 10 as id union
    select 11 as id union
    select 13 as id union
    select 15 as id union
    select 16 as id
) tblRanges
order by id--order is needed for this algorithm if used with generic data
open mycursor
--initialise new sequence
fetch next from mycursor into @minInt
select @maxInt = @minInt--set the min and max to the smallest value
select @prevInt = @minInt--store the last int
declare @sequenceFound int
while @@FETCH_STATUS=0
begin

    select @sequenceFound=1--set the default flag value to true
    --loop while sequence found
    while @@FETCH_STATUS=0 and @sequenceFound = 1
    begin

        fetch next from mycursor into @nextInt
        if @nextInt = (@prevInt + 1)
        begin
            select @sequenceFound = 1
        end
        else
        begin
            select @sequenceFound = 0
        end
        select @prevInt = @nextInt--store the current value as the previous value for the next comparison
        if @sequenceFound = 1 --if the nextInt is part of a sequence, then store the new maxInt
            and @maxInt < @nextInt--should always be true for ordered output containing no duplicates
        begin
            select @maxInt = @nextInt
        end

    end--while sequenceFound
    --store the sequence range and then check for more sequences
    insert into @rangeTable (minInt,maxInt) values (@minInt,@maxInt)
    --store the current value as the new minInt and maxInt for the next sequence iteration
    select @minInt = @nextInt
    select @maxInt = @nextInt
end--while more table rows found
select * from @rangeTable

close mycursor
deallocate mycursor
+2  A: 

Courtesy of Itzik Ben-Gan:

WITH tblRanges AS
( 
    SELECT 1 AS ID  UNION 
    SELECT 2 AS ID  UNION 
    SELECT 3 AS ID  UNION 
    SELECT 9 AS ID  UNION 
    SELECT 10 AS ID UNION 
    SELECT 11 AS ID UNION 
    SELECT 13 AS ID UNION 
    SELECT 15 AS ID UNION 
    SELECT 16 AS ID 
),
StartingPoints AS
(
SELECT ID, ROW_NUMBER() OVER(ORDER BY ID) AS rownum
FROM tblRanges AS A
WHERE NOT EXISTS
(SELECT *
FROM tblRanges AS B
WHERE B.ID = A.ID - 1)
),
EndingPoints AS
(
SELECT ID, ROW_NUMBER() OVER(ORDER BY ID) AS rownum
FROM tblRanges AS A
WHERE NOT EXISTS
(SELECT *
FROM tblRanges AS B
WHERE B.ID = A.ID + 1)
)
SELECT S.ID AS start_range, E.ID AS end_range
FROM StartingPoints AS S
JOIN EndingPoints AS E
ON E.rownum = S.rownum;

You can read a full explanation in his chapter, "Gaps and Islands", in SQL Server MVP Deep Dives. He explains various techniques (including cursors) and compares their performance.
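Another widely used set-based technique for the same problem groups rows by the difference between each value and its row number: within a run of consecutive values that difference stays constant, so it can serve as a group key. The following is a minimal sketch, not taken from the query above; the CTE name Grouped and the column grp are illustrative. It relies on the values being unique, as stated in the question (with duplicates, DENSE_RANK would be needed instead of ROW_NUMBER).

```sql
WITH tblRanges AS
(
    SELECT 1 AS ID  UNION
    SELECT 2 AS ID  UNION
    SELECT 3 AS ID  UNION
    SELECT 9 AS ID  UNION
    SELECT 10 AS ID UNION
    SELECT 11 AS ID UNION
    SELECT 13 AS ID UNION
    SELECT 15 AS ID UNION
    SELECT 16 AS ID
),
Grouped AS
(
    -- ID minus its row number is the same for every member of a consecutive run
    SELECT ID, ID - ROW_NUMBER() OVER (ORDER BY ID) AS grp
    FROM tblRanges
)
SELECT MIN(ID) AS start_range, MAX(ID) AS end_range
FROM Grouped
GROUP BY grp
ORDER BY start_range;
```

For the sample data the grp values are 0, 0, 0, 5, 5, 5, 6, 7, 7, which gives the ranges 1-3, 9-11, 13-13 and 15-16. This version reads the table once instead of joining two filtered copies of it.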

8kb
Thank you. Your code works perfectly with the sample data and my real data. I really like how you have solved the problem without using a cursor.
Jeremy Larter