views:

220

answers:

5

I got following question on an interview: Given a table of natural numbers with some missing ones, provide output of two tables, beginning of number gap in first table and ending in second. Example:

 ____    ________
|    |   |   |   |
| 1  |   | 3 | 3 |
| 2  |   | 6 | 7 |
| 4  |   | 10| 12|
| 5  |   |___|___|
| 8  |
| 9  |
| 13 |
|____|
+1  A: 

Something like this:

SELECT col1, col2 FROM
(
    SELECT x + 1 as col1, 
        ROW_NUMBER() OVER(ORDER BY x) AS 'rownum'  
    FROM tbl y 
    WHERE NOT EXISTS (SELECT x FROM tbl z WHERE z.x = y.x + 1) 
        AND x <> (SELECT MAX(x) FROM tbl)
) a
INNER JOIN
(
    SELECT x - 1 as col2,
        ROW_NUMBER() OVER(ORDER BY x) AS 'rownum'  
    FROM tbl y 
    WHERE NOT EXISTS (SELECT x FROM tbl z WHERE z.x = y.x - 1) 
        AND x <> (SELECT MIN(x) FROM tbl)
) b
ON a.rownum = b.rownum

The "rownum" syntax will be different for different DBMS. The above might work for SQL Server, but I haven't tested it.

As one of the comments pointed out, many DBMS's have analytics that will make this easier.

egrunin
For first and last, you could add something for MAX and MIN values of the column.
JNK
@JNK: two minds with but a single thought :)
egrunin
nice, i actually just learned a lot about SQL from this :)
KennyCason
This query produces the wrong result.
Phil Sandler
@Phil: not anymore :) and if you want a two-table solution, just remove the row-number clauses and the outer `SELECT`
egrunin
+1  A: 

This is SQL Server syntax:

CREATE TABLE #temp (columnA int)

INSERT INTO #temp VALUES(1)
INSERT INTO #temp VALUES(2)
INSERT INTO #temp VALUES(4)
INSERT INTO #temp VALUES(5)
INSERT INTO #temp VALUES(8)
INSERT INTO #temp VALUES(9)
INSERT INTO #temp VALUES(13)

SELECT 
    t1.columnA - 1
FROM 
    #temp t1
    LEFT JOIN #temp t2 ON t1.columnA = t2.ColumnA + 1
WHERE 
    t2.ColumnA IS NULL
    AND t1.ColumnA != (SELECT MIN(ColumnA) from #temp)  

SELECT 
    t1.columnA + 1
FROM 
    #temp t1
    LEFT JOIN #temp t2 ON t1.columnA = t2.ColumnA - 1
WHERE 
    t2.ColumnA IS NULL  
    AND t1.ColumnA != (SELECT MAX(ColumnA) from #temp)  

DROP table #temp
Phil Sandler
Um, how does this produce the two-column result...?
egrunin
Wow, I'm getting downvotes while a query that produces the wrong result is getting upvotes.
Phil Sandler
We came up with the same concept (but slightly different implementation), so you get an upvote from me.
stack
@egrunin: the questions says output of two tables, not two columns.
Phil Sandler
@Phil: you're right: the illustration showed one result table, the text showed two (and some of the other answers read it as it did as well). Unfortunately SO won't let me retract my downvote unless the answer gets edited. Sorry.
egrunin
I'm sorry about text/illustration contradiction, that's my fault. The question itself on the interview had that, it was written "tables", but illustrated ONE column with two values separated by comma. Anyway, thanks for the answers, it's really helpful.
Krns
A: 

Itzik Ben Gan writes a lot about these "gaps and islands" problems. His row_number solution to this is

WITH C AS
(
SELECT N, ROW_NUMBER() OVER (ORDER BY N) AS RN
FROM t
)
SELECT Cur.N+1,Nxt.N-1
FROM C AS Cur 
JOIN C AS Nxt ON Nxt.RN = Cur.RN+1
WHERE Nxt.N-Cur.N>1

And a solution without row_number from the same source.

SELECT N+1 AS start_range,
(SELECT MIN(B.N) FROM t AS B WHERE B.N > A.N)-1 AS end_range
FROM t AS A
WHERE NOT EXISTS(SELECT * FROM t AS B WHERE B.N = A.N+1)
AND N< (SELECT MAX(N) FROM t)
Martin Smith
+2  A: 

This works without DB Specific SQL and it could probably be made a little cleaner but it does work

EDIT: You can see this working at on this Query at StackExchange Data Explorer

SELECT low,high FROM 

(

SELECT col1, low 

FROM
(Select n1.col1 col1, min(n2.col1) + 1 low
 from numbers n1
inner join numbers n2
on n1.col1 < n2.col1 

Group by n1.col1) t
WHERE t.low not in (SELECT col1 FROM NUMBERS)
and t.low < (Select MAX(col1) from numbers) 
) t

INNER JOIN 
(

SELECT col1 - 1 col1, high
 FROM
(Select n1.col1 col1 , min(n2.col1) - 1 high
 from numbers n1
inner join numbers n2
on n1.col1 < n2.col1 

Group by n1.col1) t
WHERE t.high not in (SELECT col1 FROM NUMBERS) 
) t2
ON t.col1 = t2.col1
Conrad Frix
Didn't know about Data Explorer!!
egrunin
Heard about it here http://herdingcode.com/?p=263
Conrad Frix
+3  A: 

While this is pretty much the same as Phil Sandler's answer, this should return two separate tables (and I think it looks cleaner) (it works in SQL Server, at least):

DECLARE @temp TABLE (num int)
INSERT INTO @temp VALUES (1),(2),(4),(5),(8),(9),(13)

DECLARE @min INT, @max INT
SELECT @min = MIN(num), @max = MAX(num) FROM @temp

SELECT t.num + 1 AS range_start
    FROM @temp t
    LEFT JOIN @temp t2 ON t.num + 1 = t2.num
    WHERE t.num < @max AND t2.num IS NULL

SELECT t.num - 1 AS range_end
    FROM @temp t
    LEFT JOIN @temp t2 ON t.num - 1 = t2.num
    WHERE t.num > @min AND t2.num IS NULL
stack
Thanks, I guess i have a lot to learn about sql :)
Krns