I have a table that has, in essence, this structure:

 key        value
 ------     ------
 2          val1
 3          val2
 5          val3

The keys are sequential integers from 1 up to (currently) 1 million, increasing by several thousand each day. Gaps in the keys occur when records have been deleted.

I'm looking for an SQL query that returns this:

 key        value
 ------     ------
 1
 2          val1
 3          val2
 4 
 5          val3

I can see how to do this by joining to a second table that has a complete list of keys. However, I'd prefer a solution that uses standard SQL (no stored procedures or a second table of keys), and that will work no matter what the upper value of the key is.

+3  A: 

SQL queries have no looping mechanism. Procedural languages have loops, but queries themselves can only "loop" over data that they find in a table (or a derived table).

What I do to generate a list of numbers on the fly is to do a cross-join on a small table of digits 0 through 9:

CREATE TABLE n (d NUMERIC);
INSERT INTO n VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9);

Then to generate 00..99:

SELECT n1.d + n2.d*10 AS d
FROM n AS n1 CROSS JOIN n AS n2;

If you want only 00..57:

SELECT n1.d + n2.d*10 AS d
FROM n AS n1 CROSS JOIN n AS n2
WHERE n1.d + n2.d*10 <= 57;

You can of course join the table for the 100's place, 1000's place, etc. Note that you can't use column aliases in the WHERE clause, so you have to repeat the full expression.
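To make the "one join per decimal place" idea concrete, here is a minimal sketch that runs the cross join for three places from Python. It assumes an in-memory SQLite database as a stand-in for your real one, and it declares the digit column as INTEGER rather than NUMERIC; both choices are mine, not part of the approach itself.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE n (d INTEGER)")
conn.executemany("INSERT INTO n VALUES (?)", [(i,) for i in range(10)])

# One copy of the digits table per decimal place: ones, tens, hundreds.
rows = conn.execute(
    """
    SELECT n1.d + n2.d*10 + n3.d*100 AS d
    FROM n AS n1 CROSS JOIN n AS n2 CROSS JOIN n AS n3
    ORDER BY d
    """
).fetchall()

numbers = [r[0] for r in rows]
print(len(numbers), numbers[0], numbers[-1])  # 1000 0 999
```

Each extra copy of n multiplies the row count by ten, which is why the range grows by one order of magnitude per join.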

Now you can use this as a derived table in a FROM clause and join it to your data table.

SELECT n0.d, mytable.value
FROM
   (SELECT n1.d + n2.d*10 + n3.d*100 + n4.d*1000 
      + n5.d*10000 + n6.d*100000 AS d
    FROM n AS n1 CROSS JOIN n AS n2 CROSS JOIN n AS n3 
      CROSS JOIN n AS n4 CROSS JOIN n AS n5 CROSS JOIN n AS n6) AS n0
  LEFT OUTER JOIN mytable ON (n0.d = mytable.key)
WHERE n0.d <= (SELECT MAX(key) FROM mytable);

You do need to add another CROSS JOIN each time your table exceeds an order of magnitude in size. The six joins above cover 0 through 999,999, so once the keys reach 1 million, add a join for n7.

Note also we can now use the column alias in the WHERE clause of the outer query.
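Here is a small end-to-end sketch of the derived-table join, run against the three sample rows from the question. The assumptions are mine: SQLite in memory, only two digit places (enough for the sample), quoted "key" and "value" column names because KEY is a keyword in some dialects, and a lower bound of 1 added to the WHERE clause so the generated 0 row is dropped.

```python
import sqlite3

# Assumption: in-memory SQLite with the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE n (d INTEGER)")
conn.executemany("INSERT INTO n VALUES (?)", [(i,) for i in range(10)])
conn.execute('CREATE TABLE mytable ("key" INTEGER PRIMARY KEY, "value" TEXT)')
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(2, "val1"), (3, "val2"), (5, "val3")],
)

# Generated numbers on the left, real rows on the right; keys with no
# matching row come back with NULL (None in Python) for the value.
filled = conn.execute(
    """
    SELECT n0.d, mytable."value"
    FROM (SELECT n1.d + n2.d*10 AS d
          FROM n AS n1 CROSS JOIN n AS n2) AS n0
      LEFT OUTER JOIN mytable ON (n0.d = mytable."key")
    WHERE n0.d BETWEEN 1 AND (SELECT MAX("key") FROM mytable)
    ORDER BY n0.d
    """
).fetchall()

for key, value in filled:
    print(key, value)
```

The ORDER BY is there only to make the output readable; without it the row order is not guaranteed.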

Admittedly, it can be a pretty expensive query to do this solely in SQL. You might find that it's both simpler and speedier to "fill in the gaps" by writing some application code.
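As a rough idea of what that application code could look like, here is a sketch in Python. The function name fill_gaps and its start parameter are hypothetical, and it assumes the rows arrive sorted by key (for example from a query with ORDER BY key).

```python
def fill_gaps(rows, start=1):
    """Yield (key, value) for every key from `start` up to the largest key.

    `rows` must be an iterable of (key, value) pairs sorted by key.
    Keys that are missing from `rows` are yielded with a value of None.
    """
    expected = start
    for key, value in rows:
        # Emit a placeholder for each key skipped since the previous row.
        while expected < key:
            yield (expected, None)
            expected += 1
        yield (key, value)
        expected = key + 1


# Sample data from the question, already in key order.
rows = [(2, "val1"), (3, "val2"), (5, "val3")]
filled = list(fill_gaps(rows))
print(filled)
```

This makes a single pass over the result set and needs no helper table, whatever the upper value of the key is.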

Bill Karwin
A: 

In MySQL you can find the edges of the gaps by left-joining the table against itself with positive and negative offsets.

E.g.:

create table seq ( i int primary key, v varchar(10) );

insert into seq values( 2, 'val1' ), (3, 'val2' ), (5, 'val3' );


select s.i-1 from seq s left join seq m on m.i = (s.i -1) where m.i is null;

+-------+
| s.i-1 |
+-------+
|     1 |
|     4 |
+-------+


select s.i+1 from seq s left join seq m on m.i = (s.i +1) where m.i is null;
+-------+
| s.i+1 |
+-------+
|     4 |
|     6 |
+-------+

This doesn't give you exactly what you want, but it gives enough information to work out what the missing rows are.
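One way to work out the missing rows from those two result sets is to pair each gap start with the next gap end. The sketch below does that in Python; it assumes SQLite in memory rather than MySQL, a non-empty table, and keys that should start at 1. The variable names are mine.

```python
import sqlite3

# Assumption: in-memory SQLite standing in for MySQL, same sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE seq (i INTEGER PRIMARY KEY, v TEXT)")
conn.executemany(
    "INSERT INTO seq VALUES (?, ?)",
    [(2, "val1"), (3, "val2"), (5, "val3")],
)

# Last missing key before each run of consecutive rows.
ends = [r[0] for r in conn.execute(
    "SELECT s.i - 1 FROM seq s LEFT JOIN seq m ON m.i = s.i - 1 "
    "WHERE m.i IS NULL ORDER BY 1")]
# First missing key after each run (the last one lies past the maximum).
starts = [r[0] for r in conn.execute(
    "SELECT s.i + 1 FROM seq s LEFT JOIN seq m ON m.i = s.i + 1 "
    "WHERE m.i IS NULL ORDER BY 1")]

gaps = []
if ends[0] >= 1:
    # Leading gap between key 1 and the first existing row.
    gaps.append((1, ends[0]))
# Every other gap runs from one run's "start" to the next run's "end".
gaps.extend(zip(starts[:-1], ends[1:]))

missing = [k for lo, hi in gaps for k in range(lo, hi + 1)]
print(gaps, missing)
```

With the sample rows this should report the gaps as single-key ranges at 1 and 4.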

Martin
A: 
WITH range (num) AS (
SELECT 1 -- use your own lowerbound
UNION ALL
SELECT 1 + num FROM range
WHERE num < 10 -- use your own upper bound
)
SELECT r.num, y.* FROM range r left join yourtable y
on r.num = y.id
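For anyone who wants to try the recursive approach without a SQL Server instance, here is a sketch that runs the same idea in SQLite from Python. These are assumptions on my part: the CTE is renamed to nums and written with the RECURSIVE keyword (which SQLite accepts), the upper bound is taken from MAX(id) instead of being hard-coded, and yourtable is given an id and a value column.

```python
import sqlite3

# Assumption: in-memory SQLite with a hypothetical yourtable(id, value).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?)",
    [(2, "val1"), (3, "val2"), (5, "val3")],
)

# The anchor row is 1; each recursive step adds 1 until MAX(id) is reached.
result = conn.execute(
    """
    WITH RECURSIVE nums(num) AS (
        SELECT 1
        UNION ALL
        SELECT num + 1 FROM nums
        WHERE num < (SELECT MAX(id) FROM yourtable)
    )
    SELECT r.num, y.value
    FROM nums AS r LEFT JOIN yourtable AS y ON r.num = y.id
    ORDER BY r.num
    """
).fetchall()

print(result)
```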
Haoest
By default this will only work for num up to about 100. SQL Server's default limit is 100 levels of recursion, although it can be raised with OPTION (MAXRECURSION).
Jonas Lincoln
The OP said he wanted a standard SQL solution, but this solution uses Microsoft/Sybase features that are not standard SQL.
Bill Karwin
Then I would probably use a cursor to generate the range, if that's standard.
Haoest
A: 

Another method would be to generate a result set of the million numbers and use it as the basis for the join. That might do the job for you. (Stolen from the AskTom blog.)

select  level
from    dual
connect by level <= 1000000

yielding something like this:

WITH 
upper_limit AS
(
    select 1000000 limit from dual
),
fake_table AS
(
    select  level key
    from    dual
    connect by level <= (select limit from upper_limit)
)
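-- mytable stands in for your real table name (TABLE itself is a reserved word);
-- the columns are qualified because both row sources have a key column
select fake_table.key, mytable.value
from mytable, fake_table
where fake_table.key = mytable.key(+)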

I'm not at work, so I can't test this. Your mileage may vary. I use Oracle at work.

EvilTeach
Lots of Oracle specific syntax in this (WITH, dual, CONNECT BY, (+)) but the OP said he wanted a standard SQL solution.
Bill Karwin
Hence the "Your mileage may vary."
EvilTeach