I have a table with a unique auto-incrementing primary key. Over time, entries may be deleted from the table, so there are "holes" in this field's values. For example, the table data may be as follows:

 ID  | Value    | More fields...
---------------------------------
 2   | Cat      | ... 
 3   | Fish     | ...
 6   | Dog      | ...
 7   | Aardvark | ...
 9   | Owl      | ...
 10  | Pig      | ...
 11  | Badger   | ...
 15  | Mongoose | ...
 19  | Ferret   | ...

I'm interested in a query that will return the list of missing IDs in the table. For the data above, the expected results are:

 ID 
----
 1
 4
 5
 8
 12
 13
 14
 16
 17
 18

Notes:

  1. It is assumed that the first ID was 1
  2. The maximum ID that should be examined is the final one, i.e. it's okay to assume that there were no additional entries after the current last one (see additional data on this point below)

One drawback of these requirements is that the list will not include IDs that were created after ID 19 and then deleted. I currently handle this case in code, because I keep track of the max ID ever created. However, if the query could take a MaxID parameter and also return the missing IDs between the current max and MaxID, that would be a nice "bonus" (but certainly not a must).

I'm currently working with MySQL, but am considering moving to SQL Server, so I would like the query to work on both. Also, if you are using anything that can't run on SQLite, please mention it, thanks.

+3  A: 

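This question comes up often, and sadly, the most common (and most portable) answer is to create a temporary table holding the IDs that should be there, and do a left join against it. The approach is the same in MySQL and SQL Server. The differences are the temporary table syntax and the fact that MySQL only allows a WHILE loop inside a stored program, so the MySQL version is wrapped in a procedure.

In MySQL:

-- "delimiter" is a mysql client command; find_missing_ids is just an illustrative name
delimiter //

create procedure find_missing_ids()
begin
    declare v_id int default 1;
    declare v_maxid int;

    select max(id) into v_maxid from tbl;

    create temporary table IDSeq
    (
        id int
    );

    -- fill IDSeq with every id below the current max
    while v_id < v_maxid do
        insert into IDSeq values (v_id);

        set v_id = v_id + 1;
    end while;

    select
        s.id
    from
        IDSeq s
        left join tbl t on
            s.id = t.id
    where t.id is null;

    drop temporary table IDSeq;
end //

delimiter ;

call find_missing_ids();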

In SQL Server:

declare @id int
declare @maxid int

set @id = 1
select @maxid = max(id) from tbl

create table #IDSeq
(
    id int
)

while @id < @maxid -- or whatever your max is
begin
    insert into #IDSeq values(@id)

    set @id = @id + 1
end

select 
    s.id 
from 
    #IDSeq s 
    left join tbl t on 
        s.id = t.id 
 where t.id is null

 drop table #IDSeq
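
Since the question also mentions SQLite, here is a minimal sketch of the same temporary-table-plus-left-join idea run against an in-memory SQLite database from Python. It is not part of the original answer: the function name, the `id_seq` table name and the optional `max_id` argument (the asker's "bonus" MaxID) are assumptions made for illustration, and the sample rows are the ones from the question.

```python
import sqlite3

# Sample rows from the question (only id and value are modelled).
SAMPLE = [(2, "Cat"), (3, "Fish"), (6, "Dog"), (7, "Aardvark"), (9, "Owl"),
          (10, "Pig"), (11, "Badger"), (15, "Mongoose"), (19, "Ferret")]


def missing_ids_via_sequence_table(conn, max_id=None):
    """Return the IDs in 1..max_id that are absent from tbl.

    Hypothetical helper: max_id defaults to the current MAX(id).
    """
    cur = conn.cursor()
    if max_id is None:
        max_id = cur.execute("SELECT MAX(id) FROM tbl").fetchone()[0] or 0
    cur.execute("CREATE TEMP TABLE id_seq (id INTEGER PRIMARY KEY)")
    try:
        # Fill the sequence table with every ID that should exist.
        cur.executemany("INSERT INTO id_seq (id) VALUES (?)",
                        ((i,) for i in range(1, max_id + 1)))
        # IDs in the sequence with no matching row in tbl are the holes.
        rows = cur.execute(
            "SELECT s.id FROM id_seq s "
            "LEFT JOIN tbl t ON s.id = t.id "
            "WHERE t.id IS NULL ORDER BY s.id"
        ).fetchall()
    finally:
        cur.execute("DROP TABLE id_seq")
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany("INSERT INTO tbl (id, value) VALUES (?, ?)", SAMPLE)

print(missing_ids_via_sequence_table(conn))
print(missing_ids_via_sequence_table(conn, max_id=21))
```

Passing `max_id` explicitly covers the case where the highest IDs were created and later deleted, which the table alone cannot reveal.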
Eric
I'm not sure what the scenario is in this environment, but what if it were not 20 records but, say, a thousand, and this were called by code on a web page serving 50-60 concurrent users? Would it be efficient to create and drop those records every time, even leaving aside the part where we create and drop the temp table?
@daemonkid: Man, what a freaking strawman. If you had to figure this out time and time again, for 50-60 users, you'd obviously want a permanent table. You obviously have to adapt to your specific scenario, but this is a solution to the problem of finding the missing IDs.
Eric
+1 I'm not sure I'll go with it, but I'll consider. Thanks Eric.
Roee Adler
+4  A: 

Here's the query for SQL Server (@TT is a table variable standing in for your table):

;WITH Missing (missnum, maxid)
AS
(
 SELECT 1 AS missnum, (select max(id) from @TT)
 UNION ALL
 SELECT missnum + 1, maxid FROM Missing
 WHERE missnum < maxid
)
SELECT missnum
FROM Missing
LEFT OUTER JOIN @TT tt on tt.id = Missing.missnum
WHERE tt.id is NULL
OPTION (MAXRECURSION 0);

Hope this is helpful.
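
For engines other than SQL Server, the same recursive-CTE idea can be written with `WITH RECURSIVE`, which SQLite and MySQL 8.0 and later accept. What follows is only a sketch under assumptions, not the answerer's code: the table is assumed to be called `tbl`, the helper name and the `max_id` parameter are made up for illustration, and it is run here through Python's `sqlite3` module. On MySQL the recursion depth is capped by `cte_max_recursion_depth` (1000 by default), so larger ranges would need that setting raised.

```python
import sqlite3

# Generate 1..max_id recursively, then keep the numbers with no row in tbl.
MISSING_IDS_CTE = """
WITH RECURSIVE seq(n) AS (
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM seq WHERE n < :max_id
)
SELECT seq.n
FROM seq
LEFT JOIN tbl t ON t.id = seq.n
WHERE t.id IS NULL
ORDER BY seq.n
"""


def missing_ids_cte(conn, max_id=None):
    """Hypothetical helper: max_id defaults to the current MAX(id) in tbl."""
    if max_id is None:
        max_id = conn.execute("SELECT MAX(id) FROM tbl").fetchone()[0] or 0
    if max_id < 1:
        # The anchor row (1) would otherwise be reported for an empty range.
        return []
    return [row[0] for row in conn.execute(MISSING_IDS_CTE, {"max_id": max_id})]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO tbl (id, value) VALUES (?, ?)",
    [(2, "Cat"), (3, "Fish"), (6, "Dog"), (7, "Aardvark"), (9, "Owl"),
     (10, "Pig"), (11, "Badger"), (15, "Mongoose"), (19, "Ferret")],
)

print(missing_ids_cte(conn))             # holes up to the current max
print(missing_ids_cte(conn, max_id=22))  # also reports 20, 21, 22
```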

StarWind Software
What would be the comparable query on MySQL?
Roee Adler
+1  A: 

This is an Oracle-only solution. It doesn't address the full question, but is left here for others who may be using Oracle.

select level id           -- generate 1 .. 19
from dual
connect by level <= 19

minus                     -- remove from that set

select id                 -- everything that is currently in the 
from table                -- actual table
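
The set-difference idea is not limited to Oracle: `EXCEPT` is the standard SQL counterpart of `MINUS`, and both SQLite and SQL Server accept it. Here is a rough sketch of the same "generate the range, then subtract the existing IDs" query on SQLite, driven from Python. The table name `tbl`, the helper name and the `max_id` parameter are assumptions for illustration, and the range comes from a recursive CTE because `CONNECT BY` is Oracle-specific.

```python
import sqlite3

# Generate 1..max_id, then subtract every id that exists in tbl.
MISSING_IDS_EXCEPT = """
WITH RECURSIVE seq(n) AS (
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM seq WHERE n < :max_id
)
SELECT n FROM seq
EXCEPT
SELECT id FROM tbl
ORDER BY 1
"""


def missing_ids_except(conn, max_id):
    """Hypothetical helper: return the IDs in 1..max_id missing from tbl."""
    if max_id < 1:
        return []
    return [row[0] for row in conn.execute(MISSING_IDS_EXCEPT, {"max_id": max_id})]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO tbl (id, value) VALUES (?, ?)",
    [(2, "Cat"), (3, "Fish"), (6, "Dog"), (7, "Aardvark"), (9, "Owl"),
     (10, "Pig"), (11, "Badger"), (15, "Mongoose"), (19, "Ferret")],
)

print(missing_ids_except(conn, 19))
```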
EvilTeach