ansaurus

Question

Table with identity field: Best SQL query to get Ids of deleted records?

Answer 1

+2 A:

You could use a recursive query:

DECLARE @MaxId int
SELECT @MaxId = SELECT IDENT_CURRENT('MyTable');

WITH Ids AS (
    SELECT 1 AS intId
    UNION ALL
    SELECT intId + 1
    FROM Ids
    WHERE intId < @MaxId
)
SELECT intId
FROM Ids AS i
WHERE NOT EXISTS (
    SELECT NULL FROM MyTable AS m
    WHERE m.intId = i.intId
)
OPTION (MAXRECURSION 0)

Though this won't be very efficient if the table is very large.

Paul 2010-01-18 11:19:49

+1 because this works *IF* you change the "SET @MaxId = ...." line to "SELECT @MaxId = SELECT IDENT_CURRENT('MyTable');"

AdaTheDev 2010-01-18 11:35:59

Good point, thanks. I've changed the answer accordingly

Paul 2010-01-18 16:12:12

This doesn't work for a sequence of missing Ids though.

fung 2010-01-19 03:50:29

Why doesn't it?

Paul 2010-01-19 13:05:15

Not too sure myself. Seems like it should but it didn't when I ran it against my data. Your other query did however.

fung 2010-01-21 03:06:19

Answer 2

+2 A:

A completely different way to do it is this:

SELECT a.intId, b.intId
FROM MyTable a
    CROSS JOIN MyTable b
WHERE a.intId + 1 < b.intId
    AND NOT EXISTS (
        SELECT NULL FROM MyTable c
        WHERE c.intId > a.intId
            AND c.intId < b.intId
    )

Which will give pairs of IDs between which all the records have been removed.

So if the IDs were (1, 2, 3, 6, 7, 12), it would return (3, 6) and (7, 12).

EDIT:

This is very inefficient if the table is large. The following method is much better:

SELECT g.intStartId, MIN(t.intId) AS intEndId
FROM (
    SELECT intId AS intStartId
    FROM MyTable AS a
    WHERE NOT EXISTS (
        SELECT NULL FROM MyTable AS b
        WHERE b.intId = a.intId + 1
    )
) AS g
    CROSS JOIN MyTable AS t
WHERE t.intId > g.intStartId
GROUP BY g.intStartId

So we first find IDs that mark the start of a gap, and then we find the lowest ID we have greater than each to mark the end of the gap.

Paul 2010-01-18 11:31:25

This actually works quite well but takes really long to complete due to the clause 'WHERE c.intId > a.intId AND c.intId < b.intId'. 93% in the execution plan for my case. Don't suppose there is any way to optimize that.

fung 2010-01-19 05:55:33

See the edit above.

Paul 2010-01-19 13:30:40

This works but you should be aware that it wont get just 'DELETED' records but failed inserts and rolled back transactions also can increment the identity count.

keithwarren7 2010-01-19 13:32:50

Yes, this is true. I believe the only way of getting only deleted rows is to record the fact at the point of deletion.

Paul 2010-01-19 14:25:32

I'm not looking for deleted records per se. More appropriately the missing records so this works.

fung 2010-01-20 01:59:24

Answer 3

+1 A:

One option is to create a temporary table/embeddable SQL statement containing all possible IDs (some options are outlined in this article) up to an including the max(identity) of your table.

Then you can left join from this canonical list of values with your table and filter on a null values on the right side.

Unsliced 2010-01-18 11:31:31

This was the plausible way I was thinking of but seems resource intensive to me. And hoping to not have to muddle with temp tables.

fung 2010-01-18 11:34:52

you have to be able to identify what you don't have, and given that the series of missing items are of indeterminate length, you can't accomodate them with arbitrary joins. @Paul offers a neatish solution which you could unpack - but you need either to loop or construct alternatives, which will be expensive in either case.

Unsliced 2010-01-18 12:00:21

Answer 4

+1 A:

A query focusing on sequential ID column is not enough. The ID sequence may skip numbers during insert if transaction fails, so if you have ID = (1,3) it does not mean that ID=2 was deleted, it may have been skipped. You have to use something to catch deleted records, like trigger, or OUTPUT DELETED.* -- or use something to compare against, like a snapshot, a backup, a history table.

Damir Sudarevic 2010-01-18 11:32:17

Answer 5

A:

A completely different approach, which does require a refactoring of your example, would be to not delete from the table, but to have a separate table of deleted item ids (or to have a field in the table showing the row's status). So you could then just select on those data instead. (This would wrap around @Damir's observation that you cannot differentiate between deletions and insert errors.)

Unsliced 2010-01-18 12:03:10

Answer 6

+2 A:

Left join with a numbers table and grab all the ones where it is null, this uses the built in numbers table but it is better to have your own

example of what the code would look like

create table #bla(id int)

insert #bla values(1)
insert #bla values(2)
insert #bla values(4)
insert #bla values(5)
insert #bla values(9)
insert #bla values(12)



select number from master..spt_values s
left join #bla b on s.number = b.id
where s.type='P'
and s.number < (select MAX(id) from #bla)
and  b.id is null

output

0 3 6 7 8 10 11

See here: How to return all the skipped identity values from a table in SQL Server for some more detail

SQLMenace 2010-01-18 14:25:32

In the end this looks like the fastest method =)

fung 2010-01-22 01:55:05

ansaurus

tags:

views:

answers:

Table with identity field: Best SQL query to get Ids of deleted records?

related questions