After reading a lot of articles and many answers related to the above subject, I am still wondering how the SQL Server database engine works in the following example:
Let's assume that we have a table named t3:
create table t3 (a int , b int);
create index test on t3 (a);
and a query as follow:
INSERT INTO T3
SELECT -86,-86
WHERE NOT EXISTS (SELECT 1 FROM t3 where t3.a=-86);
The query inserts a line in the table t3 after verifying that the row does not already exist based on the column "a".
Many articles and answers indicate that using the above query there is no way that a row will be inserted twice.
For the execution of the above query, I assume that the database engine works as follow:
- The subquery is executed first.
- The database engine sets a shared(s) lock on a range.
- The data is read.
- The shared lock is released. According to MSDN a shared lock is released as soon as the data has been read.
- If a row does not exist it inserts a new line in the table.
- The new line is locked with an exclusive lock (x)
Now consider the following scenario:
- The above query is executed by processor A (SPID 1).
- The same query is executed by a processor B (SPID 2).
- [SPID 1] The database engine sets a shared(s) lock
- [SPID 1] The subquery reads the data. Now rows are returned.
- [SPID 1] The shared lock is released.
- [SPID 2] The database engine sets a shared(s) lock
- [SPID 2] The subquery reads the data. No rows are return.
- [SPID 2] The shared lock is released.
- Both processes proceed with a row insertion (and we get a duplicate entry).
Am I missing something? Is the above way a correct way for avoiding duplicate entries?
A safe way to avoid duplicate entries is using the code below, but I am just wondering whether the above method is correct.
begin tran
if (SELECT 1 FROM t3 with (updlock) where t3.a=-86)
begin
INSERT INTO T3
SELECT -86,-86
end
commit