views: 177

answers: 4

In my answer to this SO question, I suggest using a single INSERT statement with a SELECT that increments the version number, as shown below.

INSERT INTO VersionTable
  (Id, VersionNumber, Title, Description, ...)
SELECT @ObjectId, MAX(VersionNumber) + 1, @Title, @Description, ...
FROM VersionTable
WHERE Id = @ObjectId
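
For reference, a minimal sketch of the table involved (the column types here are illustrative, not from the original):

CREATE TABLE VersionTable
(
    Id            int           NOT NULL,  -- id of the versioned object
    VersionNumber int           NOT NULL,  -- 1, 2, 3, ... per object
    Title         nvarchar(100) NOT NULL,
    Description   nvarchar(max) NULL
    -- ... remaining columns elided
)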

I suggested this because I believe the statement is safe in terms of concurrency: if another insert for the same object id runs at the same time, there is no chance of duplicate version numbers.

Am I correct?

+1  A: 

I think your assumption is incorrect. When you query the VersionNumber table, you are only putting a read lock on the row. This does not prevent other users from reading the same row from the same table. Therefore, it is possible for two processes to read the same row in the VersionNumber table at the same time and generate the same VersionNumber value.

Randy Minder
Sorry Randy, I had a typo in my SQL. The insert and the select are on the same table. Does that change your answer?
David Hall
@David - My answer doesn't change. The row being read from will still only have a read lock on it, not a write lock. Therefore, it is still possible that two processes could read the same row at the same time and generate the same ID value. To make this safe, I believe you need to set the isolation level to Serializable (or Repeatable Read) to prevent this from happening.
Randy Minder
+3  A: 

The default isolation level of READ COMMITTED makes this unsafe: if two of these run in perfect parallel, you will get a duplicate, since no read lock is held.

You need REPEATABLE READ or SERIALIZABLE isolation levels to make it safe.
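
As a sketch, the SERIALIZABLE variant would look like this (per the comment below, REPEATABLE READ turned out not to be enough in testing):

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE

BEGIN TRANSACTION

INSERT INTO VersionTable
  (Id, VersionNumber, Title, Description, ...)
SELECT @ObjectId, MAX(VersionNumber) + 1, @Title, @Description, ...
FROM VersionTable
WHERE Id = @ObjectId

COMMIT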

Paul Creasey
The insert can't run in parallel with the select, because it uses data from the select
Andomar
I mean two users concurrently executing the query.
Paul Creasey
`SERIALIZABLE` works, but `REPEATABLE READ` is definitely not enough (still causes duplicates in my test code).
Heinzi
A: 
  • You need a unique constraint on (Id, VersionNumber) to enforce it

  • I'd use ROWLOCK, XLOCK hints to stop other folk reading the locked row where you calculate the new version number

  • or wrap the INSERT in a TRY/CATCH. If I get a duplicate, try again... (sketched after this list)
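
A sketch of the constraint-plus-retry combination (the loop is illustrative, not gbn's code; the elided columns are left out):

ALTER TABLE VersionTable
  ADD CONSTRAINT UQ_VersionTable_Id_Version UNIQUE (Id, VersionNumber)

DECLARE @done bit = 0
WHILE @done = 0
BEGIN
  BEGIN TRY
    INSERT INTO VersionTable (Id, VersionNumber, Title, Description)
    SELECT @ObjectId, MAX(VersionNumber) + 1, @Title, @Description
    FROM VersionTable
    WHERE Id = @ObjectId

    SET @done = 1
  END TRY
  BEGIN CATCH
    -- 2627 = unique constraint violation: a concurrent insert won the
    -- race, so loop and recompute the version number
    IF ERROR_NUMBER() <> 2627
    BEGIN
      SET @done = 1
      RAISERROR('unexpected error during versioned insert', 16, 1)
    END
  END CATCH
END

Note that the duplicate computation can still happen under the default isolation level; the unique constraint just turns it into a catchable error instead of bad data.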

gbn
+8  A: 

As Paul writes: No, it's not safe, and I would like to add empirical evidence: Create a table Table_1 with one field ID and one record with value 0. Then execute the following code *simultaneously in two Management Studio query windows*:
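
That setup, as a script:

CREATE TABLE Table_1 (ID int)
INSERT INTO Table_1 (ID) VALUES (0)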

declare @counter int
set @counter = 0
while @counter < 1000
begin
  set @counter = @counter + 1

  INSERT INTO Table_1
    SELECT MAX(ID) + 1 FROM Table_1 

end

Then execute

SELECT ID, COUNT(*) FROM Table_1 GROUP BY ID HAVING COUNT(*) > 1

On my SQL Server 2008, one ID (662) was created twice. Thus, the default isolation level applied to single statements is not sufficient.


EDIT: Clearly, wrapping the INSERT with BEGIN TRANSACTION and COMMIT won't fix it, since the default isolation level for transactions is still READ COMMITTED, which is not sufficient. Note that setting the transaction isolation level to REPEATABLE READ is also not sufficient. The only way to make the above code safe is to add

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE

at the top. This, however, caused deadlocks every now and then in my tests.
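
Spelled out, that safe-but-deadlock-prone variant is the same loop with one line added at the top:

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE

declare @counter int
set @counter = 0
while @counter < 1000
begin
  set @counter = @counter + 1

  INSERT INTO Table_1
    SELECT MAX(ID) + 1 FROM Table_1

end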

EDIT: The only solution I found that is safe and does not produce deadlocks (at least in my tests) is to explicitly lock the table exclusively (the default transaction isolation level is sufficient here). Beware though; this solution might kill performance:

...loop stuff...
    BEGIN TRANSACTION

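    -- dummy SELECT: touches no rows, but takes an exclusive table lock
    -- (TABLOCKX) and holds it until COMMIT (HOLDLOCK)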
    SELECT * FROM Table_1 WITH (TABLOCKX, HOLDLOCK) WHERE 1=0

    INSERT INTO Table_1
      SELECT MAX(ID) + 1 FROM Table_1 

    COMMIT
...loop end...
Heinzi
Thanks for this Heinzi, it looks quite definitive - I've been trying to think of a way of testing this myself, hadn't thought of using a loop.
David Hall
Oh - and there is +1 on the way from me tomorrow, I've run out of upvotes for today.
David Hall
I do not understand why this loop would produce duplicates; it certainly should not, imo. I'm going to test this!
Paul Creasey
@Paul: Why shouldn't it?
Heinzi
There is no parallelism or concurrency there, just a thousand sequential transactions occurring synchronously.
Paul Creasey
Why no concurrency? Both Management Studio query windows are executed in parallel...
Heinzi
oh, I totally missed the two windows part lol, major fail! Apologies!
Paul Creasey
No problems; thanks for the feedback. I've emphasized it now to avoid misunderstanding in the future.
Heinzi
Ah yes, REPEATABLE READ will indeed not work, since it only locks the rows that were actually read and does not prevent phantom inserts into the range. I'm surprised by the deadlocks though.
Paul Creasey
Me too. I thought deadlocks only occur when locks are placed by parallel executions *in different order*. I wonder how this could happen here.
Heinzi
Under the SERIALIZABLE model, a deadlock occurs like this: T1 selects MAX(...) and as a result places a range-S lock from BOF to EOF (i.e. the entire table, including the virtual start and end). T2 selects MAX(...) and also places the same range-S lock from BOF to EOF. T1 attempts to insert and blocks behind T2's range lock. T2 attempts to insert and blocks behind T1's range lock. Deadlock.
Remus Rusanu
@Remus, thanks for that, makes sense.
Paul Creasey
Btw, if there is an index on the version column, the range lock will only cover the last version number and the virtual end of the range, but that is enough: the insert conflicts with this range, so the deadlock still occurs.
Remus Rusanu
@Remus, thanks from me as well. So I guess an UPDLOCK or XLOCK hint on the SELECT might solve the deadlock problem...
Heinzi
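
For completeness, here is a sketch of that last, untested suggestion; the UPDLOCK hint takes the range lock in update mode, which two sessions cannot hold at once, so the second session waits instead of deadlocking:

BEGIN TRANSACTION

  -- UPDLOCK: range lock taken in update mode, serializing concurrent
  -- readers of the range; HOLDLOCK gives SERIALIZABLE (range) semantics
  -- for this table without changing the session isolation level
  INSERT INTO Table_1
    SELECT MAX(ID) + 1 FROM Table_1 WITH (UPDLOCK, HOLDLOCK)

COMMIT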