views:

65

answers:

2

I have a question about SQL and locking strategies. As an example, suppose I have a view counter for the images on my website. If I have a sproc or similar to perform the following statements:

START TRANSACTION;
UPDATE images SET counter=counter+1 WHERE image_id=some_parameter;
COMMIT;

Assume that the counter for a specific image_id has value '0' at time t0. If two sessions updating the same image counter, s1 and s2, start concurrently at t0, is there any chance that these two sessions both read the value '0', increase it to '1' and both try to update the counter to '1', so the counter will get value '1' instead of '2'?

s1: begin
s1: begin
s1: read counter for image_id=15, get 0, store in temp1
s2: read counter for image_id=15, get 0, store in temp2
s1: write counter for image_id=15 to (temp1+1), which is 1 
s2: write counter for image_id=15 to (temp2+1), which is also 1
s1: commit, ok
s2: commit, ok

End result: incorrect value '1' for image_id=15, should have been 2.

My questions are:

  1. Is this scenario possible?
  2. If so, does the transaction isolation level matter?
  3. Is there a conflict resolver which would detect such a conflict as an error?
  4. Can one use any special syntax in order to avoid a problem (something like Compare And Swap (CAS) or explicit locking techniques)?

I'm interested in a general answer, but if there are none I'm interested in MySql and InnoDB-specific answers, since I'm trying to use this technique to implement sequences on InnoDB.

EDIT: The following scenario could also be possible, resulting in the same behavior. I'm assuming that we are in isolation level READ_COMMITED or higher, so that s2 gets the value from the start of the transaction although s1 already wrote '1' to the counter.

s1: begin
s1: begin
s1: read counter for image_id=15, get 0, store in temp1
s1: write counter for image_id=15 to (temp1+1), which is 1 
s2: read counter for image_id=15, get 0 (since another tx), store in temp2
s2: write counter for image_id=15 to (temp2+1), which is also 1
s1: commit, ok
s2: commit, ok
+2  A: 

If the locking is not done properly it certainly is possible to get this type race condition, and the default locking mode (read committed) does allow it. In this mode, the reads only place a shared lock on the record, so they can both see 0, increment it and write 1 out to the database.

In order to avoid this race condition, you need to set an exclusive lock on the read operation. 'Serializable' and 'Repeatable Read' concurrency modes will do this, and for an operation on a single row they are pretty much equivalent.

To make it completely atomic you have to:

  • Set an appropriate transaction isolation level such as Serializable. Normally you can do this from a client library or explicilty in SQL.
  • Begin the transaction
  • Read the data
  • Update it
  • Commit the transaction.

You can also force an exclusive lock on the read with a HOLDLOCK (T-SQL) or equivalent hint, depending on your SQL dialect.

A single update query will do this atomically but you can't split the operation (perhaps to read the value and return it to the client) without ensuring that the reads take out an exclusive lock. You will need to get the value out atomically in order to implement a sequence, so the update by itself is probably not quite all you need. Even with the atomic update, you still have a race condition to read the value after the update. The read will still have to take place within a transaction (storing what it got in a variable) and issue an exclusive lock during the read.

Note that to do this without creating a hot spot your database needs to have proper support for autonomous (nested) transactions within a stored procedure. Note that sometimes 'nested' is used to refer to chaining transactions or save points, so the term can be a bit confusing. I've edited this to refer to autonomous transactions.

Without autonomous transactions your locks are inherited by the parent transaction (which can roll back the whole lot). This means they will be held until the parent transaction commits, which can turn your sequence into a hot spot that serialises all transactions using that sequence. Anything else trying to use the sequence will block until the whole parent transaction commits.

IIRC Oracle supports autonomous transactions but DB/2 didn't until fairly recently and SQL Server doesn't. Off the top of my head I don't know whether InnoDB supports them, but Grey and Reuter go on at some length about how difficult they are to implement. In practice I'd guess it's quite likely that it might not. YMMV.

ConcernedOfTunbridgeWells
With the single `UPDATE` query, it's not possible to get this race condition in any of the major systems that support transactions, no matter which transaction isolation level is used.
Quassnoi
If I understand the problem correctly, having isolation is not enough, SERIALIZABLE might be, but maybe not snapshot isolation (http://en.wikipedia.org/wiki/Snapshot_isolation), since I'm not sure about the definition of 'conflict' in this case. Are the two ones in conflict or not? And even if you have an exclusive lock on the row during the update, what if s2:s update comes between s1:s update and commit? What will s2 read then? Arguably 0. The only way I can see to ensure the correct behavior in this case would be for s1 to hold the exclusive lock until after its commit.
disown
@Quassnoi: What does another transaction read after the update and before the commit then? If you have for instance READ_COMMITED, letting the other transaction see value '1' would be wrong (would be peeking into the other transaction). '0' is the only other logical value that the other concurrent transaction should be able to see. So either this race condition should be able to occur (and possibly result in a fault), or the db needs to somehow serialize the transactions.
disown
@disown: if `s1` holds the exclusive lock, `s2` cannot update.
Quassnoi
S1 has to block S2 at the read stage in order to prevent S1 and S2 from both seeing 0. The only way to do this is for S1 to take out an exclusive (intent to modify) lock during the read, which requires a Serializable or Repeatable Read isolation level or HOLDLOCK or equivalent hint on the read. Even if wrapped in a transaction a shared lock will not prevent a race condition where S1 and S2 could both read 0 before S1 has written out the updated value.
ConcernedOfTunbridgeWells
ConcernedOfTunbridgeWells
@disown: after update, another transaction will wait for the first transaction to commit.
Quassnoi
@Concerned: an `UPDATE` query does not place shared locks. It reads all data using update locks.
Quassnoi
The update query will lock the records regardless of the isolation level, but you will not be able to retrieve the data during it. i.e. you can't see what it was to display a page count.
ConcernedOfTunbridgeWells
@Concerned: retrieve data where?
Quassnoi
He wants to implement something like sequences, which implies a desire to get the current value of the sequence atomically. Just updating the record in situ isn't going to help him with this unless he can get the sequence number out and update it atomically. The update statement by itself is perhaps not quite the right example to go with his application.
ConcernedOfTunbridgeWells
@Concerned: nothing in the question mentions this desire. Anyway, `UPDATE` supports `RETURNING` clause in `SQL Server 2005` and above.
Quassnoi
@Quassinoi: Yes he does: 'since I'm trying to use this technique to implement sequences on InnoDB.' OTOH I don't know if InnoDB supports RETURNING. However, on SQL Server the inner transactions will still get snagged in the parent's context, which can turn them into a hot spot.
ConcernedOfTunbridgeWells
+1  A: 

UPDATE query places an update lock on the pages or records it reads.

When a decision is made whether to update the record, the lock is either lifted or promoted to the exclusive lock.

This means that in this scenario:

s1: read counter for image_id=15, get 0, store in temp1
s2: read counter for image_id=15, get 0, store in temp2
s1: write counter for image_id=15 to (temp1+1), which is 1 
s2: write counter for image_id=15 to (temp2+1), which is also 1

s2 will wait until s1 decides whether to write the counter or not, and this scenario is in fact impossible.

It will be this:

s1: place an update lock on image_id = 15
s2: try to place an update lock on image_id = 15: QUEUED
s1: read counter for image_id=15, get 0, store in temp1
s1: promote the update lock to the exclusive lock
s1: write counter for image_id=15 to (temp1+1), which is 1 
s1: commit: LOCK RELEASED
s2: place an update lock on image_id = 15
s2: read counter for image_id=15, get 1, store in temp2
s2: write counter for image_id=15 to (temp2+1), which is 2

Note that in InnoDB, DML queries do not lift the update locks from the records they read.

This means that in case of a full table scan, the records that were read but decided not to update, will still remain locked until the end of the transaction and cannot be updated from another transaction.

Quassnoi
Thanks for the great explanation. I think the key phrase here is 'commit: LOCK RELEASED'. This means that all transactions wanting to update the row needs to await the completion of the transaction holding the lock, effectively serializing all transactions contending for the row. Do you know how would this work in a multiple version concurrency db such as Postgres? Since it uses multiple versions I would assume that it will let the transactions progress independently and try to merge the results. Or does it employ the same strategy?
disown
Is there a minimum transaction isolation level necessary for s1 to place an update lock at the start as suggested by ConcernedOfTunbridgeWells in his answer?
disown
@disown: In `MVCC` which `PostgreSQL` uses there is no concept of locking at all. Instead, multiple versions of a record are stored, using transaction identifier as a marker. Locking only occurs when trying to modify a version in limbo.
Quassnoi
@disown: `DML` queries always place `UPDATE` locks to read the records, not regarding transaction isolation level (even if it is `READ UNCOMMITTED`). I'm talking about `SQL Server` now.
Quassnoi
@disown - the update is atomic at any isolation level, but if you read the value after the fact for your sequence you still get a race condition as the read is now not synchronised. You could potentially get both processes reading 2 as the current value.
ConcernedOfTunbridgeWells
Thank you both for great explanations and discussions. I can't accept both answers so I'll just settle for Quassnoi who explained how the update locking works.
disown