I have a database table containing a queue of work items processed concurrently by multiple readers, with these requirements:

  • Each item should be processed by one reader only.
  • If a reader fails for any reason, the item should be returned to the queue for another reader to process.

Here is the pseudocode of a possible solution; each reader would do the following:


1) Read next item from queue and store it locally somewhere.

2) START TRANSACTION

3) Delete the item, to prevent other readers from seeing it. If the deletion fails, another reader has already pulled the item, so go back to step 1.

4) Process the item (stored locally in step 1). This could be long running.

5) COMMIT TRANSACTION: the item deletion is committed; go to step 1 to process the next item

OR

6) ROLLBACK TRANSACTION (explicit rollback or reader failure): the item deletion is rolled back and the item returns to the queue for another reader
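
For illustration, here is a minimal T-SQL sketch of these steps (the table name queue and the column names queueid and payload are assumptions for the example, not an actual schema):

DECLARE @id INT, @payload VARCHAR(MAX)

-- Step 1: read the next item and keep a local copy
SELECT TOP (1) @id = queueid, @payload = payload
FROM queue
ORDER BY queueid

BEGIN TRANSACTION  -- Step 2

DELETE FROM queue WHERE queueid = @id  -- Step 3
IF @@ROWCOUNT = 0
BEGIN
    -- another reader already pulled the item; go back to step 1
    ROLLBACK TRANSACTION
END
ELSE
BEGIN
    -- Step 4: process @payload (possibly long running)
    COMMIT TRANSACTION  -- Step 5; a rollback or reader failure is step 6
END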


My question is: what is the lowest isolation level I need to ensure that after step 3 (item deletion), other readers cannot see it? By lowest, I mean I want to maximize concurrency while maintaining integrity.

Note that I'm using SQL Server 2005, if it matters (I think this should be product-agnostic, right?)

Any other feedback on this approach in general is welcome.

+1  A: 

An alternative method for handling this that I've used is:

  1. Each work item has a column for State and another for ReaderID. State can be 'N'ew, 'A'ctive, or 'C'omplete. (I also use 'E'rror.)

  2. Readers each have their own ID.

  3. A Reader's first action is to claim the first queued item in state 'N' with a single SQL statement that sets the state to 'A' and the ReaderID to its own ID.

  4. The Reader then processes the work item, and when done, sets the state to 'C'.

For your purposes, you avoid issues with isolation by having the checkout action take place in one statement.

The SQL would be something like:

UPDATE queue
SET State = 'A', ReaderID = @myWorkerID
WHERE queueid = (SELECT MIN(queueid) FROM queue WHERE state = 'N')
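
After the checkout, the reader can fetch the row it just claimed and mark it complete when done; a minimal sketch (the payload column name is an assumption for illustration):

SELECT queueid, payload
FROM queue
WHERE ReaderID = @myWorkerID AND State = 'A'

-- ...and when processing finishes:
UPDATE queue SET State = 'C' WHERE ReaderID = @myWorkerID AND State = 'A'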

le dorfier
This is flawed using READ COMMITTED (the default)
Mitch Wheat
I think we've had this conversation. I still hold it's not flawed. Have you tested it? I have.
le dorfier
@le dorfier: I have voted down for now but will talk to some SQL experts, and get back to you...
Mitch Wheat
...instead I have removed downvote, will ask the question and then get back to you. I firmly believe you are wrong, but I'm willing to learn...
Mitch Wheat
I am too. The only thing worse than not knowing is "knowing" something that's wrong, and I'm as susceptible as anyone.
le dorfier
Mitch - the fundamental question I would have is, it seems to me that you are asserting that either a) Microsoft SQL Server statements and transactions do not comply, by default, with ACID requirements; or b) the process I describe above does not constitute transactions for ACID purposes. Yes/No?
le dorfier
With this approach... assuming it avoids the race condition between readers (which I gather is what's in contention here)... I would need a cleanup process for orphaned items (readers that pull an item and then crash)....
DSO
With my approach, when a reader fails, the DB will roll back the transaction automatically and the deleted item will return to the queue.
DSO
The difference is that you are holding a transaction open for the entire (long) period of the work, which is in general not a good thing. And if it's something about the work item that caused the reader to fail, you're just going to fail again on the next reader.
le dorfier
le dorfier: I like your approach of using a State column and a single statement. But you and Mitch seem to disagree about whether it will work, and that concerns me. Is the expected behavior formally documented somewhere? I don't think you can conclusively determine this via testing alone.
DSO
I come at the question from another direction than isolation level. I start from the fact that SQL Server asserts to support ACID transactions by default (see http://msdn.microsoft.com/en-us/library/ms190612(SQL.90).aspx).
le dorfier
From that description, and several years experience, (and also the belief that any SQL database would be fatally flawed if it didn't work that way), I think what I've described must work by default.
le dorfier
I once or twice have worked my way through the isolation level discussion and worked out what it all means, but it's been quite a while. I find that working from the assumption that ACID applies by default is sufficient, so I don't mess with isolation levels.
le dorfier
And I've seen others bitten by overaggressive adjustment (in either direction) that they didn't understand thoroughly. (But that still doesn't mean I'm absolutely sure I know I'm right, and why.)
le dorfier
I'm not convinced... it seems the "Isolation" part of ACID you are quoting is simply a textbook definition. In practice, databases offer different degrees of isolation, because you would have poor concurrency if you strictly followed the textbook definition.
DSO
If you choose the Serializable isolation level in SQL Server, I think that would meet the textbook definition of the I in ACID... however most people would be willing to trade off strict isolation in favor of performance.
DSO
+1  A: 

If, once you have removed the item from the 'Queue' mechanism, no process can access it, you don't need to keep the transaction open while you actually process the item.

REPEATABLE READ isolation level:

Specifies that statements cannot read data that has been modified but not yet committed by other transactions and that no other transactions can modify data that has been read by the current transaction until the current transaction completes.

Shared locks are placed on all data read by each statement in the transaction and are held until the transaction completes. This prevents other transactions from modifying any rows that have been read by the current transaction. Other transactions can insert new rows that match the search conditions of statements issued by the current transaction. If the current transaction then retries the statement it will retrieve the new rows, which results in phantom reads. Because shared locks are held to the end of a transaction instead of being released at the end of each statement, concurrency is lower than the default READ COMMITTED isolation level. Use this option only when necessary.
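
As a sketch only, a reader could set this level on its connection and, as suggested above, process the item after committing rather than inside the transaction (@id and queueid are illustrative names, not from the question):

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ

BEGIN TRANSACTION
    DELETE FROM queue WHERE queueid = @id  -- remove the item from the queue
COMMIT TRANSACTION

-- process the locally stored item outside the transaction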

Mitch Wheat
I'm holding the transaction open in case the reader fails for some reason, in which case I want the deletion to be undone so another reader can process it.
DSO
Why would READ COMMITTED not work? The MSDN link you sent says that with READ COMMITTED, statements cannot read data that has been modified but not committed by other transactions. Doesn't this mean that when a reader does a delete in a transaction, no other readers can read it?
DSO
A: 

If you use a row-level lock and the READPAST hint in your query, you can select just the rows that are not locked.

It looks like this:

SELECT queue FROM queue WITH (ROWLOCK, XLOCK, READPAST) WHERE state = 'N'

I suggest that you read up on locking hints in SQL Server, which will greatly assist you.
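
For illustration, here is a rough sketch of how these hints could fit the delete-based scheme in the question; the queueid and payload names are assumptions, and this combines the hints with the questioner's approach rather than reproducing a tested solution:

DECLARE @id INT, @payload VARCHAR(MAX)

BEGIN TRANSACTION

-- XLOCK holds an exclusive lock on the selected row until the transaction ends;
-- READPAST makes other readers skip locked rows instead of blocking on them
SELECT TOP (1) @id = queueid, @payload = payload
FROM queue WITH (ROWLOCK, XLOCK, READPAST)
ORDER BY queueid

DELETE FROM queue WHERE queueid = @id

-- process @payload, then COMMIT; a ROLLBACK (or reader failure) returns the item to the queue

COMMIT TRANSACTION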

Middletone
Unfortunately, for a lot of people it seems to also confuse you.
le dorfier