views:

442

answers:

4

I've got a stored procedure, which selects 1 record back. the stored procedure could be called from several different applications on different PCs. The idea is that the stored procedure brings back the next record that needs to be processed, and if two applications call the stored proc at the same time, the same record should not be brought back. My query is below, I'm trying to write the query as efficiently as possible (sql 2008). Can it get done more efficiently than this?

CREATE PROCEDURE GetNextUnprocessedRecord
AS
BEGIN
    SET NOCOUNT ON;

    --ID of record we want to select back
    DECLARE @iID BIGINT     

    -- Find the next processable record, and mark it as dispatched
    -- Must be done in a transaction to ensure no other query can get
    -- this record between the read and update
    BEGIN TRAN

     SELECT TOP 1
      @iID = [ID]
     FROM
      --Don't read locked records, only lock the specific record
      [MyRecords] WITH (READPAST, ROWLOCK)
     WHERE
      [Dispatched] is null
     ORDER BY
      [Received]

     --Mark record as picked up for processing 
     UPDATE 
      [MyRecords]
     SET
      [Dispatched] = GETDATE()
     WHERE
      [ID] = @iID  

    COMMIT TRAN

    --Select back the specific record
    SELECT 
     [ID],
     [Data]
    FROM 
     [MyRecords] WITH (NOLOCK, READPAST)
    WHERE
     [ID] = @iID

END
A: 

It is recommended for high throughput in a highly transactional database system, that a locking mechanism be placed in a separate table, rather than directly in the active table.

Mitch Wheat
I'm not sure what you mean by this, can you elaborate?
Jeremy
downvoting me provides little encouragement to expand!
Mitch Wheat
Sorry, I actually didn't do the downvote downvote. If I did, I'd give an comment explaining why.
Jeremy
A: 

You can assign each picker process a unique id, and add columns pickerproc and pickstate to your records. Then

UPDATE MyRecords
SET pickerproc = myproc,
pickstate = 'I' -- for 'I'n process
WHERE Id = (SELECT MAX(Id) FROM MyRecords WHERE pickstate = 'A') -- 'A'vailable

That gets you your record in one atomic step, and you can do the rest of your processing at your leisure. Then you can set pickstate to 'C'omplete', 'E'rror, or whatever when it's resolved.

I think Mitch is referring to another good technique where you create a message-queue table and insert the Ids there. There are several SO threads - search for 'message queue table'.

le dorfier
that is definitely not safe!
Mitch Wheat
And why would that be? It's not going to hand off the same MAX(Id) with pickstate 'A' until after you've changed it. At least it hasn't in my experience based on probably 10^6+ instances.
le dorfier
..and it won't be efficient either
Mitch Wheat
I'm listening ... what's your argument? I can always learn more about SQL ...
le dorfier
@le dorfier: to make that transactionally consistent, it would need to be wrapped in a SERIALIZABLE transaction. Otherwise, there is a risk of dirty reads/non-repeatable reads/phantom reads/
Mitch Wheat
unless I'm mistaken, the MAX operation is going to scan rows. If this is a big table, it would be holding a tranasction open for a relatively long time.
Mitch Wheat
I'd hate to have to figure out that turgid lock doc again, but I start for the assumption that ACID applies by default. You disagree?
le dorfier
(Upon reflection) I see your point (i.e. the low cardinality of the pickstate column). But it's resolved by the "filtered index" available in SQL 2008. See http://blog.sqlauthority.com/2008/09/01/sql-server-2008-introduction-to-filtered-index-improve-performance-with-filtered-index/
le dorfier
A: 

You can keep MyRecords on a "MEMORY" table for faster processing.

Nir
The table could get quite large, plus, what happens if the server crashes?
Jeremy
+2  A: 

Using the READPAST locking hint is correct and your SQL looks OK.

I'd add use XLOCK though which is also HOLDLOCK/SERIALIZABLE

...
[MyRecords] WITH (READPAST, ROWLOCK, XLOCK)
...

This means you get the ID, and exclusively lock that row while you carry on and update it.

Edit: add an index on Dispatched and Received columns to make it quicker. If [ID] (I assume it's the PK) is not clustered, INCLUDE [ID]. And filter the index too because it's SQL 2008

You could also use this construct which does it all in one go without XLOCK or HOLDLOCK

UPDATE
    MyRecords
SET
    --record the row ID
    @id = [ID],
    --flag doing stuff
    [Dispatched] = GETDATE()
WHERE
    [ID] = (SELECT TOP 1 [ID] FROM MyRecords WITH (ROWLOCK, READPAST) WHERE Dispatched IS NULL ORDER BY Received)

UPDATE, assign, set in one

gbn
Wonderfull tips! I've added the XLOCK hint and created a non clustered index on Received and Dispatched, and included ID with a filter of "Dispatched IS NULL". This is in addition to my clustered index on ID. I'm still trying to figure out if the Update assign set in one is faster or not.
Jeremy
Thank you. I'd worry more about transactional integrity than outright performance. The single statement removes the need for XLOCK.
gbn