I have a table (JobTable) with 2 columns: JobId and JobStatus (there are others, but I'm excluding them as they have no impact on the question).

A process, the WorkGenerator, INSERTs rows into the table. Another process, the Worker, executes a stored procedure called GetNextJob.

Right now, GetNextJob does a SELECT to find the next piece of work (JobStatus = 1) and then an UPDATE to mark that work as in-progress (JobStatus = 2).

We're looking to scale up by having multiple Worker processes but have found that it's possible for multiple workers to pick up the same piece of work.

I have the following queries:

  • In GetNextJob, can I combine the SELECT and UPDATE into a single query and use the OUTPUT clause to get the JobId?
  • How can I guarantee that only 1 process will pick up each piece of work?

I appreciate answers that work, but also explanations as to why they work.

+2  A: 

Let's build up a solution:

Ensure the UPDATE checks the @@ROWCOUNT

Inspect @@ROWCOUNT after the UPDATE to determine which Worker process wins.

CREATE PROCEDURE [dbo].[GetNextJob] 
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @jobId INT

    SELECT TOP 1 @jobId = Jobs.JobId FROM Jobs
    WHERE Jobs.JobStatus = 1
    ORDER BY JobId ASC

    UPDATE Jobs Set JobStatus = 2
    WHERE JobId = @jobId
    AND JobStatus = 1;

    IF (@@ROWCOUNT = 1)
    BEGIN
        SELECT @jobId;
    END
END

GO

Note that with the above procedure the process that does not win does not return any rows and needs to call the procedure again to get the next row.
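
As a rough illustration of that retry, here is a minimal sketch of a caller that tries again when it gets no row back. The table variable @claimed, the counter @attempt and the cap of 3 attempts are made up for the example; a real Worker would more likely do this loop in its own application code. Note that an empty result can also mean there is simply no work left, which this sketch does not distinguish.

```sql
-- Sketch only: @claimed, @attempt and the retry cap are hypothetical names/values.
DECLARE @claimed TABLE (JobId INT);
DECLARE @attempt INT = 0;

-- Call GetNextJob until it returns a job or we give up.
WHILE @attempt < 3 AND NOT EXISTS (SELECT 1 FROM @claimed)
BEGIN
    -- Capture the procedure's result set (zero or one row).
    INSERT INTO @claimed (JobId)
    EXEC dbo.GetNextJob;

    SET @attempt += 1;
END

-- Empty if every attempt lost the race or there was no pending work.
SELECT JobId FROM @claimed;
```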

The guarded UPDATE in this procedure will fix most of the cases where both Workers pick up the same piece of work, because the UPDATE only succeeds while JobStatus is still 1. However, it's possible for @@ROWCOUNT to be 1 for both workers for the same jobId!

Lock the row within a transaction so only 1 Worker can update the Status

CREATE PROCEDURE [dbo].[GetNextJob] 
AS
BEGIN
    SET NOCOUNT ON;

    BEGIN TRANSACTION

        DECLARE @jobId INT

        SELECT TOP 1 @jobId = Jobs.JobId FROM Jobs WITH (UPDLOCK, ROWLOCK)
        WHERE Jobs.JobStatus = 1
        ORDER BY JobId ASC

        UPDATE Jobs Set JobStatus = 2
        WHERE JobId = @jobId
        AND JobStatus = 1;

        IF (@@ROWCOUNT = 1)
        BEGIN
            SELECT @jobId;
        END

    COMMIT
END

GO

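This version uses both UPDLOCK and ROWLOCK. UPDLOCK on the SELECT tells MSSQL to lock the row as if it were being updated and to hold that lock until the transaction is committed, so a second Worker running the same SELECT has to wait. ROWLOCK probably isn't strictly necessary, but it asks MSSQL to lock only the row returned by the SELECT rather than a whole page or the table.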

Optimising the locking

When one process holds the lock on a row, other processes are blocked waiting for that lock to be released. The READPAST hint can be specified to avoid this. From MSDN:

When READPAST is specified, both row-level and page-level locks are skipped. That is, the Database Engine skips past the rows or pages instead of blocking the current transaction until the locks are released.

This will stop the other processes from being blocked and improve performance.

CREATE PROCEDURE [dbo].[GetNextJob] 
AS
BEGIN
    SET NOCOUNT ON;

    BEGIN TRANSACTION
        DECLARE @jobId INT

        SELECT TOP 1 @jobId = Jobs.JobId FROM Jobs WITH (UPDLOCK, READPAST)
        WHERE Jobs.JobStatus = 1
        ORDER BY JobId ASC

        UPDATE Jobs Set JobStatus = 2
        WHERE JobId = @jobId
        AND JobStatus = 1;

        IF (@@ROWCOUNT = 1)
        BEGIN
            SELECT @jobId;
        END

    COMMIT
END

GO

To Consider: Combine SELECT and UPDATE

Combine the SELECT and UPDATE and use a SET to get the ID out.

For example:

DECLARE @val INT

-- Claim the lowest pending JobId and capture it in @val in one statement
UPDATE JobTable
SET @val = JobId,
    JobStatus = 2
WHERE JobId = (SELECT MIN(JobId) FROM JobTable WHERE JobStatus = 1)

SELECT @val

This still requires the transaction to be SERIALIZABLE to ensure that each row is allocated to one Worker only.
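
A minimal sketch of what running that statement under SERIALIZABLE could look like, assuming the JobTable, JobId and JobStatus names from the question. Resetting the isolation level at the end is my own addition for when this runs as an ad hoc batch, and I have not measured how this behaves under contention (SERIALIZABLE can increase blocking and deadlocks).

```sql
-- Sketch only: assumes JobTable(JobId, JobStatus) with 1 = pending, 2 = in progress.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

DECLARE @val INT;

BEGIN TRANSACTION;

    -- Find the lowest pending job and claim it in a single statement.
    UPDATE JobTable
    SET @val = JobId,
        JobStatus = 2
    WHERE JobId = (SELECT MIN(JobId) FROM JobTable WHERE JobStatus = 1);

COMMIT;

-- @val stays NULL when no pending job was found.
SELECT @val;

-- Restore the default level for the rest of the connection.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
```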

To Consider: Combine SELECT and UPDATE again

Combine the SELECT and UPDATE and use the OUTPUT clause to return the JobId.
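
One way this might look, as a sketch rather than a tested solution: an UPDATE through a common table expression that picks the next pending row, with OUTPUT returning the claimed JobId. It assumes the Jobs table and status codes used in the procedures above; the CTE name NextJob is just an illustrative choice, and the hints are carried over from the READPAST version.

```sql
-- Sketch only: assumes Jobs(JobId, JobStatus) with 1 = pending, 2 = in progress.
WITH NextJob AS (
    -- Pick the lowest pending job, skipping rows other Workers have locked.
    SELECT TOP (1) JobId, JobStatus
    FROM Jobs WITH (UPDLOCK, READPAST, ROWLOCK)
    WHERE JobStatus = 1
    ORDER BY JobId ASC
)
UPDATE NextJob
SET JobStatus = 2
-- Return the claimed JobId to the caller; no rows come back if nothing was pending.
OUTPUT inserted.JobId;
```

Because the select and the update are a single statement, there is no gap between reading the row and marking it, and no separate @@ROWCOUNT check is needed: the Worker either gets one JobId back or an empty result.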

Iain
You might find this of interest http://rusanu.com/2010/03/26/using-tables-as-queues/
Martin Smith
Also you might want to batch your edits a bit or you'll [edit this answer into community wiki status](http://meta.stackoverflow.com/questions/57971/my-answer-got-converted-to-community-wiki/57972#57972) before you know it!
Martin Smith
Thanks Martin, wish there was a 'save draft' feature!
Iain