I have an insert statement that was deadlocking when run through LINQ, so I moved it into a stored procedure in case the surrounding statements were affecting it.

Now the stored procedure itself deadlocks. According to SQL Server Profiler, the insert statement is blocking itself: two instances of that insert were each waiting for the PK index to be freed.

Since I moved the code into the stored procedure, Profiler reports that the procedure has deadlocked with another instance of the same procedure.

Here is the code. The select statement is similar to the one LINQ generated when it ran its own query. I simply want to check whether the item exists and, if not, insert it. I can find the system by either the PK or by some lookup values.

       SET NOCOUNT ON;
       BEGIN TRY
        SET TRANSACTION ISOLATION LEVEL SERIALIZABLE

        BEGIN TRANSACTION SPFindContractMachine
        DECLARE @id int;
        set @id = (select [m].pkID from Machines as [m]
                        WHERE ([m].[fkContract] = @fkContract) AND ((
                        (CASE 
                            WHEN @bByID = 1 THEN 
                                (CASE 
                                    WHEN [m].[pkID] = @nMachineID THEN 1
                                    WHEN NOT ([m].[pkID] = @nMachineID) THEN 0
                                    ELSE NULL
                                 END)
                            ELSE 
                                (CASE 
                                    WHEN ([m].[iA_Metric] = @lA) AND ([m].[iB_Metric] = @lB) AND ([m].[iC_Metric] = @lC) THEN 1
                                    WHEN NOT (([m].[iA_Metric] = @lA) AND ([m].[iB_Metric] = @lB) AND ([m].[iC_Metric] = @lC)) THEN 0
                                    ELSE NULL
                                 END)
                         END)) = 1));
        if (@id IS NULL)
        begin
            Insert into Machines(fkContract, iA_Metric, iB_Metric, iC_Metric, dteFirstAdded) 
                values (@fkContract, @lA, @lB, @lC, GETDATE());

            set @id = SCOPE_IDENTITY();
        end

        COMMIT TRANSACTION SPFindContractMachine

        return @id;

    END TRY
    BEGIN CATCH
        if @@TRANCOUNT > 0
            ROLLBACK TRANSACTION SPFindContractMachine
    END CATCH
+2  A: 

Get rid of the transaction. It's not really helping you; instead, it is hurting you. Removing it should clear up your problem.

Russell McClure
+5  A: 

Any procedure that follows the pattern:

BEGIN TRAN
check if row exists with SELECT
if row doesn't exist INSERT
COMMIT

is going to run into trouble in production, because there is nothing to prevent two threads from doing the check simultaneously and both reaching the conclusion that they should insert. In particular, under the SERIALIZABLE isolation level (as in your case), this pattern is guaranteed to deadlock when two such transactions overlap and neither finds a row.

A much better pattern is to use database unique constraints and always INSERT, capturing duplicate key violation errors. This is also significantly more performant.
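
A minimal sketch of that pattern, assuming the Machines table from the question, that pkID is an identity column, and that (fkContract, iA_Metric, iB_Metric, iC_Metric) is meant to be unique. The constraint name UQ_Machines_Contract_Metrics and the procedure name usp_getOrCreateMachine are made up for illustration:

```sql
-- Assumption: these four columns together identify one machine.
ALTER TABLE Machines
    ADD CONSTRAINT UQ_Machines_Contract_Metrics
    UNIQUE (fkContract, iA_Metric, iB_Metric, iC_Metric);
GO

CREATE PROCEDURE usp_getOrCreateMachine  -- hypothetical name
    @fkContract int,
    @lA int,
    @lB int,
    @lC int,
    @id int OUTPUT
AS
BEGIN
    SET NOCOUNT ON;

    BEGIN TRY
        -- Always attempt the insert; the unique constraint decides
        -- which of two racing callers wins.
        INSERT INTO Machines (fkContract, iA_Metric, iB_Metric, iC_Metric, dteFirstAdded)
        VALUES (@fkContract, @lA, @lB, @lC, GETDATE());

        SET @id = SCOPE_IDENTITY();
    END TRY
    BEGIN CATCH
        -- 2627 = unique constraint violation, 2601 = duplicate key in a unique index.
        IF ERROR_NUMBER() NOT IN (2627, 2601)
        BEGIN
            -- Anything else is a real failure: re-raise it to the caller.
            DECLARE @msg nvarchar(2048);
            SET @msg = ERROR_MESSAGE();
            RAISERROR('%s', 16, 1, @msg);
            RETURN;
        END

        -- The row already existed, so look up its key instead.
        SELECT @id = pkID
        FROM Machines
        WHERE fkContract = @fkContract
          AND iA_Metric = @lA
          AND iB_Metric = @lB
          AND iC_Metric = @lC;
    END CATCH
END
GO
```

No explicit transaction or isolation level is needed here, since each statement is atomic on its own and the constraint rejects the second of two concurrent inserts.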

Another alternative is to use the MERGE statement:

create procedure usp_getOrCreateByMachineID
    @nMachineId int output,
    @fkContract int,
    @lA int,
    @lB int,
    @lC int,
    @id int output
as
begin
    declare @idTable table (id int not null);
    merge Machines as target
        using (values (@nMachineID, @fkContract, @lA, @lB, @lC, GETDATE()))
            as source (MachineID, ContractID, lA, lB, lC, dteFirstAdded)
    on (source.MachineID = target.MachineID)
    when matched then
        update set @id = target.MachineID
    when not matched then
        insert (ContractID, iA_Metric, iB_Metric, iC_Metric, dteFirstAdded)
        values (source.contractID, source.lA, source.lB, source.lC, source.dteFirstAdded)
    output inserted.MachineID into @idTable;
    select @id = id from @idTable;
end 
go

create procedure usp_getOrCreateByMetrics
    @nMachineId int output,
    @fkContract int,
    @lA int,
    @lB int,
    @lC int,
    @id int output
as
begin
    declare @idTable table (id int not null);
    merge Machines as target
        using (values (@nMachineID, @fkContract, @lA, @lB, @lC, GETDATE()))
            as source (MachineID, ContractID, lA, lB, lC, dteFirstAdded)
    on (target.iA_Metric = source.lA
        and target.iB_Metric = source.lB
        and target.iC_Metric = source.lC)
    when matched then
        update set @id = target.MachineID
    when not matched then
        insert (ContractID, iA_Metric, iB_Metric, iC_Metric, dteFirstAdded)
        values (source.contractID, source.lA, source.lB, source.lC, source.dteFirstAdded)
    output inserted.MachineID into @idTable;
    select @id = id from @idTable;
end 
go

This example separates the two cases, since a T-SQL query should never attempt to solve two different problems in one single statement (the result is never optimizable). Since the two tasks at hand (get by machine ID and get by metrics) are completely separate, they should be separate procedures, and the caller should call the appropriate one rather than passing a flag. This example shows how to achieve the (probably) desired result using MERGE, but of course a correct and optimal solution depends on the actual schema (table definition, indexes and constraints in place) and on the actual requirements (it is not clear what the procedure is expected to do if the criteria are already matched: not output an @id?).
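
As a rough usage sketch, a caller would pick one of the two procedures above and read the result from the @id OUTPUT parameter. The literal values below are made-up sample data, not values from the question:

```sql
DECLARE @machineId int, @id int;

-- Lookup by metrics: the caller chooses this procedure directly
-- instead of passing a @bByID flag.
EXEC usp_getOrCreateByMetrics
    @nMachineId = @machineId OUTPUT,
    @fkContract = 42,   -- sample contract key
    @lA = 1,            -- sample metric values
    @lB = 2,
    @lC = 3,
    @id = @id OUTPUT;

SELECT @id AS MachineID;

-- Lookup by machine ID: same parameter list, different procedure.
SET @machineId = 17;    -- sample machine key
EXEC usp_getOrCreateByMachineID
    @nMachineId = @machineId OUTPUT,
    @fkContract = 42,
    @lA = 1,
    @lB = 2,
    @lC = 3,
    @id = @id OUTPUT;

SELECT @id AS MachineID;
```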

By eliminating the SERIALIZABLE isolation, this is no longer guaranteed to deadlock, but it may still deadlock. Solving the deadlock is, of course, completely dependent on the schema, which was not specified, so a solution to the deadlock cannot actually be provided in this context. There is the sledgehammer of locking all candidate rows (forcing UPDLOCK or even TABLOCKX), but such a solution would kill throughput under heavy use, so I cannot recommend it without knowing the use case.

Remus Rusanu
how would I specify that column A, B, C should be unique per fkContract?
BabelFish
so I would make a constraint on fkContract, A, B, and C. Then my stored proc would essentially insert (ignore the error if it was already there) and then do a select to find the primary key. This would all happen without the transaction. is that more along the lines of what you were thinking if I use the constraint?
BabelFish
Your case is a bit more complex, as you not only want to INSERT, you also want to return the @id (as an OUTPUT value of course, never use return values in stored procs) of an already existing entry. First question is why did you add the SERIALIZABLE isolation level?
Remus Rusanu
I updated to show how MERGE could be used
Remus Rusanu
I did serializable because I thought that would provide a better locking mechanism to ensure only 1 proc would have access. I was wrong in my assumption. I think I misread some sites
BabelFish
with the merge solution would you still recommend the constraints for fkContract, A, B, and C? Why should you not use "return" in stored procedures?
BabelFish
To clarify, are you saying "never use return values" or "never use 'return' values, as in 'return @id;'"?
BabelFish
lastly can you explain how my orig SP was guaranteed to lock?
BabelFish
Serializable isolation guarantees that a read (a SELECT) will return exactly the same rows if it is run again later in the same transaction. So if two transactions T1 and T2 both run the original SELECT in your SP simultaneously, they are *both* guaranteed that this SELECT result cannot change. Say neither found any row, so both proceed to insert. Say T1 inserts first. If T1 succeeded in inserting and committing, and T2 then ran the SELECT again, it would see T1's insert. So T1 must not be able to insert, or it would break the serializable read done by T2...
Remus Rusanu
Now turn around and apply the very same logic to T2. Its INSERT cannot succeed and commit, because if it did, T1 repeating the SELECT would see T2's insert. So T1 is blocked by T2 and T2 is blocked by T1, hence the deadlock. The actual implementation of serializable isolation uses range locks, which is why you see Range-S and Range-N locks in your deadlocks, but that is just how it is implemented (2/2)
Remus Rusanu
About return values: the client-side APIs (ODBC, OleDB, ADO.Net, ADO, JDBC, PHP drivers etc.) vary vastly in how they support return values from procedures. The data access and ORM layers (nHibernate, Linq2SQL, EF etc.) also treat return values very differently, and there are frameworks that don't even give access to return values. Another problem is that data types in the DB change: an int ID can become bigint next year, or a decimal, but a procedure's return type is fixed (int) and cannot follow the change. Lastly, OUTPUT parameters *cannot* be ignored. All in all, return sucks
Remus Rusanu
Constraint: if possible, always express it in the database. So if (fkContract, A, B, C) are unique, add a unique constraint on them. Your code may be correct today and enforce the uniqueness itself, but tomorrow a new dev will work on it and make a mistake that adds duplicates, discovered only 3 months later when the sheets don't balance. Or some data may come from a new import job that writes straight into the tables, bypassing your procedure and ruining the uniqueness expectation.
Remus Rusanu
+1, awesome example of the MERGE statement!
Thorin
Question: could you remove the @idTable table, and simply end on "SELECT @id = MachineID FROM inserted"?
Thorin
+1  A: 

I wonder if adding an UPDLOCK hint to the earlier SELECT(s) would fix this; it should avoid some deadlock scenarios by preventing another spid from getting a read lock on the data you are about to mutate.
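
A minimal sketch of what that could look like for the lookup-by-metrics case, using the table and column names from the question. The sample values are made up, and pairing UPDLOCK with HOLDLOCK is an assumption added here so the fragment does not depend on the session's isolation level:

```sql
-- Stand-ins for the procedure parameters (sample values only).
DECLARE @fkContract int, @lA int, @lB int, @lC int, @id int;
SET @fkContract = 42;
SET @lA = 1;
SET @lB = 2;
SET @lC = 3;

BEGIN TRANSACTION;

-- UPDLOCK requests update locks instead of shared locks. Update locks are
-- not compatible with each other, so a second caller waits at this SELECT
-- rather than reading and then colliding on the INSERT.
-- HOLDLOCK keeps the key range locked until commit, even if no row matches.
SELECT @id = pkID
FROM Machines WITH (UPDLOCK, HOLDLOCK)
WHERE fkContract = @fkContract
  AND iA_Metric = @lA
  AND iB_Metric = @lB
  AND iC_Metric = @lC;

IF @id IS NULL
BEGIN
    INSERT INTO Machines (fkContract, iA_Metric, iB_Metric, iC_Metric, dteFirstAdded)
    VALUES (@fkContract, @lA, @lB, @lC, GETDATE());

    SET @id = SCOPE_IDENTITY();
END

COMMIT TRANSACTION;

SELECT @id AS MachineID;
```

How much this serializes callers depends on the indexes available: without an index covering the searched columns, the locked range may be much wider than the one row of interest.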

Marc Gravell
+2  A: 

How about this SQL? It moves the check for existing data and the insert into a single statement. This way, when two threads are running, they're not deadlocked waiting for each other. At worst, thread two is blocked waiting for thread one, but as soon as thread one finishes, thread two can run.

BEGIN TRY

  BEGIN TRAN SPFindContractMachine

  INSERT INTO Machines (fkContract, iA_Metric, iB_Metric, iC_Metric, dteFirstAdded)
    SELECT @fkContract, @lA, @lB, @lC, GETDATE()
      WHERE NOT EXISTS (
        SELECT * FROM Machines
        WHERE fkContract = @fkContract
        AND ((@bByID = 1 AND pkID = @nMachineID)
             OR
             (@bByID <> 1 AND iA_Metric = @lA AND iB_Metric = @lB AND iC_Metric = @lC)))

  DECLARE @id INT

  SET @id = (
    SELECT pkID FROM Machines
    WHERE fkContract = @fkContract
    AND ((@bByID = 1 AND pkID = @nMachineID)
         OR
         (@bByID <> 1 AND iA_Metric = @lA AND iB_Metric = @lB AND iC_Metric = @lC)))

  COMMIT TRAN SPFindContractMachine

  RETURN @id

END TRY
BEGIN CATCH
  IF @@TRANCOUNT > 0
    ROLLBACK TRAN SPFindContractMachine
END CATCH

I also changed those CASE expressions to ORed clauses just because they were easier for me to read. If I recall my SQL theory, ORing might make this query a little slower.

Thorin
interesting.. should it say VALUES instead of SELECT or am I reading this wrong? I will try that tomorrow and see if that helps
BabelFish
so by combining the two statements into 1 it avoids the lock? that doesn't resolve into two queries on the back-end?
BabelFish
I made a test database with a test table with the above fields to test this code before I posted. It does indeed work. It is an INSERT INTO .. SELECT. To see how it works, try just running the SELECT .. WHERE NOT EXISTS statement and you'll find that you get your variable values if the row does not exist, or no rows if the row _does_ exist. The two statements have become one statement. With two statements, the deadlock occurred because a statement from a different thread interjected between the two statements. With one statement, nothing else can interject.
Thorin
Oh, and Remus Rusanu's answer works awesome in SQL Server 2008. The MERGE statement is new to SQL Server 2008, I think. If you're still on 2005, my INSERT INTO .. SELECT .. WHERE NOT EXISTS() will do what you want. Remus is right in suggesting this code is split into two procedures, though, one for ByMachineId and one for ByMetrics.
Thorin