views:

1301

answers:

5
+2  Q: 

SQLServer deadlock

Hi,

I have a java application which is doing multiple concurrent CRUD operations on a database. I'm adding support for SQLServer but am having problems with deadlocking during concurrent deletes. After some investigation it appeared that the problem may be due to lock escalation on a particular table.

In an attempt to fix it, I decided to make all reads on the table in question be done "for update" using the UPDLOCK hint so that the deadlock could be avoided. However, I'm still seeing the problem. I've enabled tracing in SQLServer and have found the following deadlock trace in the SQLServer logs:

Deadlock encountered .... Printing deadlock information Wait-for graph

Node:1 KEY: 5:72057594042384384 (54048e7b3828) CleanCnt:3 Mode:X Flags: 0x0 Grant List 1: Owner:0x03D08C40 Mode: X Flg:0x0 Ref:0 Life:02000000 SPID:62 ECID:0 XactLockInfo: 0x04834274 SPID: 62 ECID: 0 Statement Type: DELETE Line #: 1 Input Buf: Language Event: (@P0 nvarchar(4000))delete from part_data where part_id = @P0 Requested By: ResType:LockOwner Stype:'OR'Xdes:0x04B511C8 Mode: U SPID:60 BatchID:0 ECID:0 TaskProxy:(0x058BE378) Value:0x3d08500 Cost:(0/1296)

Node:2

KEY: 5:72057594042384384 (f903d6d6e0ac) CleanCnt:2 Mode:X Flags: 0x0 Grant List 0: Owner:0x03D088A0 Mode: X Flg:0x0 Ref:0 Life:02000000 SPID:60 ECID:0 XactLockInfo: 0x04B511EC SPID: 60 ECID: 0 Statement Type: DELETE Line #: 1 Input Buf: Language Event: (@P0 nvarchar(4000))delete from part_data where part_id = @P0 Requested By: ResType:LockOwner Stype:'OR'Xdes:0x04834250 Mode: U SPID:62 BatchID:0 ECID:0 TaskProxy:(0x047BA378) Value:0x3d089e0 Cost:(0/4588)

Victim Resource Owner: ResType:LockOwner Stype:'OR'Xdes:0x04B511C8 Mode: U SPID:60 BatchID:0 ECID:0 TaskProxy:(0x058BE378) Value:0x3d08500 Cost:(0/1296)

SQLServer profiler shows this as two clients holding update (U) locks and attempting to escalate to exclusive (X) locks. The SQLServer docs I have read say that only one client can have a (U) lock on a table at a given time, so I'm wondering why I'm seeing the situation shown in the trace.

The database object refered to in that trace is an index on a foreign key. If anyone with experience of fixing this sort of problem could offer advice it would be a big help.

Thanks, Brad.

EDIT added deadlock graph xml as requested:

<deadlock-list>
 <deadlock victim="process989018">
  <process-list>
   <process id="process6aa7a8" taskpriority="0" logused="4844" waitresource="KEY: 5:72057594042384384 (5504bdfb7529)" waittime="9859" ownerId="613553" transactionname="implicit_transaction" lasttranstarted="2009-05-08T11:52:39.137" XDES="0x5fcbc30" lockMode="U" schedulerid="1" kpid="3516" status="suspended" spid="59" sbid="0" ecid="0" priority="0" transcount="2" lastbatchstarted="2009-05-08T11:52:39.183" lastbatchcompleted="2009-05-08T11:52:39.183" clientapp="jTDS" hostname="LOIRE" hostpid="123" loginname="sa" isolationlevel="read committed (2)" xactid="613553" currentdb="5" lockTimeout="4294967295" clientoption1="671088672" clientoption2="128058">
    <executionStack>
     <frame procname="adhoc" line="1" stmtstart="40" sqlhandle="0x0200000007c76c39efdd8317c6fa7b611b4fd958f05cfcf4">
delete from part_data where part_id =  @P0     </frame>
    </executionStack>
    <inputbuf>(@P0 nvarchar(4000))delete from part_data where part_id = @P0</inputbuf>
   </process>
   <process id="process989018" taskpriority="0" logused="1528" waitresource="KEY: 5:72057594042384384 (5e0405cb0377)" waittime="1250" ownerId="613558" transactionname="implicit_transaction" lasttranstarted="2009-05-08T11:52:39.183" XDES="0x48318f0" lockMode="U" schedulerid="2" kpid="2692" status="suspended" spid="60" sbid="0" ecid="0" priority="0" transcount="2" lastbatchstarted="2009-05-08T11:52:39.183" lastbatchcompleted="2009-05-08T11:52:39.183" clientapp="jTDS" hostname="LOIRE" hostpid="123" loginname="sa" isolationlevel="read committed (2)" xactid="613558" currentdb="5" lockTimeout="4294967295" clientoption1="671088672" clientoption2="128058">
    <executionStack>
     <frame procname="adhoc" line="1" stmtstart="40" sqlhandle="0x0200000007c76c39efdd8317c6fa7b611b4fd958f05cfcf4">
delete from part_data where part_id =  @P0     </frame>
    </executionStack>
    <inputbuf>(@P0 nvarchar(4000))delete from part_data where part_id =  @P0</inputbuf>
   </process>
  </process-list>
  <resource-list>
   <keylock hobtid="72057594042384384" dbid="5" objectname="MESSAGESTOREDB61.dbo.part_data" indexname="idx_part_data_part_id" id="lock3cab740" mode="X" associatedObjectId="72057594042384384">
    <owner-list>
     <owner id="process6aa7a8" mode="X"/>
    </owner-list>
    <waiter-list>
     <waiter id="process989018" mode="U" requestType="wait"/>
    </waiter-list>
   </keylock>
   <keylock hobtid="72057594042384384" dbid="5" objectname="MESSAGESTOREDB61.dbo.part_data" indexname="idx_part_data_part_id" id="lock3cad340" mode="X" associatedObjectId="72057594042384384">
    <owner-list>
     <owner id="process989018" mode="X"/>
    </owner-list>
    <waiter-list>
     <waiter id="process6aa7a8" mode="U" requestType="wait"/>
    </waiter-list>
   </keylock>
  </resource-list>
 </deadlock>
</deadlock-list>
A: 

I'm assuming that you ran something like: DBCC TRACEON(1222,-1)" and/or "DBCC TRACEON(1204 ,-1). I find the deadlock trace in the SQLServer logs difficult to read through. Are you sure that it is an attempt to escalate to exclusive (X) locks?

Try running the run a trace in the profiler (pick the blank template), select the "deadlock graph event", and on the new tab that appears (Events Extraction Settings), save each (check "save deadlock XML events separately") in its own file. Open this file in an xml viewer and it will be easy to tell what is happening. Each process is contained, with an execution stack of procedure/trigger/etc calls, etc. and all locks are in there too. I find it hard to believe that the two DELETEs where col=@value are causing the problem, look at the execution stack, is there a trigger or something else happening?

Look at the "resource list" section of the file, it will show what is being locked and held by each process causing the deadlock. Figure out how to not have one of these locked and the deadlock will be resolved.

KM
Hi, thanks for commenting. I have used the profiler previously but I can see now I was interpreting the output incorrectly. From the XML resource list it appears the two clients already have (X) locks and are in fact requesting (U) locks.From the execution stack, the only things showing are the two delete statements.
Brad
how may deletes are you trying to do in each process? multiple or just the single one for each process? If you are doing multiple deletes in a loop (with a transaction) in each process, then you may have locking issues because the locks are not on rows but on pages of multiple rows. please edit your question and include the deadlock graph xml so I can better see what you are seeing.
KM
The transaction is performing a delete on a graph of objects which go together to make up a logical unit of data. The table that the delete is failing on corresponds to the bottom-most object in this graph. Its likely that there are more than one of these objects but a delete is only done once at this level as it is a "delete where foreign key = ?".I'm still curious as to why the deadlock is happening on the index rather than the table itself.
Brad
are you in a loop deleting one row at a time?
KM
one row in which table? If you mean the table which is getting the deadlock then no. As per my last comment, its a "delete where" matching on the foreign key column.
Brad
Although, I will add that these deletes are done in a loop. As in its looping through keys from the parent table and deleteing all child entries for the parent key. Perhaps that's the cause?
Brad
try to delete all children at one time (one command) then delete the parent in another command.
KM
A: 

deadlocks on SQLServer almost always originate from the fact that a single thread tries to write and read using two connections and therefore two transactions. If you want to make this work, do all operations in a single thread using ONE connection and make absolutely sure you're indeed re-using the connection. It might be due to the layering in your application you're accidently using a different connection for reading during a transaction which makes that read wait for the other code (using another connection) to complete which never happens as the read never finishes.

Example (pseudo steps)

start trans

  • write data (e.g. INSERT, UPDATE) to table Foo
  • read some data (which triggers table scan) from Foo using different connection DEADLOCK, as read never finishes, so transactio is never committed
Frans Bouma
Frans, that's an interesting point. The JDBC stuff is done through Spring JDBC templating so I'm not actually handling any connections myself, but it is possible that a read is done inside a transaction next to an update or delete. I'm going to go back and look for the kind of situation you described
Brad
After some further investigation I've noticed that one of the suspect transactions does have an insert mixed in with the deletes. My application uses four service classes for database access which group the operations as reads, updates, inserts and deletes.I guess the problem could be that the insert uses a different conection to the deletes as it is in a seperate JDBC template class. I thought Spring transaction management would force the operations to use a single connection, perhaps I'm wrong?
Brad
I've never worked with Spring so I have no idea if it re-uses a transaction in progress, I could imagine it would but also I also could imagine it wouldn't (if you for example connect to a different db, it HAS TO create a new one)
Frans Bouma
+1  A: 

Welcome to awful.

The last time I ran into a situation like this it was because an update or a delete was unable to find a good index to help it isolate the rows it was affecting. This caused an erratic lock escalation as it was using a non-covering index to modify locate the records.

So if you can isolate some of the queries, check the sql for them and see if you cannot experiment with providing some covering indexes.

A covering index is an index that includes all the fields in the specific where clause.

Nathan Feger
A: 

...I decided to make all reads on the table in question be done "for update" using the UPDLOCK hint so that the deadlock could be avoided...

...Yes there is a transaction declared to ensure the whole object graph is deleted or left intact should there be an error. There are no other database operations in the transaction, some reads are done before the transaction is started but none inside it...

I'm not sure I understand the reasoning behind escalating the read locks outside the transaction, I would think that this would only exacerbate the problem. My experience has been that specifying locking hints preemptively causes more harm than good.

That being said...

It sounds like you are trying to delete multiple rows within each transaction and that the deadlocking is occurring because SQL is escalating the lock level on the table for each. Remember that in SQL Server Row, Key-Range and Page locks all escalate immediately to Table Locks. If you have multiple transactions deleting multiple rows they're going to attempt to escalate and you're going to get deadlocks.

This will kill performance if you have many users but try specifying TABLOCK hints on your delete statements and see if the problem goes away.

Also take a look at the execution plans for your SQL Statements in see how the lock escalations are occurring.

CptSkippy
A: 

First do not use hints, usually SQL Server is better left on his own.

Secondo verify that you do not have a REAL deadlock (but it happens much less often of the deadlock appearance can hint).

Third, as someone suggested, check if you've some slow query and tune it.

In my experience every single time a customer has reported a deadlock situation this happened because there was a slow running query that escalates and never a real deadlock and tuning the query or adding a specific index always solved the problem.

I do not see which version of SQL Server you're talking about but each following version has a better deadlock management than the previous one and SQL Server 2000 was particularly nasty in this topic.

Regards
Massimo

massimogentilini
Hi Massimo, thanks for your answer. Bit busy at the moment but I will revisit this soon and get back to you.
Brad