views:

443

answers:

1

Hi folks,

I've got the following deadlock graph that describes two sql statements that are deadlocking each other. I'm just not sure how to analyse this problem and then fix up my sql code to prevent this from happening.

Main deadlock graph

alt text Click here for a bigger image.

Left side, details

alt text Click here for a bigger image.

Right side, details

alt text Click here for a bigger image.

Raw Deadlock Schema xml file

Click here to download the xml file.

Table schema

alt text

LogEntries Table details

alt text

Connected Clients Table details

alt text

What is the code doing?

I'm reading in a number of files (eg. lets say 3, for this example) at the same time. Each file contains different data BUT the same type of data. I then insert the data into LogEntries table and then (if required) I insert or delete something from the ConnectedClients table.

Here's my sql code.

using (TransactionScope transactionScope = new TransactionScope())
{
    _logEntryRepository.InsertOrUpdate(logEntry);

    // Now, if this log entry was a NewConnection or an LostConnection, then we need to make sure we update the ConnectedClients.
    if (logEntry.EventType == EventType.NewConnection)
    {
        _connectedClientRepository.Insert(new ConnectedClient { LogEntryId = logEntry.LogEntryId });
     }

    // A (PB) BanKick does _NOT_ register a lost connection .. so we need to make sure we handle those scenario's as a LostConnection.
    if (logEntry.EventType == EventType.LostConnection ||
        logEntry.EventType == EventType.BanKick)
    {
        _connectedClientRepository.Delete(logEntry.ClientName, logEntry.ClientIpAndPort);
    }

    _unitOfWork.Commit();
    transactionScope.Complete();
}

Now each file has it's own UnitOfWork instance (which means it has it's own database connection, transaction and repository context). So i'm assuming this means there's 3 different connections to the db all happening at the same time.

Finally, this is using Entity Framework as the repository, but please don't let that stop you from having a think about this problem.

Using a profiling tool, the Isolation Level is Serializable. I've also tried ReadCommited and ReadUncommited, but they both error :-

  • ReadCommited: same as above. Deadlock.
  • ReadUncommited: different error. EF exception that says it expected some result back, but got nothing. I'm guessing this is the LogEntryId Identity (scope_identity) value that is expected but not retrieve because of the dirty read.

Please help!

PS. It's Sql Server 2008, btw.


Update #2

After reading Remus Rusanu's updated reply, I felt that I could try and provide a bit more information to see if someone else can further help.

EF Diagram

alt text

Now, Remus suggests (and note, he does say he's unfamiliar with EF)...

The last piece of the puzzle, the unexplained lock left node has on the PK_ConnectedClients, I will assume is from the EF implementation of InsertOrUpdate. It probably does a lookup first, and because of the FK relationship declared between ConnectedClients and LogEntries, it seeks on PK_ConnectedClients, hence acquiring the serializable lock.

Interesting. I'm not sure why the left node has a lock on PK_ConnectedClients, as suggested above. Ok, lets check out the code for that method....

public void InsertOrUpdate(LogEntry logEntry)
{
    LoggingService.Debug("About to InsertOrUpdate a logEntry");

    logEntry.ThrowIfArgumentIsNull("logEntry");

    if (logEntry.LogEntryId <= 0)
    {
        LoggingService.Debug("Current logEntry instance doesn't have an Id. Instance object will be 'AddObject'.");
        Context.LogEntries.AddObject(logEntry);
    }
    else
    {
        LoggingService.Debug("Current logEntry instance has an Id. Instance object will be 'Attached'.");
        Context.LogEntries.Attach(logEntry);
    }
}

Hmm. it's a simple AddObject (aka. Insert) or Attach (aka. Update). No references. Sql code also doesn't hint of any lookup stuff.

Ok then ... i do have two other methods ... maybe they are doing some lookups?

In ConnectedClientRepository ...

public void Insert(ConnectedClient connectedClient)
{
    connectedClient.ThrowIfArgumentIsNull("connectedClient");

    Context.ConnectedClients.AddObject(connectedClient);
}

Nope -> also basic, as.

Lucky last method? Wow .. now this is interesting....

public void Delete(string clientName, string clientIpAndPort)
{
    clientName.ThrowIfArgumentIsNullOrEmpty("clientName");
    clientIpAndPort.ThrowIfArgumentIsNullOrEmpty("clientIpAndPort");

    // First we need to attach this object to the object manager.
    var existingConnectedClient = (from x in GetConnectedClients()
                                   where x.LogEntry.ClientName == clientName.Trim() &&
                                   x.LogEntry.ClientIpAndPort == clientIpAndPort.Trim() &&
                                   x.LogEntry.EventTypeId == (byte)EventType.NewConnection
                                   select x)
                                  .Take(1)
                                  .SingleOrDefault();

    if (existingConnectedClient != null)
    {
        Context.ConnectedClients.DeleteObject(existingConnectedClient);
    }
}

So, looking above, i grab an instance of the record i wish to delete .. and if it exists, then delete it.

So .. if i comment out that method call, in my initial logic way up to the top of this SO post ... what happens?

it works. WOWZ.

It also works as either Serializable or Read Commited - both work when i don't call the Delete method.

So why would that delete method be getting a lock? Is it becuase the select (with serializable) does a lock and some deadlock happens?

With read committed, is it possible that i have 3 calls to the delete happen at the same time.

  • 1st grabs an instance of the data.
  • 2nd (and 3rd) grabs another instance of the same data.
  • Now, 1st delete's. fine.
  • 2nd deletes .. but the row has gone .. so therefore i get that weird error about affected an unexpected number of rows (0). <== zero items deleted.

Possible? If so .. er ... how can I fix this? Is this a classic case of a race condition? Is it possible to prevent this from happeneing somehow?


Updates

  • Fixed the links to the images.
  • Link to the raw XML deadlock file. Here is the same link.
  • Added database table schema.
  • Added both table details.
+3  A: 

The left side node is holding an RangeS-U lock on PK_CustomerRecords and wants an RangeS-U lock on i1 (I assume its an index on LogEntries). The right side node has an RangeS-U lock on i1 and wants an RangeI-N on PK_CustomerRecords.

Apparently the deadlock occurs between the _logEntriesRepository.InsertOrUpdate (left node) and _connectedClientRepository.Insert (right node). Without knowing the type of EF relations declared, I can't comment on why does the left side node have a lock on PK_CustomerRecords at the moment it inserts the LogEntry. I suspect this is caused by either a ORM type behavior induced by EF, like lookup of a 'preloaded' member, or it may be caused by a higher level TransactionScope that surrounds the scope in the code snipped posted.

As other have said, it is necessary to post the database schema in a deadlock evaluation, because the access path (indexes used) is critical. See my article Read-Write deadlock for a more detailed discussion on the implication of indexes in deadlocks.

My first recommendation would be to force the transaction scope to be read committed. The default serializable level of TransactionScopes is almost never needed in practice, is a performance hog, and in this particular case ads a lot of unnecessary noise to the deadlock investigation by bringing the range locks into the equation, complicating everything. Please post the deadlock information that occurs under read committed.

Also, don't post a picture of the deadlock graph. A picture tells a thousand words is not true here, post the original deadlock XML: it has a lot of information not visible in the pretty pictures.

Update

From the deadlock XML I can see that the left node is executing insert [dbo].[LogEntries]([GameFileId], [CreatedOn], [EventTypeId], [Message], [Code], [Violation], [ClientName], [ClientGuid], [ClientIpAndPort]) values (@0, @1, @2, null, null, null, @3, @4, @5) (the <executionStack><frame> element). But more importantly, I can see the object behind the misterious index 'i1': objectname="AWing.sys.fulltext_index_docidstatus_1755869322" indexname="i1". So the deadlock occurs on a fulltext index.

So the full explanation of deadlock is:

  • right node is at _connectedClientRepository.Insert, it needs an range insert lock on PK_ConnectedClients. It has an RangeS-U lock on the fulltext index i1 from previously executing _logEntryRepository.InsertOrUpdate.
  • left node is at the _logEntryRepository.InsertOrUpdate, at the INSERT statement inside the batch, and it needs a RangeS-U lock on the fulltext index i1. It has a RangeS-S lock on PK_ConnectedClients that blocks the right node, and this is not explained by anything in the graph XML.

The last piece of the puzzle, the unexplained lock left node has on the PK_ConnectedClients, I will assume is from the EF implementation of InsertOrUpdate. It probably does a lookup first, and because of the FK relationship declared between ConnectedClients and LogEntries, it seeks on PK_ConnectedClients, hence acquiring the serializable lock.

The primary culprit here are the transaction isolation level (Serializable) and the EF behavior on InsertOrUpdate. I can't give advice on the EF behavior, but the serializable level is overkill for sure. Which takes us back to the error you get under read committed level, which, unfortunately, is again an EF error I can't comment on.

Remus Rusanu
First up, I've added the link to the RAW xml deadlock data, in the Opening Post (down the bottom). I'll try and get the schema, indexes and try the new scope.
Pure.Krome
Just tried to use ReadCommited. I get the following error during the import process for one of the files: `"Store update, insert, or delete statement affected an unexpected number of rows (0). Entities may have been modified or deleted since entities were loaded. Refresh ObjectStateManager entries."`
Pure.Krome
@Remus Runsanu : wow. awesome dude :) I've now updated my initial post with some more info. Please can you have a look at Update #2 in the initial post of this question? I think we're getting closer now :)
Pure.Krome
@Remus Runsanu : heya .. can I bump this please to grab your attention?
Pure.Krome
Sry Krome, but this goes too deep into EF behavior for my zone of comfort, from here on I'd be more speculating than actually answering.
Remus Rusanu
Cheers. np at all :)
Pure.Krome