I've written a LINQ to SQL program that essentially performs an ETL task, and I've noticed many places where parallelization would improve its performance. However, I'm concerned about preventing uniqueness constraint violations when two threads perform the following task (pseudocode).

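// GetDatabase, KeyPredicate and BuildRecord are placeholders.
Record GetOrCreateRecord(string recordText)
{
    using (MyDataContext database = GetDatabase())
    {
        // SELECT: look for an existing row with the same key.
        Record existingRecord = database.MyTable.FirstOrDefault(KeyPredicate(recordText));
        if (existingRecord == null)
        {
            // INSERT: no row was found, so queue a new one.
            existingRecord = BuildRecord(recordText);
            database.MyTable.InsertOnSubmit(existingRecord);
        }

        database.SubmitChanges();
        return existingRecord;
    }
}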

In general, this code executes a SELECT statement to test for record existence, followed by an INSERT statement if the record doesn't exist. It is encapsulated by an implicit transaction.

When two threads run this code for the same value of recordText, I want to prevent them from simultaneously determining that the record doesn't exist and then both attempting to create the same record. An explicit transaction with a suitable isolation level should work, except I'm not certain which isolation level I should use. Serializable should work, but seems too strict. Is there a better choice?

A:

I use SQL similar to what is shown below to avoid such situations. UPDLOCK specifies that update locks are taken instead of shared locks and held until the transaction completes. HOLDLOCK is equivalent to SERIALIZABLE: it makes shared locks more restrictive by holding them until the transaction completes, instead of releasing them as soon as the required table or data page is no longer needed. The scan is therefore performed with the same semantics as a transaction running at the SERIALIZABLE isolation level, which means the key range is locked even when no matching row exists yet.

HOLDLOCK applies only to the table or view for which it is specified, and only for the duration of the transaction defined by the statement in which it is used. It cannot be used in a SELECT statement that includes the FOR BROWSE option.

Together, the two hints mean that a second session running the same statement for the same LocationID blocks on the existence check until the first session has finished, so only one of them inserts the row.

declare @LocationID          int
declare @LocationName        nvarchar (50)

/* fill in LocationID and LocationName appropriately */

INSERT dbo.Location
(LocationID, LocationName)
SELECT @LocationID, @LocationName
WHERE NOT EXISTS (
   SELECT L.*
   FROM dbo.Location L WITH (UPDLOCK, HOLDLOCK)
   WHERE L.LocationID = @LocationID)
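
Since the question uses LINQ to SQL, here is a minimal sketch of how a statement like the one above might be issued through DataContext.ExecuteCommand. The LocationWriter class and InsertIfMissing method are hypothetical names chosen for illustration, and the dbo.Location table is assumed to look as it does in the SQL above.

```csharp
using System.Data.Linq;

public static class LocationWriter
{
    // Hypothetical helper: name and signature are illustrative only.
    // Returns true if this call inserted the row, false if it already existed.
    public static bool InsertIfMissing(DataContext database, int locationId, string locationName)
    {
        // ExecuteCommand turns each {n} placeholder into a SQL parameter,
        // so the values are never concatenated into the command text.
        // The ID is passed twice to keep one placeholder per argument.
        const string sql = @"
INSERT dbo.Location (LocationID, LocationName)
SELECT {0}, {1}
WHERE NOT EXISTS (
    SELECT 1
    FROM dbo.Location L WITH (UPDLOCK, HOLDLOCK)
    WHERE L.LocationID = {2})";

        // ExecuteCommand returns the number of rows affected: 1 or 0 here.
        int rowsAffected = database.ExecuteCommand(sql, locationId, locationName, locationId);
        return rowsAffected == 1;
    }
}
```

Because the check and the insert are a single statement, they run in one transaction on the server, and the locks taken by the hints are held until that statement finishes.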

According to the answer to this question, Serializable seems to be the way to go.
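
If you would rather keep the select-then-insert logic in LINQ to SQL, a rough sketch of the Serializable approach could look like the following. The Location mapping class and the GetOrCreate helper are assumptions made for this example and are not part of the original code.

```csharp
using System.Data.Linq;
using System.Data.Linq.Mapping;
using System.Linq;
using System.Transactions;

// Assumed mapping for the dbo.Location table used in the SQL above.
[Table(Name = "dbo.Location")]
public class Location
{
    [Column(IsPrimaryKey = true)] public int LocationID;
    [Column] public string LocationName;
}

public static class LocationRepository
{
    // Hypothetical helper; the caller supplies the connection string.
    public static Location GetOrCreate(string connectionString, int id, string name)
    {
        var options = new TransactionOptions
        {
            IsolationLevel = IsolationLevel.Serializable
        };

        using (var scope = new TransactionScope(TransactionScopeOption.Required, options))
        using (var database = new DataContext(connectionString))
        {
            // Open the connection once so the SELECT and the INSERT share it
            // and both run inside the ambient transaction.
            database.Connection.Open();

            Table<Location> locations = database.GetTable<Location>();
            Location existing = locations.FirstOrDefault(l => l.LocationID == id);
            if (existing == null)
            {
                existing = new Location { LocationID = id, LocationName = name };
                locations.InsertOnSubmit(existing);
                database.SubmitChanges();
            }

            // Mark the transaction as successful; it commits when the scope is disposed.
            scope.Complete();
            return existing;
        }
    }
}
```

One caveat with this variant: two Serializable transactions can both take shared range locks during the SELECT and then deadlock when each tries to insert, in which case SQL Server aborts one of them as the deadlock victim. Callers should be prepared to retry. The UPDLOCK hint in the SQL above avoids this by making the second reader wait at the existence check.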

Jesse C. Slicer