I need to count sequential timeout exceptions from SqlBulkCopy. To test this, I use an external app to start a transaction & lock up the target table.

Only on the first call does SqlBulkCopy throw a timeout exception when expected. We've tried using an external connection & transaction, as well as using a connection string and internal transaction. With the external connection & transaction, the infinite wait was never in opening the connection or beginning or committing the transaction, but always at .WriteToServer().

Is there some approach to this whereby SqlBulkCopy.WriteToServer() will reliably throw a timeout exception when it has reached its .BulkCopyTimeout limit?

public void BulkCopy(string connectionString, DataTable table, int bulkTimeout)
{
    using (SqlBulkCopy bulkCopy = new SqlBulkCopy(
        connectionString,
        SqlBulkCopyOptions.UseInternalTransaction))
    {
        bulkCopy.BulkCopyTimeout = bulkTimeout; // e.g. 120 sec.
        // ... fill with data, map columns ...
        bulkCopy.WriteToServer(table);
        // ^^^^ waits indefinitely; doesn't throw until *after*
        //      the lock is released.
    }
}

I prefer to let exceptions bubble up rather than handle them in the scope of the using block, but I can always rethrow. Thanks much for any insight.

Update 1:

Still no resolution. Interesting behavior discovered though -- a normal SqlCommand will throw a TimeoutException as expected during the same lock that makes the SqlBulkCopy.WriteToServer method hang indefinitely.
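The probe behavior described above can be sketched as follows; the query, table name, and helper method are illustrative, not from the original post. A plain SqlCommand with a short CommandTimeout throws a SqlException (error number -2, "Timeout expired") under the same lock that makes WriteToServer hang:

```csharp
using System.Data.SqlClient;

// Hypothetical probe: returns true if a trivial query against the target
// table completes within timeoutSeconds, false if it times out.
static bool TableIsReachable(string connectionString, int timeoutSeconds)
{
    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand("SELECT TOP 1 1 FROM TargetTable", conn))
    {
        cmd.CommandTimeout = timeoutSeconds;
        conn.Open();
        try
        {
            cmd.ExecuteScalar();
            return true;
        }
        catch (SqlException ex)
        {
            if (ex.Number == -2) return false; // -2 = timeout expired
            throw;
        }
    }
}
```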

Here are approaches that we've tried -- and that have failed -- to get SqlBulkCopy.WriteToServer to consistently throw timeouts when expected:

  • MARS (Multiple Active Result Sets) on/off
  • TableLock on vs. off
  • Destination as heap table vs. indexed table
  • Longer/shorter BulkTimeout values (10 seconds to 5 minutes)
  • Internal vs external transaction

For now, as a workaround, I'm alternating between (a) putting the WriteToServer call in an asynchronous wrapper so I can time it myself, and (b) only calling WriteToServer once; after a timeout, waiting until a regular SqlCommand succeeds before trying WriteToServer again. Using these approaches, I at least stay in control of the execution flow.
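Workaround (a) can be sketched as a generic timing wrapper, assuming it is acceptable to leave the blocked call running on a background thread (SqlBulkCopy offers no safe way to cancel it); the helper name is made up for illustration:

```csharp
using System;
using System.Threading;

static class TimeoutRunner
{
    // Run an action on a worker thread and stop waiting after `timeout`.
    // The action itself may keep running (it can't be safely aborted),
    // but the caller regains control and can count the timeout.
    public static bool RunWithTimeout(Action action, TimeSpan timeout, out Exception error)
    {
        Exception captured = null;
        var worker = new Thread(() =>
        {
            try { action(); }
            catch (Exception ex) { captured = ex; }
        });
        worker.IsBackground = true; // don't keep the process alive
        worker.Start();
        bool finished = worker.Join(timeout);
        error = captured;
        return finished; // false => treat as a timeout
    }
}
```

Usage against the method from the question would look something like: call `TimeoutRunner.RunWithTimeout(() => BulkCopy(connStr, table, 120), TimeSpan.FromSeconds(130), out err)` and increment a timeout counter whenever it returns false.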

+1  A: 

Have you tried passing in the SqlBulkCopyOptions.TableLock option to SqlBulkCopy? According to the documentation, that option will:

Obtain a bulk update lock for the duration of the bulk copy operation.

So, if another process has locked the table, the bulk update lock can't be acquired and, in theory, the operation should reliably time out.
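For reference, a sketch of the question's method with that option added (nothing else changed):

```csharp
using (SqlBulkCopy bulkCopy = new SqlBulkCopy(
    connectionString,
    SqlBulkCopyOptions.TableLock | SqlBulkCopyOptions.UseInternalTransaction))
{
    bulkCopy.BulkCopyTimeout = bulkTimeout;
    // ... fill with data, map columns ...
    bulkCopy.WriteToServer(table); // requests a bulk update (BU) lock
}
```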

Update:
I set up my own test harness and can't reproduce. To lock the table, I started a transaction in SSMS doing a SELECT * FROM TargetTable WITH (HOLDLOCK). I used the same BulkCopy method you included in the question, using internal transactions, with a bulk load timeout of 30 seconds. Each attempt to do the bulk copy times out as expected after 30 seconds. It then succeeds when I rollback the SSMS transaction.
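For anyone reproducing this, a rough C# equivalent of that SSMS lock (connection string and table name are assumptions, not from the post):

```csharp
using System;
using System.Data.SqlClient;

// Hypothetical lock harness: HOLDLOCK keeps the shared lock on
// TargetTable for the duration of the open transaction.
static void HoldTableLock(string connectionString)
{
    using (var conn = new SqlConnection(connectionString))
    {
        conn.Open();
        using (var tran = conn.BeginTransaction())
        using (var cmd = new SqlCommand(
            "SELECT * FROM TargetTable WITH (HOLDLOCK)", conn, tran))
        {
            cmd.ExecuteNonQuery(); // lock is now held by this transaction
            Console.WriteLine("Table locked; press Enter to rollback.");
            Console.ReadLine();
            tran.Rollback(); // releases the lock
        }
    }
}
```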

I was using SQL Server 2008 Express, .NET 3.5.

Could it be that after the first attempt the bulk load timeout isn't being passed in correctly? i.e. it's somehow being set to "indefinite"?

Update 2:
Also switched on Multiple Active Result Sets support in the connection string, still consistently times out for me each time.

AdaTheDev
@AdaTheDev: Logical suggestion, thanks. I just tried it, plus added a `try{} catch{}` block around `WriteToServer()` which just calls `bulkCopy.Close()` then rethrows. (That shouldn't be necessary since the SqlBulkCopy is in a `using{}` block, but I'm getting desperate.) Unfortunately, `WriteToServer()` still hangs indefinitely the 2nd time through.
Paul Smith
Hmm, can't think of anything else at the moment tbh - it has me puzzled. I'll try to knock up a test harness myself to reproduce it.
AdaTheDev
PS - .NET 3.5, SQL Server 2008 Enterprise.
Paul Smith
Have tried with Multiple Active Result Sets both on and off; similar behavior. With MARS off, I sometimes get two timely timeouts instead of one. Win 7 x64 and Win Server 2008 x64. Thanks for your insight; this is knocking me for a loop.
Paul Smith
No worries - the only other thing I can suggest is perhaps it's a bug. I'm stumped!
AdaTheDev
@AdaTheDev - May have found a lead on this. Out of curiosity, did you have a PK and/or indexes on the target table when you were trying to reproduce this? I'm still digesting the scenario, but http://social.msdn.microsoft.com/Forums/en-US/sqldatabaseengine/thread/932cd26c-53fc-49c0-b082-e7f5f05a9801 has a discussion suggesting a thread-blocking issue could be at work when bulk copying into clustered or indexed tables. I'll post back here if I can confirm or disconfirm that. Thanks again for your work on this.
Paul Smith
@Paul - no, I didn't have a PK or index. If possible, it's best not to have any indexes on the table while bulk loading, to maximise performance. I'm usually loading into new tables, where it's fine not to have indexes initially, and then create them after the load has completed. Let me know how it goes!
AdaTheDev
@AdaTheDev - okay, great, that's progress. Thanks! I hate it when I can't find what makes something work on Setup A and not on B. There are weird IO issues preventing us from de-indexing this table (that sounds counterintuitive, I know), but this gives me an avenue to pursue. Will let you know what I find out. Thanks again.
Paul Smith
I've tried with a PK and other indexes on the table and still can't reproduce. That link suggests a problem when multiple bulk copies are being done to the same table at the same time - is that what you're doing? All my tests have been 1 process, loading into a table. No others to cause a conflict.
AdaTheDev
Almost; to test, I'm doing a bulk copy in Process A, and a `BEGIN TRAN` plus an `UPDATE` with no WHERE clause in Process B. After the first attempt, Process A doesn't throw the timeout exception until I manually commit or rollback in Process B, even if it's an hour later.
Paul Smith
A: 

I had this problem too; to resolve it, I set BulkCopyTimeout to zero.

bulkCopy.BulkCopyTimeout = 0;
masoud ramezani
In this case, the issue is that the bulk copy *isn't* timing out. I want it to throw so I can catch and count the timeout exceptions, then evaluate whether there have been enough of them to indicate a serious issue.
Paul Smith