Our app needs to add large amounts of text to a SQL Server 2005 database (up to 1 GB for a single record). For performance reasons, this is done in chunks, by making a stored procedure call for each chunk (say, usp_AddChunk). usp_AddChunk does not use any explicit transactions.
What I'm seeing is that reducing the chunk size from 100 MB to 10 MB results in massively larger transaction logs. I've been told this is because each time usp_AddChunk is called, an "implicit" (my term) transaction logs all of the existing text. So, for a 150 MB record:
100 MB chunk size: 100 (0 bytes logged) + 50 (100 MB logged) = 100 MB logged
will be smaller than
10 MB chunk size: 10 (0 bytes logged) + 10 (10 MB logged) + 10 (20 MB logged) ... + 10 (140 MB logged) = 1050 MB logged
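To make that arithmetic concrete, here's a minimal Python sketch of the model I was given. It only encodes the assumption that every call re-logs all of the text already stored; it is not a description of how SQL Server actually logs, and the function name estimated_log_mb is just something I made up for illustration.

```python
def estimated_log_mb(record_mb: int, chunk_mb: int) -> int:
    """Estimate MB logged, ASSUMING each call re-logs all text stored so far."""
    logged = 0
    stored = 0
    while stored < record_mb:
        # Assumption: the existing text is logged again on every call.
        logged += stored
        # Append the next chunk (the last one may be smaller).
        stored += min(chunk_mb, record_mb - stored)
    return logged


print(estimated_log_mb(150, 100))  # 100
print(estimated_log_mb(150, 10))   # 1050
```

Under that assumption, the logged volume grows roughly with the square of the number of chunks, which matches the trend I'm seeing when I shrink the chunk size.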
I thought that by opening a transaction in my C# code (beginning it before I add the first chunk and committing it after the last chunk), this "implicit" transaction would not happen, and I could avoid the huge log files. But my tests show the transaction log growing 5x bigger when I use the ADO.NET transaction.
I won't post the code, but here are a few details:
- I call SqlConnection.BeginTransaction()
- I use a different SqlCommand for each chunk
- I assign the SqlTransaction returned by BeginTransaction() to each SqlCommand
- I usually close the connection after each SqlCommand execution, but I've also tried not closing the connection with the same results
What's the flaw in this scheme? Let me know if you need more info. Thanks!
Note: using the simple or bulk-logged recovery model is not an option.