views:

2098

answers:

3

What is the recommended batch size for SqlBulkCopy? I'm looking for a general formula I can use as a starting point for performance tuning.

+1  A: 

This all depends on your implementation.

What kind of speed can you expect on your network? Are you using it in Forms or ASP.Net? Do you need to alert the user of progress? What is the size of the total job?

In my experience running bulk copy without a batch size specified will cause timeout issues. I Like to start with something like 1000 records and do some adjustments from there.

Jeremy
Speed: Varies, WebForms: Yes, ASP.NET: Yes, Wide Tables: Yes, Narrow tables, Yes. Thousands of rows: yes. Millions of rows: yes. If you can think of a scenario, I'm probably doing it.
Jonathan Allen
I have to stick by my previous answer then. I don't think there's a silver bullet.
Jeremy
+3  A: 

This is an issue I have also spent some time looking into. I am looking to optimize importing large CSV files (16+ GB, 65+ million records, and growing) into a SQL Server 2005 database using a C# console application (.Net 2.0). As Jeremy has already pointed out, you will need to do some fine-tuning for your particular circumstances, but I would recommend you have an initial batch size of 500, and test values both above and below this.

I got the recommendation to test values between 100 and 1000 for batch size from this MSDN forum post, and was skeptical. But when I tested for batch sizes between 100 and 10,000, I found that 500 was the optimal value for my application. The 500 value for SqlBulkCopy.BatchSize is also recommended here.

To further optimize your SqlBulkCopy operation, check out this MSDN advice; I find that using SqlBulkCopyOptions.TableLock helps to reduce loading time.

MagicAndi
+4  A: 

I have an import utility sitting on the same physical server as my SQL Server instance. Using a custom IDataReader, it parses flat files and inserts them into a database using SQLBulkCopy. A typical file has about 6M qualified rows, averaging 5 columns of decimal and short text, about 30 bytes per row.

Given this scenario, I found a batch size of 5000 to be the best compromise of speed and memory consumption. I started with 500 and experimented with larger. I found 5000 to be 2.5x faster, on average, than 500. Inserting the 6 million rows takes about 30 seconds with a batch size of 5000 and about 80 seconds with batch size of 500.

10000 was not measurably faster. Moving up to 50000 improved the speed by a few percentage points but it's not worth the increased load on the server. Above 50000 showed no improvements in speed.

This isn't a formula, but it's another data point for you to use.

Alric