views:

330

answers:

4

I need to insert millions of records, read from disk, into SQL Server. I am parsing them from a file on one machine in a single-threaded process.

Ideally it should perform well whether the SQL Server instance is local or remote. It has to be done programmatically in C#.

A: 

Try bulk copy. You can use the SqlBulkCopy class to write a little app that reads your data into memory and loads it into SQL Server.
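
A minimal sketch of that approach; the connection string, database, table name, and batch size here are all placeholder assumptions, not anything from the question:

```csharp
using System.Data;
using System.Data.SqlClient;

class BulkLoader
{
    // Loads an in-memory DataTable into SQL Server via SqlBulkCopy.
    static void Load(DataTable table)
    {
        using (var conn = new SqlConnection(
            "Data Source=.;Initial Catalog=MyDb;Integrated Security=SSPI"))
        {
            conn.Open();
            using (var bulk = new SqlBulkCopy(conn))
            {
                bulk.DestinationTableName = "dbo.MyTable"; // placeholder table
                bulk.BatchSize = 10000;   // rows sent to the server per batch
                bulk.BulkCopyTimeout = 0; // no timeout for very large loads
                bulk.WriteToServer(table);
            }
        }
    }
}
```

For millions of rows you would fill and flush the DataTable in chunks rather than holding everything in memory at once.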

CesarGon
A: 

The bcp utility is lightning fast and configurable. While it is a standalone utility, it runs from the command line and can certainly be invoked from C#.
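
A sketch of shelling out to bcp from C#; the server name, database, table, and data file path are placeholders you would substitute for your own:

```csharp
using System.Diagnostics;

class BcpRunner
{
    // Runs bcp to bulk-import a data file; returns its exit code (0 = success).
    static int RunBcp()
    {
        var psi = new ProcessStartInfo
        {
            FileName = "bcp",
            // -S server, -T trusted connection, -c character format, -b batch size
            Arguments = @"MyDb.dbo.MyTable in C:\data\records.dat -S myserver -T -c -b 10000",
            UseShellExecute = false,
            RedirectStandardOutput = true
        };
        using (var proc = Process.Start(psi))
        {
            proc.WaitForExit();
            return proc.ExitCode;
        }
    }
}
```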

Chris Clark
The main issue I see with this is that I need to read the data from disk, parse it, write it back out to a file, and then call bcp. Is there anything to go straight from memory to SQL?
esac
+6  A: 

The fastest way is SSIS with parallel reads, NUMA-affinitized clients, and partitioned writes, switching all the partitions into a single table at the end. This can load more than 2 TB per hour.

If you have a suitable text file, then the bulk copy utility is probably the way to go.

If you want to insert from your process, you can either use SqlBulkCopy.WriteToServer (but you have to present the data as an IDataReader), or you can use straight SqlCommand inserts. With the latter, if you batch your insert commits, you'll achieve good throughput. The usual bottleneck is the log flush on single-statement commits.
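
A sketch of the second option, batching plain INSERTs so the log is flushed once per transaction rather than once per row; the connection string, table, and column are placeholder assumptions:

```csharp
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

class BatchedInserts
{
    // Inserts rows with a parameterized SqlCommand, committing every 1,000
    // rows so log flushes happen per batch instead of per statement.
    static void Insert(IEnumerable<string> values)
    {
        using (var conn = new SqlConnection(
            "Data Source=.;Initial Catalog=MyDb;Integrated Security=SSPI"))
        {
            conn.Open();
            var tx = conn.BeginTransaction();
            var cmd = new SqlCommand(
                "INSERT INTO dbo.MyTable (Col1) VALUES (@v)", conn, tx);
            var p = cmd.Parameters.Add("@v", SqlDbType.NVarChar, 100);
            int n = 0;
            foreach (var v in values)
            {
                p.Value = v;
                cmd.ExecuteNonQuery();
                if (++n % 1000 == 0)      // commit a batch of 1,000 rows
                {
                    tx.Commit();
                    tx = conn.BeginTransaction();
                    cmd.Transaction = tx;
                }
            }
            tx.Commit();                  // commit the final partial batch
        }
    }
}
```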

Remus Rusanu
I am so glad I didn't try to answer this question... I'm sure it's not practical in his situation (my guess is he'll end up using bulk copy), but wow, quite the "fastest" way there...
LorenVS
It's true that it's not 'practical', but it is useful info as a reference to know what is possible. Once you see how the pros do it, you can find innovative ways to apply it to the task at hand.
Remus Rusanu
I got all excited just thinking about doing that... :D Awesome answer...!!!
KSimons
Follow-up question: any idea whether SqlBulkCopy.WriteToServer on a single thread is enough to hit a bottleneck, or would it be better to spread it across multiple threads?
esac
Bulk copy operations are mutually exclusive, as they acquire table-level locks. Multiple threads can only bulk copy into separate tables.
Remus Rusanu
A: 

It depends on your input file format: (1) if it is suited for bulk copy, use it; (2) if it is not suited for bulk copy, or you need extra processing/checks/etc. on the server side, then DO jump in with multi-threaded insert loops, commit every 1,000 rows or more, and possibly use array inserts.
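
A sketch of option (2) with multiple threads, each owning its own connection and committing in 1,000-row batches; the connection string, table, row queue, and degree of parallelism are placeholder assumptions (and Task.Run assumes a modern .NET):

```csharp
using System.Collections.Concurrent;
using System.Data;
using System.Data.SqlClient;
using System.Threading.Tasks;

class ParallelInserts
{
    // Drains a shared queue of rows with N worker threads, each running its
    // own batched insert loop on a private connection.
    static void Load(BlockingCollection<string> rows, int degree)
    {
        var workers = new Task[degree];
        for (int i = 0; i < degree; i++)
        {
            workers[i] = Task.Run(() =>
            {
                using (var conn = new SqlConnection(
                    "Data Source=.;Initial Catalog=MyDb;Integrated Security=SSPI"))
                {
                    conn.Open();
                    var tx = conn.BeginTransaction();
                    var cmd = new SqlCommand(
                        "INSERT INTO dbo.MyTable (Col1) VALUES (@v)", conn, tx);
                    var p = cmd.Parameters.Add("@v", SqlDbType.NVarChar, 100);
                    int n = 0;
                    foreach (var v in rows.GetConsumingEnumerable())
                    {
                        p.Value = v;
                        cmd.ExecuteNonQuery();
                        if (++n % 1000 == 0)  // commit every 1,000 rows
                        {
                            tx.Commit();
                            tx = conn.BeginTransaction();
                            cmd.Transaction = tx;
                        }
                    }
                    tx.Commit();
                }
            });
        }
        Task.WaitAll(workers);
    }
}
```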

Antibarbie