I have a list of objects; it contains about 4 million of them. There is a stored proc that takes an object's attributes as params, makes some lookups, and inserts a row into the tables.

What's the most efficient way to insert these 4 million objects into the db?

How I do it now:

// connect to sql
using (var connection = new SqlConnection(connectionString))
{
    connection.Open();

    foreach (var item in listOfObjects)
    {
        // one round trip to the server per object
        var sc = new SqlCommand(storedProcName, connection);
        sc.CommandType = CommandType.StoredProcedure;

        // assign params
        // ...

        sc.ExecuteNonQuery();
    }
}

This has been really slow.

Is there a better way to do this?

This process will be a scheduled task. I will run it every hour, so I do expect high-volume data like this regularly.

+7  A: 

Take a look at the SqlBulkCopy Class

Based on your comment: dump the data into a staging table, then do the lookups and the insert into the real tables set-based from a proc... it will be much faster than row by row.
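
Roughly, a minimal sketch of that pattern. The staging table (dbo.StagingCustomer), its columns, and the processing proc (dbo.ProcessStagingCustomer) are made-up names; swap in your own:

using System.Data;
using System.Data.SqlClient;

// Build an in-memory table shaped like the staging table.
var table = new DataTable();
table.Columns.Add("FirstName", typeof(string));
table.Columns.Add("LastName", typeof(string));
foreach (var item in listOfObjects)
    table.Rows.Add(item.FirstName, item.LastName);

using (var connection = new SqlConnection(connectionString))
{
    connection.Open();

    // One bulk load instead of 4 million round trips.
    using (var bulk = new SqlBulkCopy(connection))
    {
        bulk.DestinationTableName = "dbo.StagingCustomer";  // hypothetical
        bulk.BatchSize = 10000;
        bulk.WriteToServer(table);
    }

    // Server-side, set-based lookups + insert into the real tables.
    var proc = new SqlCommand("dbo.ProcessStagingCustomer", connection)
    {
        CommandType = CommandType.StoredProcedure
    };
    proc.ExecuteNonQuery();
}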

SQLMenace
OK, just a quick note: the stored proc is doing some lookups, it's not doing a straight insert.
The normal way to fix that problem is to do your lookups prior to insert. It's not perfectly transactional, but if you're inserting 4 million records you really shouldn't be doing it in a way that requires transactional isolation if you can avoid it at all.
Mike Burton
+1: This is by far the best answer. I have used this technique on multiple occasions exactly as you have described even down to the staging table and stored procedure. It works quite well.
Brian Gideon
The staging table is better again, assuming the lookups aren't heavy enough to cause a problem while processing the transfer.
Mike Burton
This indeed worked nicely; I did the lookups after loading the lookup tables into memory. Thanks.
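
A minimal sketch of that in-memory lookup approach, assuming a small lookup table (dbo.Country here is a made-up name) that fits comfortably in memory:

using System.Collections.Generic;
using System.Data.SqlClient;

// Cache the lookup table once, then resolve ids in memory
// instead of per-row on the server.
var countryIds = new Dictionary<string, int>();
using (var connection = new SqlConnection(connectionString))
{
    connection.Open();
    var cmd = new SqlCommand("SELECT Name, CountryId FROM dbo.Country", connection);
    using (var reader = cmd.ExecuteReader())
    {
        while (reader.Read())
            countryIds[reader.GetString(0)] = reader.GetInt32(1);
    }
}

// Later, per object:
// item.CountryId = countryIds[item.CountryName];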
A: 

It's never going to be ideal to insert four million records from C#, but a better way to do it is to build the command text up in code so you can do it in chunks.

This is hardly bulletproof, and it doesn't illustrate how to incorporate lookups (as you've mentioned you need), but the basic idea is:

// You'd modify this to chunk it out - only testing can tell you the right
// number - perhaps 100 at a time (SQL Server allows at most 2,100
// parameters per command, so keep chunks well under that).

for (int i = 0; i < items.Length; i++)
{
    // e.g., 'insert dbo.Customer values(@firstName1, @lastName1);'
    string newStatement = string.Format(
        "insert dbo.Customer values(@firstName{0}, @lastName{0});", i);
    command.CommandText += newStatement;

    command.Parameters.AddWithValue("@firstName" + i, items[i].FirstName);
    command.Parameters.AddWithValue("@lastName" + i, items[i].LastName);
}
// ...
command.ExecuteNonQuery();
Jeff Sternal
@Jeff Sternal: sorry... my comment is meaningless when I look again. Will delete it.
gbn
@gbn - no harm done!
Jeff Sternal
+1  A: 

You might consider dropping any indexes you have on the table(s) you are inserting into and then recreating them after you have inserted everything. I'm not sure how the bulk copy class works, but if you are updating your indexes on every insert it can slow things down quite a bit.
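
A hedged sketch of that pattern; the index name, table, and index definition below are assumptions (the CREATE must match whatever you dropped), and BulkLoad stands in for your own loading logic:

using System.Data.SqlClient;

using (var connection = new SqlConnection(connectionString))
{
    connection.Open();

    // Drop the nonclustered index before the heavy load (hypothetical name).
    new SqlCommand("DROP INDEX IX_Customer_LastName ON dbo.Customer;", connection)
        .ExecuteNonQuery();

    BulkLoad(connection);  // your SqlBulkCopy / batched-insert logic

    // Recreate it once, after all rows are in.
    new SqlCommand(
        "CREATE INDEX IX_Customer_LastName ON dbo.Customer (LastName);",
        connection).ExecuteNonQuery();
}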

Abe Miessler
A: 
  1. Like Abe mentioned: drop indexes (and recreate them later).
  2. If you trust your data: generate a SQL statement for each call to the stored proc, combine several, and then execute.
    This saves you communication overhead.
  3. The combined calls (to the stored proc) can be wrapped in a BEGIN TRANSACTION so you have only one commit per x inserts (see the sketch below).

If this is a one-time operation: do not optimize, and just run it during the night / weekend.
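
A minimal sketch of point 3, assuming a chunk size of 1,000 and a hypothetical proc dbo.InsertItem with two params:

using System.Data;
using System.Data.SqlClient;
using System.Linq;

const int chunkSize = 1000;
using (var connection = new SqlConnection(connectionString))
{
    connection.Open();
    for (int start = 0; start < items.Count; start += chunkSize)
    {
        using (var tx = connection.BeginTransaction())
        {
            foreach (var item in items.Skip(start).Take(chunkSize))
            {
                var cmd = new SqlCommand("dbo.InsertItem", connection, tx)
                {
                    CommandType = CommandType.StoredProcedure
                };
                cmd.Parameters.AddWithValue("@firstName", item.FirstName);
                cmd.Parameters.AddWithValue("@lastName", item.LastName);
                cmd.ExecuteNonQuery();
            }
            tx.Commit();  // one commit per chunk instead of per row
        }
    }
}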

GvS
+1  A: 

I have had excellent results using XML to get large amounts of data into SQL Server. Like you, I was initially inserting rows one at a time, which took forever due to the round-trip time between the application and the server; then I switched the logic to pass in an XML string containing all the rows to insert. Time to insert went from 30 minutes to less than 5 seconds. This was for a couple of thousand rows. I have tested with XML strings up to 20 megabytes in size with no issues. Depending on your row size, this might be an option.

The data was passed in as an XML string using the ntext type.

Something like this formed the basic details of the stored procedure that did the work:

CREATE PROCEDURE XMLInsertPr ( @XmlString ntext )
AS
BEGIN
    DECLARE @ReturnStatus int, @hdoc int

    EXEC @ReturnStatus = sp_xml_preparedocument @hdoc OUTPUT, @XmlString
    IF (@ReturnStatus <> 0)
    BEGIN
        RAISERROR ('Unable to open XML document', 16, 1, 50003)
        RETURN @ReturnStatus
    END

    INSERT INTO TableName
    SELECT * FROM OPENXML(@hdoc, '/XMLData/Data') WITH TableName

    -- Release the memory used by the parsed document
    EXEC sp_xml_removedocument @hdoc
END
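
For completeness, a hedged sketch of the calling side in C#. The element names match the OPENXML path above; the attribute names (which OPENXML maps to columns by default) are assumptions:

using System.Data;
using System.Data.SqlClient;
using System.Security;
using System.Text;

// Build one XML payload for all rows.
var sb = new StringBuilder("<XMLData>");
foreach (var item in listOfObjects)
    sb.AppendFormat("<Data FirstName=\"{0}\" LastName=\"{1}\" />",
        SecurityElement.Escape(item.FirstName),
        SecurityElement.Escape(item.LastName));
sb.Append("</XMLData>");

using (var connection = new SqlConnection(connectionString))
{
    connection.Open();
    var cmd = new SqlCommand("XMLInsertPr", connection)
    {
        CommandType = CommandType.StoredProcedure
    };
    cmd.Parameters.Add("@XmlString", SqlDbType.NText).Value = sb.ToString();
    cmd.ExecuteNonQuery();
}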

John Dyer
But what about the lookups? I'm passing some ids that should be looked up from other tables; there are about 7 lookups.
Sometimes I insert the XML data into a temp table, then do any lookups needed and add the results to the temp table. Once the temp table is complete, it all gets inserted into the main table. This speeds up the insert into the main table and reduces locking issues. Alternatively, if the lookups are quick, some additional SQL logic on the insert statement can fetch them on the fly. It all depends on your comfort with SQL :)
John Dyer
The DBA will have a beef with me because of this :)