My company recently had a problem where we needed to update 64,000+ rows of data fairly regularly, with each row being updated with unique values.

I have come up with a good answer, which I am posting here for others' reference, but also to get feedback on whether there are other solutions.

Clarification: Each row has unique values that are being updated into it, which are calculated on the client, or set manually by the user.

A: 

Solution 1 (bad) : Send 64k update statements to the table. VERY VERY SLOW.

Solution 2 (bad) : Use the SqlDataAdapter batch processing mode to reduce round trips to the server. Better, but still very slow.

Solution 3a : Create a new table (or temporary table). Use the SqlBulkCopy class to insert rows into the temporary table containing the data needed to update the real table. Run a stored proc that loops over the temp table to do the updates.

Solution 3b : Do the bulk insert as above, but run a single update statement that joins the real table against the temporary table and does the field updates. This is only viable if every row needs the same columns updated, but with different values.
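
For anyone who wants to try the staging-table idea without a SQL Server instance, here is a minimal sketch in Python using SQLite as a stand-in. Everything named here is an assumption for illustration: the `items` and `staging` tables, their columns, and the `bulk_update` helper are hypothetical, `executemany` takes the place of SqlBulkCopy, and correlated subqueries are used instead of a join so it runs on older SQLite versions.

```python
import sqlite3


def bulk_update(conn, updates):
    """Apply (id, price, qty) rows to the hypothetical items table.

    Stage the new values in a temp table, then run one set-based UPDATE
    instead of one UPDATE statement per row.
    """
    cur = conn.cursor()
    # Staging table: stands in for the temp table filled by SqlBulkCopy.
    cur.execute(
        "CREATE TEMP TABLE staging "
        "(id INTEGER PRIMARY KEY, price REAL, qty INTEGER)"
    )
    # Bulk load the per-row values.
    cur.executemany(
        "INSERT INTO staging (id, price, qty) VALUES (?, ?, ?)", updates
    )
    # Single statement updating every staged row from its matching staging row.
    cur.execute(
        """
        UPDATE items
        SET price = (SELECT s.price FROM staging s WHERE s.id = items.id),
            qty   = (SELECT s.qty   FROM staging s WHERE s.id = items.id)
        WHERE id IN (SELECT id FROM staging)
        """
    )
    changed = cur.rowcount
    conn.commit()
    cur.execute("DROP TABLE staging")
    return changed
```

Rows whose key is not in the staging table are left alone because of the `WHERE id IN (...)` filter. On SQL Server the same step would normally be written as an `UPDATE ... FROM ... JOIN` against the temp table.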

Jason Coyne
If you have the PK of the table to be updated you could just run an UPDATE command instead of a stored proc, but perhaps I am misunderstanding the complexity of the update.
Nathan Koop
+1  A: 

This is only viable if every row needs the same columns updated, but with different values.

Huh?

I'm not sure how you did your bulk update, but it was one of two ways. Either the new row contained new values for every column, in which case you can just unconditionally update each row.

Or only non-null columns need to be updated, in which case you can do:

update tablea a join bulkupdate b on ( somekey )
set a.col1 = coalesce( b.col1, a.col1), 
a.col2 = coalesce( b.col2, a.col2)
etc.

Of course, in the second scenario no column can be updated TO null. But if you filled your bulk update columns the second way, then you had no way in the first place to tell which nulls meant "the data really is null" and which meant "this column isn't updated".
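
A rough, runnable sketch of the coalesce approach, again using Python with SQLite as a stand-in rather than the original SQL Server setup. The `tablea` and `bulkupdate` names follow the snippet above; the `id` key, the column types, and the `bulk_update_non_null` helper are assumptions made up for the example.

```python
import sqlite3


def bulk_update_non_null(conn, updates):
    """Apply (id, col1, col2) rows to tablea, treating None as 'leave as is'."""
    cur = conn.cursor()
    # Hypothetical staging table holding the sparse updates.
    cur.execute(
        "CREATE TEMP TABLE bulkupdate "
        "(id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT)"
    )
    cur.executemany("INSERT INTO bulkupdate VALUES (?, ?, ?)", updates)
    # COALESCE keeps the current value whenever the staged value is NULL.
    cur.execute(
        """
        UPDATE tablea
        SET col1 = COALESCE(
                (SELECT b.col1 FROM bulkupdate b WHERE b.id = tablea.id), col1),
            col2 = COALESCE(
                (SELECT b.col2 FROM bulkupdate b WHERE b.id = tablea.id), col2)
        WHERE id IN (SELECT id FROM bulkupdate)
        """
    )
    conn.commit()
    cur.execute("DROP TABLE bulkupdate")
```

Passing `(1, "x", None)` changes only `col1` for row 1, which also shows the limitation described above: a staged NULL can never overwrite an existing value.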

tpdi
The coalesce idea is very nice.
Jason Coyne