views:

60

answers:

5

This is in regards to MS SQL Server 2005.

I have an SSIS package that validates data between two different data sources. If it finds differences it builds and executes a SQL update script to fix the problem. The SQL Update script runs at the end of the package after all differences are found.

I'm wondering if it is necessary or a good idea to some how break down the sql update script into multiple transactions and whats the best way to do this.

The update script looks similar to this, but longer (example):

 Update MyPartTable SET MyPartGroup = (Select PartGroupID From MyPartGroupTable
 Where PartGroup = "Widgets"), PartAttr1 = 'ABC', PartAttr2 = 'DEF', PartAttr3 = '123' 
 WHERE PartNumber = 'ABC123';

For every error/difference found an additional Update query is added to the Update Script. I only expect about 300 updates on a daily basis, but sometimes there could be 50,000. Should I break the script down into transactions every say 500 update queries or something?

+1  A: 

It shouldn't be a problem to split things up. However, if you want to A. maintain consistency between the items, and/or B. perform slightly better, you might want to use a single transaction for the while thing.

BEGIN TRANSACTION;
//Write 500 things
//Write 500 things
//Write 500 things
COMMIT TRANSACTION;

Transactions exist for just this reason -- where program logic would be clearer by splitting up queries but where data consistency between multiple actions is desired.

Billy ONeal
+2  A: 

No, I think the statement is fine as it is. It won't make much a of a difference in speed at all. Billy Makes a valid point if you do care about the readability of the query(you should if it is a query that will be seen or used in the future.).

Eric
+2  A: 

don't optimize anything before you know there is a problem. if it is running fast, let it go. if it is running slow, make some changes.

KM
+1: For recognizing premature optimization. Err - out of votes, but I shall return!
OMG Ponies
+1  A: 

Would your system handle other processes reading the data that has yet to be updated? If so, you might want to perform multiple transactions.

The benefit of performing multiple transactions is that you will not continually accumulate locks. If you perform all these updates at once, SQL Server will eventually run out of small-grained lock resources (row/key) and upgrade to a table lock. When it does this, nobody else will be able to read from these tables until the transaction completes (unless they use dirty reads or are in snapshot mode).

The side effect is that other processes that read data may get inconsistent results.

So if nodoby else needs to use this data while you are updating, then sure, do all the updates in one transaction. If there are other processes that need to use the table, then yes, do it in chunks.

Paul Williams
This will run when no one is using the system, but thank you for that explanation. For future knowledge, when you say do it in chunks are you saying to have multiple transactions?Something Like: BEGIN TRANSACTION;//Write 500 thingsCOMMIT TRANSACTION;BEGIN TRANSACTION;//Write 500 thingsCOMMIT TRANSACTION;BEGIN TRANSACTION;//Write 500 thingsCOMMIT TRANSACTION;
MaxGeek
Yes, I meant perform your updates in multiple transactions if you want to avoid table locks. Another consideration: if you update indexed columns, every update will trigger updates to the indexes. If so, after all these updates, you might want to reindex the table and update statistics on it.
Paul Williams
+1  A: 

All records affected by the query will be either locked or copied into tempdb if the transaction operates in SNAPSHOT isolation level.

IF the number of records is high enough, the locks may be escalated.

If transaction isolation level is not SNAPSHOT, then a concurrent query will not be able to read the locked records which may be a concurrency problem for your application.

If transaction isolation level is SNAPSHOT, then tempdb should contain enough space to accomodate the old versions of the records, or the query will fail.

If either of this is a problem for you, then you should split the update into several chunks.

Quassnoi