I am wondering how SSIS deals with transactions with large data sets. I have a 'large' data set of about 150,000 rows, each of which needs to be validated against business rules as part of an ETL from a staging database to the live database.

If any of the records fail their business rules, no records should end up in the live database (i.e. rollback the transaction).

My question is how SSIS handles large transactions - or can it? Will it load 149,999 records and then roll the whole lot back if the last record fails its business rules? Or is there a better best practice for this type of large data transfer operation?

My current thinking is to process each record within a sequence container at the control flow level, with the transaction settings enabled on the container. All validation would be done within the sequence container, and the insert would happen there as well.

A: 

SSIS handles transactions reasonably well. Where it falls down is transactions that span multiple databases or servers, which have to go through MSDTC (this still works, but there are caveats: the Distributed Transaction Coordinator service must be running on every machine involved, and firewalls must allow DTC traffic between them).

You can set the TransactionOption property on the Data Flow task to Required, which forces SSIS to enlist it in a transaction. Likewise, you can set the TransactionOption of other tasks to NotSupported - for example, a task that writes a failure status to a logging table should stay outside the transaction so its write survives the rollback.
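For intuition, Required on the load is roughly equivalent to the T-SQL below run against the destination (a minimal sketch - staging.Orders and dbo.LiveOrders are made-up names, and SSIS does the enlistment for you through MSDTC rather than via explicit statements):

    -- Rough hand-written analogue of TransactionOption = Required.
    -- XACT_ABORT makes any runtime error roll back the whole transaction.
    SET XACT_ABORT ON;
    BEGIN DISTRIBUTED TRANSACTION;

    INSERT INTO dbo.LiveOrders (OrderId, CustomerId, Amount)
    SELECT OrderId, CustomerId, Amount
    FROM   staging.Orders;

    -- Only reached if every row inserted cleanly.
    COMMIT TRANSACTION;

If the staging and live databases live on the same server instance, a plain BEGIN TRANSACTION is enough and MSDTC never gets involved.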

You can't really process each record in its own sequence container (unless you loop a Data Flow task once per row, which is probably not the best approach). Instead, I'd set the Data Flow task's TransactionOption to Required; then if any record fails inside the data flow, the task fails and the whole transaction is rolled back.
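If you'd rather make the all-or-nothing behaviour explicit, an Execute SQL task can also do the load set-based in a single batch. A minimal sketch, again with made-up table and rule names:

    BEGIN TRY
        BEGIN TRANSACTION;

        -- Fail the whole batch up front if any staging row breaks a rule.
        IF EXISTS (SELECT 1 FROM staging.Orders
                   WHERE Amount <= 0 OR CustomerId IS NULL)
            RAISERROR('Business-rule validation failed; nothing loaded.', 16, 1);

        INSERT INTO dbo.LiveOrders (OrderId, CustomerId, Amount)
        SELECT OrderId, CustomerId, Amount
        FROM   staging.Orders;

        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;

        -- Re-raise so the calling SSIS task reports failure.
        DECLARE @msg nvarchar(2048);
        SET @msg = ERROR_MESSAGE();
        RAISERROR(@msg, 16, 1);
    END CATCH;

This validates the 150,000 rows as a set rather than one at a time, which tends to be far faster than any per-row loop.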

Dane
A: 

I agree with Dane. SSIS and SQL Server should have no problem with a transaction of this size, provided the infrastructure supports it - in particular, enough transaction log space to hold the entire uncommitted load.

nullptr
