ETL as a transaction

views:

154

answers:

ETL as a transaction

For all the ETLs I have written so far, I have never made them transactions - i.e. if table 4 fails, roll everything back.

What is the best practice in this regard?

To "BeginTran + Commit" or not to "BeginTran + Commit"

EDIT: I have one master package calling 4 other packages - is it possible to roll them all up into one transaction?

+1 A:

In SSIS, I always Begin Trans + Commit. I want to make sure that I can re-run the package without issue (or having to find what rows actually got inserted) if it fails.

It just makes recovery and cleanup so much easier.

Eric 2009-07-31 19:27:29

+1 A:

begin+commit in manageable batch sizes. You don't want to wrap a 6 hours import into a single transaction every night. Keep your batches at a size that can finish in 2-3 minutes at most. That you will hit data purity issues that will fail an ETL is a given, so at least reduce the impact to something manageable (ie. don't trigger a rollback that will last another 6 hours to complete).

Remus Rusanu 2009-07-31 19:39:18

+1 A:

You are often moving too much data in ETL to use a SQL transaction (the log has to store ALL the data to roll back, remember). I prefer to design packages such that they can be re-run nondestructively. Ideally they should be set up so that if they die in mid-stream, you can just start them and they'll continue somewhere approximately where they left off. Sometimes there's a performance penalty for that, but I think it's worth it.

Technically you can roll packages up into a single transaction; practically, maybe not.

onupdatecascade 2009-07-31 20:19:46

ansaurus

tags:

views:

answers:

ETL as a transaction

related questions