I have two tables (~4 million rows) on which I have to perform insert/update actions for matched and unmatched records. I am pretty confused about which method to use for the incremental load. Should I use the Lookup component or the new SQL Server MERGE statement? And will there be a significant performance difference?
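For reference, the SQL Server 2008 MERGE statement the question mentions handles the matched (update) and unmatched (insert) cases in a single set-based statement. A rough sketch, with hypothetical table and column names (`dbo.TargetTable`, `dbo.SourceTable`, `BusinessKey`, `SomeColumn` are placeholders, not from the question):

```sql
-- Hypothetical schema: TargetTable is the ~4M-row destination,
-- SourceTable holds the incoming incremental load.
MERGE dbo.TargetTable AS t
USING dbo.SourceTable AS s
    ON t.BusinessKey = s.BusinessKey
WHEN MATCHED AND t.SomeColumn <> s.SomeColumn THEN
    -- Matched records: update only when something actually changed
    UPDATE SET t.SomeColumn  = s.SomeColumn,
               t.UpdatedDate = GETDATE()
WHEN NOT MATCHED BY TARGET THEN
    -- Unmatched records: insert new rows
    INSERT (BusinessKey, SomeColumn, UpdatedDate)
    VALUES (s.BusinessKey, s.SomeColumn, GETDATE());
```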

A: 

Premature optimization is the root of all evil. I don't know much about SSIS, but it's always too early to think about this.

4 million rows could be "large" or "small", depending on the type of data, and the hardware configuration you're using.

Osama ALASSIRY
A: 

The SSIS Lookup transformation has three caching modes, which are key to getting the best performance from it. If you are looking up against a large table, Full Cache mode will eat up a lot of your memory and could hinder performance. If your lookup reference table is small, keep it in memory. You also have to decide whether the data you are looking up against changes while you process: if it does, then you don't want to cache it.

Can you give us some more information on what you are doing, so I can formulate a more precise answer?

ChrisLoris
There's also a new feature in SSIS 2008 that allows you to cache lookup data, update the cache incrementally, and then reuse it.
John Saunders
A: 

I've run into this exact problem a few times, and I've always had to resort to loading the complete dataset into SQL Server via ETL and then manipulating it with stored procs. It always seemed to take way, way too long to update the data on the fly in SSIS transforms.
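The staging approach described above is typically a set-based UPDATE followed by an INSERT of the unmatched rows. A minimal sketch, assuming a hypothetical staging table and key column (`dbo.StagingTable`, `dbo.TargetTable`, `BusinessKey`, `SomeColumn` are illustrative names, not from the answer):

```sql
-- Step 1: update matched records from the bulk-loaded staging table
UPDATE t
SET    t.SomeColumn  = s.SomeColumn,
       t.UpdatedDate = GETDATE()
FROM   dbo.TargetTable  AS t
INNER JOIN dbo.StagingTable AS s
       ON t.BusinessKey = s.BusinessKey;

-- Step 2: insert staging rows with no match in the target
INSERT INTO dbo.TargetTable (BusinessKey, SomeColumn, UpdatedDate)
SELECT s.BusinessKey, s.SomeColumn, GETDATE()
FROM   dbo.StagingTable AS s
LEFT JOIN dbo.TargetTable AS t
       ON t.BusinessKey = s.BusinessKey
WHERE  t.BusinessKey IS NULL;
```

This two-statement pattern is the usual pre-2008 equivalent of MERGE; on SQL Server 2008 the two steps can be collapsed into a single MERGE statement.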

Dayton Brown
Did you use SSIS 2008, and did you try using MERGE?
John Saunders