Is there any way to determine what differences between databases affect an SSIS package's load performance?

I've got a package which loads and does various bits of processing on ~100k records; against my laptop database it finishes in about 5 minutes.

Trying the same package and the same data on the test server, which is a reasonable box in both CPU and memory, it's still running ... about 1 hour so far :-( I checked the package with a small set of data, and it ran through OK.

A: 

In my experience the biggest performance factor in SSIS is network latency. A package running locally on the database server itself runs much faster than one pulling data over the network. Beyond that I can't think of any reason why the speed would be drastically different. Running SQL Profiler for a few minutes may yield some clues there.
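
As a quicker first check than a full Profiler trace, here's a minimal sketch (assuming SQL Server 2005 or later; run it while the package is executing) of querying the server's accumulated network waits:

    -- ASYNC_NETWORK_IO waits accumulate when the client (here, the box
    -- running the SSIS package) is slow to consume result sets over
    -- the wire, which is the classic network-latency signature.
    SELECT wait_type,
           waiting_tasks_count,
           wait_time_ms
    FROM   sys.dm_os_wait_stats
    WHERE  wait_type = 'ASYNC_NETWORK_IO';

A large, fast-growing wait_time_ms points at the link between the SSIS host and the database server.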

Rob
Interesting you reckon network latency is the biggest performance hit ... the test environment is two servers, one running the package, one running SQL Server, so everything goes over the network ... could that have such a big impact?
SteveC
+1  A: 

If you've ruled out network latency, your most likely culprit (with real quantities of data) is your pipeline organisation: specifically, which transformations you're doing along the pipeline.

Data transformations come in four flavours:

  • streaming (entirely in-process/in-memory)
  • non-blocking (but still using I/O, e.g. Lookup, OLE DB Command)
  • semi-blocking (blocks the pipeline partially, but not entirely, e.g. Merge Join)
  • blocking (blocks the pipeline until its input is entirely received, e.g. Sort, Aggregate)

If you've a few blocking transforms, that will significantly hurt your performance on large datasets. Even semi-blocking transforms, on unbalanced inputs, will block for long periods of time. One common mitigation is to push the blocking work back into the source query, as in the sketch below.
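
To illustrate (a sketch only; the table and column names are hypothetical), a blocking Aggregate transform can often be eliminated by pushing the grouping into the source query, so the pipeline never has to buffer the full dataset:

    -- Let the database engine do the aggregation; the data flow then
    -- receives pre-reduced rows and needs no blocking transform.
    -- dbo.Orders, CustomerID and Amount are hypothetical names.
    SELECT CustomerID,
           COUNT(*)    AS OrderCount,
           SUM(Amount) AS TotalAmount
    FROM   dbo.Orders
    GROUP BY CustomerID;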

Jeremy Smyth
Could you explain how to rule out network latency, as it's completely new to me?
SteveC
A: 

CozyRoc over at the MSDN forums pointed me in the right direction ...
- used SSMS / Management / Activity Monitor and spotted lots of TRANSACTION entries
- got me thinking, read up on the OLE DB destination and unchecked its Table Lock option
- WHAM ... data loads fine :-)

Still don't understand why it works fine on my laptop d/b and stalls on the test server?
- I was the only person using the test d/b, so it's not as if there should have been any contention for the tables??
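
For anyone chasing the same symptom, a minimal sketch (assuming SQL Server 2005+) for seeing which sessions hold or wait on table locks while the load runs:

    -- Table-level lock requests in the current database. Note that a
    -- package can block itself: the TABLOCK taken by a fast-load
    -- destination can collide with statements issued on a second
    -- connection inside the same package, so being the only user
    -- doesn't rule out contention.
    SELECT request_session_id,
           request_mode,
           request_status,
           OBJECT_NAME(CAST(resource_associated_entity_id AS int)) AS table_name
    FROM   sys.dm_tran_locks
    WHERE  resource_type = 'OBJECT'
      AND  resource_database_id = DB_ID();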

SteveC
Steve - I've been battling performance problems in SSIS for a month now. While inconclusive, it seems as though SSIS doesn't perform as well on x64 servers as it does on 32-bit machines. A case study of a 200GB Oracle -> SQL load, a blog on MSDN, and my own findings all point towards the same thing... OLE DB on x64 has some performance flaws. Run SSIS *on* the database server, and watch how network I/O screams up even when using named pipes connections.
Mark
+1  A: 

I've had similar problems over the past few weeks, and here are several things you could consider, listed in decreasing order of importance according to what made the biggest difference for us:

  1. Don't assume anything about the server. We found that our production server's RAID was misconfigured (HP sold us disks with firmware mismatches) and the disk write speed was literally one fiftieth of what it should have been. So check out the server metrics with Perfmon.

  2. Check that enough RAM is allocated to SQL Server. Inserts of large datasets often require use of RAM and TempDB for building indexes, etc. Ensure that SQL Server has enough RAM that it doesn't need to swap out to Pagefile.sys (see the first sketch after this list).

  3. As per the holy grail of SSIS, avoid manipulating large datasets using T-SQL statements. All T-SQL statements cause changed data to be written to the transaction log, even if you use the Simple recovery model; the main difference between the Simple and Full recovery models is that Simple automatically truncates the log at each checkpoint. This means that large datasets, when manipulated with T-SQL, thrash the log file, killing performance.

  4. For large datasets, do data sorts at the source if possible. The SSIS Sort component chokes on reasonably large datasets, and the only viable alternative (nSort by Ordinal, Inc.) costs $900 for a non-transferable per-CPU license. So... if you absolutely have to sort a large dataset, consider loading it into a staging database as an intermediate step (see the second sketch after this list).

  5. Use the SQL Server Destination if you know your package is going to run on the destination server, since it offers a roughly 15% performance increase over OLE DB because it shares memory with SQL Server.

  6. Increase the network packet size to 32767 on your database connection managers. This allows large volumes of data to move faster from the source server(s), and can noticeably improve reads on large datasets.

  7. If using Lookup transforms, experiment with cache modes: a Cache connection or Full Cache mode for smaller lookup datasets, and Partial or No Cache for larger ones. This can free up much-needed RAM.

  8. If combining multiple large datasets, use either RAW files or a staging database to hold your transformed datasets, then combine and insert all of a table's data in a single data flow operation, and lock the destination table. Using staging tables or RAW files can also help relieve table-locking contention.

  9. Last but not least, experiment with the DefaultBufferSize and DefaultBufferMaxRows properties. You'll need to monitor your package's "Buffers Spooled" performance counter using Perfmon.exe, and adjust the buffer sizes upwards until you see buffers being spooled (paged to disk), then back off a little.
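
Regarding point 2, a minimal sketch of checking and capping SQL Server's memory (the 28672 MB figure is purely illustrative, not a recommendation; size it to leave headroom for SSIS and the OS):

    -- Cap SQL Server's memory so SSIS and the OS keep some headroom.
    EXEC sp_configure 'show advanced options', 1;
    RECONFIGURE;
    EXEC sp_configure 'max server memory (MB)', 28672;  -- illustrative value
    RECONFIGURE;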
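Regarding point 4, the usual approach is to let the source engine sort (it can often use an index, while the SSIS Sort transform must buffer every row before emitting any) and then tell SSIS the data is pre-sorted. A sketch, with hypothetical names:

    -- Sort at the source instead of in the pipeline.
    SELECT OrderID,
           CustomerID,
           Amount
    FROM   dbo.Orders      -- hypothetical table
    ORDER BY OrderID;

In the source component's advanced editor you would then set IsSorted = True on the output and SortKeyPosition = 1 on OrderID, so downstream components (e.g. Merge Join) know the order without re-sorting.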

Point 8 is especially important on very large datasets, since you can only achieve a minimally logged bulk insert operation if:

  • the destination table is empty,
  • the table is locked for the duration of the load operation, and
  • the database is in Simple or Bulk-Logged recovery mode.

This means that subsequent bulk loads into a table will always be fully logged, so you want to get as much data as possible into the table on the first data load.
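
As an illustration (a sketch only; the database, schema, and table names are hypothetical, and whether the insert is minimally logged also depends on your SQL Server version and the target being a heap or an empty table), such a first load might look like:

    -- Switch to Bulk-Logged recovery for the load, insert with a table
    -- lock so the engine can minimally log it, then switch back.
    ALTER DATABASE SalesDW SET RECOVERY BULK_LOGGED;

    INSERT INTO dbo.FactSales WITH (TABLOCK)
    SELECT SaleID, CustomerID, Amount
    FROM   staging.Sales;

    ALTER DATABASE SalesDW SET RECOVERY FULL;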

Finally, if you can partition your destination table and then load the data into each partition in parallel, you can achieve up to 2.5 times faster load times, though this isn't usually a feasible option out in the wild.

Mark