views: 498
answers: 4

I've inherited an SSIS package which loads 500K rows (about 30 columns) into a staging table.

It's been cooking now for about 120 minutes and it's not done, which puts the rate at under 70 rows per second (500,000 rows / 7,200 seconds). I know that everybody's environment is different, but I think this is a couple of orders of magnitude off from "typical".

Oddly enough the staging table has a PK constraint on an INT (identity) column -- and now I'm thinking that it may be hampering the load performance. There are no other constraints, indexes, or triggers on the staging table.

Any suggestions?

---- Additional information ------
The source is a tab-delimited file that connects to two separate Data Flow components, which add some static data (the run date and batch ID) to the stream; the stream then connects to an OLE DB Destination adapter.

Access mode is OpenRowset using FastLoad

FastLoadOptions are TABLOCK,CHECK_CONSTRAINTS

Maximum insert commit size: 0
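For reference, those fast-load settings map roughly onto a plain T-SQL bulk load like the sketch below (illustrative only; the table name and file path are made up):

    -- Rough T-SQL equivalent of the fast-load settings above.
    -- dbo.StagingTable and the file path are hypothetical.
    BULK INSERT dbo.StagingTable
    FROM 'C:\loads\source.txt'
    WITH (
        FIELDTERMINATOR = '\t',   -- tab-delimited source
        ROWTERMINATOR   = '\n',
        TABLOCK,                  -- the TABLOCK fast-load option
        CHECK_CONSTRAINTS         -- the CHECK_CONSTRAINTS fast-load option
        -- no BATCHSIZE: like "Maximum insert commit size: 0", the whole
        -- load commits as one batch
    );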

A: 

If it is running as one big transaction, that may explain things. Make sure a commit is done every now and then; with "Maximum insert commit size: 0" the whole load commits as a single batch, so setting a non-zero value there would force periodic commits.

You may also want to check processor load, memory and IO to rule out resource issues.
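A quick way to check that on the SQL Server side is the wait statistics DMV; a minimal sketch:

    -- Top waits since the stats were last cleared. Heavy PAGEIOLATCH_* or
    -- WRITELOG waits suggest IO pressure; SOS_SCHEDULER_YIELD suggests CPU.
    SELECT TOP (10)
           wait_type,
           wait_time_ms,
           waiting_tasks_count
    FROM sys.dm_os_wait_stats
    ORDER BY wait_time_ms DESC;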

stili
+2  A: 

I'd say there is a problem of some sort. I bulk insert into a staging table from a file with 20 million records, more columns, and an identity field in far less time than that, and SSIS is supposed to be faster than SQL Server 2000's bulk insert.

Have you checked for blocking issues?
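For what it's worth, a minimal query to see whether the load session is being blocked while it runs:

    -- Any row with a non-zero blocking_session_id is waiting on another session.
    SELECT session_id,
           blocking_session_id,
           wait_type,
           wait_time,
           command
    FROM sys.dm_exec_requests
    WHERE blocking_session_id <> 0;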

HLGEM
A: 

This is hard to say.

If there were complex ETL involved, I would check the maximum number of threads allowed in the data flow and see whether some things can run in parallel.

But it sounds like it's a simple transfer.

With 500,000 rows, batching is an option, but I wouldn't think it necessary for that few rows.

The PK identity should not be an issue. Do you have any complex check constraints or persisted computed columns on the destination?

Is this pulling or pushing over a slow network link? Is it pulling or pushing from a complex SP or view? What is the data source?
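If it helps, here is a sketch of how to check the destination table for those; dbo.Staging is a placeholder name:

    -- Check constraints on the destination table (dbo.Staging is hypothetical).
    SELECT name, definition
    FROM sys.check_constraints
    WHERE parent_object_id = OBJECT_ID('dbo.Staging');

    -- Computed columns, including whether they are persisted.
    SELECT name, definition, is_persisted
    FROM sys.computed_columns
    WHERE object_id = OBJECT_ID('dbo.Staging');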

Cade Roux
+2  A: 

I’m not sure about the etiquette of answering my own question -- so sorry in advance if this is better suited for a comment.

The issue was the data type of the input columns from the text file: they were all declared as "text stream [DT_TEXT]". When I changed that to "String [DT_STR]", 2 million rows loaded in 58 seconds, which is back in the realm of "typical". I'm not sure what the Text file source is doing when columns are declared that way, but it's behind me now!
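For anyone hitting the same thing: DT_TEXT is treated as a LOB type by the pipeline (it maps to the SQL text type), while DT_STR maps to plain varchar. A minimal sketch of a staging table whose columns line up with DT_STR (the column names and lengths are made up):

    -- Hypothetical staging table: plain VARCHAR columns pair with DT_STR
    -- flat-file columns, so no LOB handling is involved.
    CREATE TABLE dbo.Staging
    (
        StagingId INT IDENTITY(1,1) PRIMARY KEY,
        RunDate   VARCHAR(30),
        BatchId   VARCHAR(30),
        Col01     VARCHAR(255),
        Col02     VARCHAR(255)
        -- ... remaining source columns as VARCHAR(n)
    );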

Ralph Shillington
Thanks for sharing the reason.
Tomalak