I am running an SSIS package that loads, say, a million rows from a flat file. The data flow uses a script component for complex transformations and a SQL Server table destination. I am trying to figure out the best way (well, ANY way at this stage) to write the row count out to a different table DURING the data flow processing, probably in multiples of 1000 to be more efficient. That way I can determine the percentage of progress throughout a task that might take a few minutes, simply by querying the table periodically.

I can't seem to add any SQL task into the flow, so I'm guessing the only way is to connect to the SQL database inside the .NET script. This seems painful, and I'm not even sure it is possible. Is there another, more elegant way? I've seen references to a "Rows Read" performance counter, but I'm not sure where to access it in SSIS, or how to write it to a SQL table during the data flow processing.

Any suggestions appreciated.

Glenn

A: 

Why not write a .NET application that runs the SSIS package? You can hook into it to find out how far along the package is.

Basically, everything that is sent to the console is available to you, and there are event handlers you can attach to get information while the package is being processed.

Here is a link that may help you with this approach: http://www.programminghelp.com/database/sqlserver/sql-server-integration-services-calling-ssis-package-in-c/
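
As a rough sketch of what hosting the package might look like, the console program below loads a package and listens for its OnProgress events through the Microsoft.SqlServer.Dts.Runtime API. It assumes a project reference to the Microsoft.SqlServer.ManagedDTS assembly; the package path and the ProgressListener class name are hypothetical. Note that, as reported elsewhere in this thread, OnProgress may fire only once for a data flow, so this on its own may not give row-level progress.

```vbnet
Imports System
Imports Microsoft.SqlServer.Dts.Runtime

' Hypothetical listener: prints each progress event the running package raises.
Public Class ProgressListener
    Inherits DefaultEvents

    Public Overrides Sub OnProgress(ByVal host As TaskHost, ByVal progressDescription As String, _
            ByVal percentComplete As Integer, ByVal progressCountLow As Integer, _
            ByVal progressCountHigh As Integer, ByVal subComponent As String, _
            ByRef fireAgain As Boolean)
        Console.WriteLine("{0}: {1}% ({2})", host.Name, percentComplete, progressDescription)
    End Sub
End Class

Module RunPackage
    Sub Main()
        Dim listener As New ProgressListener()
        Dim app As New Application()

        ' Hypothetical path: point this at your own .dtsx file.
        Dim pkg As Package = app.LoadPackage("C:\Packages\LoadFlatFile.dtsx", listener)

        ' Passing the listener to Execute routes runtime events to the overrides above.
        Dim result As DTSExecResult = pkg.Execute(Nothing, Nothing, listener, Nothing, Nothing)
        Console.WriteLine("Package finished: {0}", result)
    End Sub
End Module
```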

James Black
+1  A: 

There are two easy options here:

Option 1: Use the built-in SSIS logging and watch the OnProgress event. Logging can be configured to write to several different outputs, including a relational database and flat files.

See more Here

Option 2: Add an SSIS script component that fires off notifications to an external system, such as a database table.

JasonHorner
Well, his question is how to do that. Isn't it?
Faiz
Jason, thanks. I've looked into the logging. Unfortunately, the OnProgress event fires ONCE when the data flow starts and then never again. I can't see another event that fires per row, or in some other way continuously throughout the data flow. I'm beginning to think option 2 is a possibility. Can you suggest some code to make the connection? The catch is that a data flow has a 'stripped back' object model. Things like the Dts object simply don't exist like they do in the control flow. Useful, I know. So getting the database connection to work is difficult. Any suggestions?
Glenn M
See latest update below...
Glenn M
A: 

Is the application consuming the row count a .NET application? There are a lot of accepted practices for sharing information between applications, and it may be worth looking into them. In your case, if a .NET application consumes the row count to calculate progress, you could store the information somewhere other than a DB table: the file system, a web service, Windows environment variables, or a log (such as the Windows event log) are some that come to mind. I think updating a Windows environment variable with the row count from within your script component would be a good enough solution, just like using a global variable to share data between two functions inside a program. :)

Faiz
A: 

OK, had some success at last. I added a call to the following sub in the script component:

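' Updates the progress columns on the existing load log row.
' Requires "Imports System.Data.OleDb" at the top of the script.
Sub UpdateLoadLog(ByVal Load_ID As Int32, ByVal Row_Count As Int32, ByVal Row_Percent As Int32, ByVal connstr As String)
    ' OLE DB parameters are positional: add them in the same order as the ? markers.
    Dim Sql As String = "update myTable set rows_processed = ?, rows_processed_percent = ? where load_id = ? and load_log_type = 'SSIS'"

    ' Using closes and disposes the connection and command even if the update fails.
    Using dbconn As New OleDbConnection(connstr)
        Using dbcomm As New OleDbCommand(Sql, dbconn)
            dbcomm.Parameters.AddWithValue("@rows_processed", Row_Count)
            dbcomm.Parameters.AddWithValue("@rows_processed_percent", Row_Percent)
            dbcomm.Parameters.AddWithValue("@load_id", Load_ID)

            dbconn.Open()
            dbcomm.ExecuteNonQuery()
        End Using
    End Using
End Sub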

This gets executed every 1000 rows and successfully updates the table. The row already exists because it is created in the control flow at the start of the package; it is updated again in the control flow at the very end with the final row count and 100%.
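
For anyone wondering how the "every 1000 rows" check and the percentage might be worked out, here is a minimal standalone sketch. The ProgressTracker class and the expected row total are assumptions that are not in my package as posted: the total would have to come from somewhere, for example a package variable set in the control flow. In the script component, the place marked in the loop is where the call to UpdateLoadLog would go, once per row from the row-processing method.

```vbnet
Imports System

' Hypothetical helper: counts rows and says when a progress update is due.
Public Class ProgressTracker
    Private ReadOnly _batchSize As Int32
    Private ReadOnly _expectedRows As Int32
    Private _rowCount As Int32 = 0

    Public Sub New(ByVal batchSize As Int32, ByVal expectedRows As Int32)
        If batchSize <= 0 Then Throw New ArgumentOutOfRangeException("batchSize")
        _batchSize = batchSize
        _expectedRows = expectedRows
    End Sub

    ' Call once per row; returns True when a full batch has just completed.
    Public Function RowProcessed() As Boolean
        _rowCount += 1
        Return (_rowCount Mod _batchSize) = 0
    End Function

    Public ReadOnly Property RowCount() As Int32
        Get
            Return _rowCount
        End Get
    End Property

    ' Whole-number percentage, capped at 100; 0 if the expected total is unknown.
    Public ReadOnly Property Percent() As Int32
        Get
            If _expectedRows <= 0 Then Return 0
            Return CInt(Math.Min(100L, (CLng(_rowCount) * 100L) \ _expectedRows))
        End Get
    End Property
End Class

Module Demo
    Sub Main()
        ' Assumed numbers for illustration: 2500 expected rows, update every 1000.
        Dim tracker As New ProgressTracker(1000, 2500)
        For i As Int32 = 1 To 2500
            If tracker.RowProcessed() Then
                ' In the script component, call UpdateLoadLog here instead.
                Console.WriteLine("{0} rows, {1}%", tracker.RowCount, tracker.Percent)
            End If
        Next
        ' Prints "1000 rows, 40%" and then "2000 rows, 80%".
    End Sub
End Module
```

The final partial batch (rows 2001 to 2500 here) never triggers an update, which is why the control flow still writes the final row count and 100% at the end.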

Thanks for all your suggestions guys.

Glenn M
