views:

166

answers:

2

Hi Guys,

I am having an ongoing and painful problem with an SSIS package. The package runs every 5 minutes as an SQL Agent Job and every 2-10 days the package will start running and never stop (thus preventing further executions). If I stop the hung job manually it will begin working perfectly again in the next 5 minute interval.

The SSIS package is for moving data from an Oracle database to a MSSQL 2005 database. It has 7 steps:

  • Step 1 calls an Oracle Stored Procedure to prepare the temporary tables inside ORACLE
  • Steps 2-6 process the data from the ORACLE tables to the MSSQL tables ORACLE -> MSSQL
  • Step 7 calls an Oracle Stored Procedure to clear the ORACLE temporary tables

I suspect that the issue is caused by a communications error between the MSSQL server and the ORACLE server. Both the MSSQL database and Agent/package run on one machine with the ORACLE database running over the network.

I have enabled logging of the SQL package and after more than 2GB of log file I have captured the instant where the package stops responding:


OnPreValidate,ADV-SRV5,NT AUTHORITY\SYSTEM,CallistaIntegrationToMonashCRM_delta,{F88F6C45-CFA2-4801-A2F2-DDF03D458A48},{3A1FB1E3-B76D-444D-876B-D1FBBB9BA246},6/06/2010 10:15:01 AM,6/06/2010 10:15:01 AM,0,0x,(null)
OnPreValidate,ADV-SRV5,NT AUTHORITY\SYSTEM,Address,{c5907799-f918-43da-818a-d4bd7f188367},{3A1FB1E3-B76D-444D-876B-D1FBBB9BA246},6/06/2010 10:15:01 AM,6/06/2010 10:15:01 AM,0,0x,(null)
OnInformation,ADV-SRV5,NT AUTHORITY\SYSTEM,Address,{c5907799-f918-43da-818a-d4bd7f188367},{3A1FB1E3-B76D-444D-876B-D1FBBB9BA246},6/06/2010 10:15:01 AM,6/06/2010 10:15:01 AM,1074016266,0x,Validation phase is beginning.

OnProgress,ADV-SRV5,NT AUTHORITY\SYSTEM,Address,{c5907799-f918-43da-818a-d4bd7f188367},{3A1FB1E3-B76D-444D-876B-D1FBBB9BA246},6/06/2010 10:15:01 AM,6/06/2010 10:15:01 AM,0,0x,Validating
Diagnostic,ADV-SRV5,NT AUTHORITY\SYSTEM,Callista,{cb5d6fe3-3ea4-4453-8e5a-965818021df7},{3A1FB1E3-B76D-444D-876B-D1FBBB9BA246},6/06/2010 10:15:01 AM,6/06/2010 10:15:01 AM,0,0x,ExternalRequest_pre: The object is ready to make the following external request: 'IDataInitialize::GetDataSource'.
Diagnostic,ADV-SRV5,NT AUTHORITY\SYSTEM,Callista,{cb5d6fe3-3ea4-4453-8e5a-965818021df7},{3A1FB1E3-B76D-444D-876B-D1FBBB9BA246},6/06/2010 10:15:01 AM,6/06/2010 10:15:01 AM,0,0x,ExternalRequest_post: 'IDataInitialize::GetDataSource succeeded'. The external request has completed.
Diagnostic,ADV-SRV5,NT AUTHORITY\SYSTEM,Callista,{cb5d6fe3-3ea4-4453-8e5a-965818021df7},{3A1FB1E3-B76D-444D-876B-D1FBBB9BA246},6/06/2010 10:15:01 AM,6/06/2010 10:15:01 AM,0,0x,ExternalRequest_pre: The object is ready to make the following external request: 'IDBInitialize::Initialize'.

These messages show the entire log generated for the failed run, for a successful run the output is typically ~2500 lines.

I can see that the package is hanging during the initialize operation on the Callista connection (ORACLE database).

I have not been able to work out a way to either fix this issue or have the package die gracefully (an error to the log would be A-OK with me).

Any help or advice would be greatly appreciated.

A: 

We had/have an issue like that where a SSIS pkg randomly hangs. There has been talk on creating a package the monitors the SSIS packages every 10-20 mins or so until all the packages in the particular batch have been processed and kills any hanging pkgs and notfies the on call developer in the event that this happens. You may want to mimic a similar approach.

rfonn
Thanks rfonn, I was afraid this was the case.I have created a separate job that monitors the problematic jobs to see if they have reached their 'Max Execution Time'.
Adam MacLeod