We're using a job scheduling system that runs on top of DBMS_JOB. It uses a master job to create one-time jobs. We deploy the same set of jobs to all our clients, but can specify which jobs should only run at certain clients.
We get occasional problems with a process run by a job hanging. The main cause of this is UTL_TCP not timing out when it does not get an expected response. I want to be able to kill those jobs so that they can run again.
I'm looking at creating a new job that kills any of these one-time jobs that have been running for longer than a certain time.
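As a rough sketch of how I'd find the candidates (not tested against our system): join DBA_JOBS_RUNNING to V$SESSION and filter on how long ago the job started. The bind variable `:max_minutes` is a placeholder of mine, the filter that restricts this to our one-time jobs is not shown, and the query assumes the owner of the killer job has been granted SELECT on both views.

```sql
-- Sketch only: list DBMS_JOB jobs that have been running longer than a threshold.
-- :max_minutes is a hypothetical bind variable (threshold in minutes).
SELECT jr.job,
       jr.sid,
       s.serial#,
       jr.this_date AS started_at   -- THIS_DATE = when the current run started
FROM   dba_jobs_running jr
       JOIN v$session s ON s.sid = jr.sid
WHERE  jr.this_date < SYSDATE - (:max_minutes / 1440);

-- Each row returned would then be killed with something like:
--   ALTER SYSTEM KILL SESSION '<sid>,<serial#>' IMMEDIATE;
-- which needs the ALTER SYSTEM privilege (or a DBA-owned wrapper procedure).
```

Since V$SESSION only shows sessions on the instance the query runs on, this would only pick up jobs running locally.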
We're stuck with Oracle 10g for a while yet, so I'm limited to what that can do.
There's an article that seems to cover most of this at
http://it.toolbox.com/blogs/database-solutions/killing-the-oracle-dbms_job-6498
I have a feeling that this is not going to cover all eventualities, including:
- We can run jobs as several different users, and a user can only break/remove jobs they created. I believe that I may be able to use DBMS_IJOB to get around that, but I need to get the DBA to let me execute it.
- We have Oracle RAC systems. I understand 10g limits ALTER SYSTEM KILL SESSION to killing sessions on the current instance. I could arrange for all jobs to run on the same instance, but I've not tried that yet.
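For the RAC point, what I have in mind is having the master job pass the `instance` parameter to DBMS_JOB.SUBMIT when it creates each one-time job, so the job's session always ends up on the instance where the killer job runs. A minimal sketch, where the instance number 1 and the `'NULL;'` job body are placeholders rather than our real values:

```sql
-- Sketch only: submit a one-time job pinned to a single RAC instance.
DECLARE
  l_job BINARY_INTEGER;
BEGIN
  DBMS_JOB.SUBMIT(
    job       => l_job,
    what      => 'NULL;',   -- placeholder; the real call to the process goes here
    next_date => SYSDATE,   -- run as soon as possible
    interval  => 'null',    -- no repeat interval, so the job runs once
    instance  => 1);        -- placeholder instance number
  COMMIT;                   -- the job is not visible to the job queue until commit
END;
/
```

Jobs that already exist could presumably be moved with DBMS_JOB.INSTANCE instead, but that runs into the same ownership restriction as break/remove.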
Anything else I should consider? Stack Overflow needs a definitive answer on this.