I have a badly behaving process (launched via a user command) which keeps dying at erratic intervals, and I need it to stay alive until I manually kill it. Here is my straightforward, but probably stupid, solution:

#!/bin/bash

if [ -z "$1" ]
then
 echo "Usage: $0 <process name>"
 exit 1
fi

# Start of the 'polling' loop

while true
do

 if pgrep "$1" > /dev/null
 then
  echo "Already running"
 else
  # If the process has died or not started, start it
  "$1"
  # FIXME: no error checking; this script will not catch an
  # unavailable command
 fi

 # Pause between checks so the loop does not spin the CPU
 sleep 1

done

# End of the polling loop

The gist is: if the above process is running, then do 'nothing', else launch it. An obvious disadvantage is that it keeps 'polling'. However, it serves my purpose.

As I write this, I think I could do some signal handling for the process, so that once it gets the kill signal, I can restart it. What do you think?

+4  A: 

You can put it in /etc/inittab and init(8) will restart it for you automatically.

You can define in which runlevels the process runs, so it does not have to start at boot. You can also use a program like sed(1) to edit inittab programmatically and add a line to it, and then tell init(8) to reload the config file (and start your program) with: init q
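
As a rough sketch of the inittab approach, an entry has the form id:runlevels:action:process, and the respawn action asks init(8) to restart the process whenever it exits. The helper below appends such a line to a file you name. The function name add_respawn_entry, the id "wv", the runlevels 2345 and the path /usr/bin/wvdial are all assumptions for illustration, and this only applies to systems that still use a SysV-style /etc/inittab.

```shell
#!/usr/bin/env bash

# Hypothetical helper: append a respawn entry to an inittab-style file.
# Arguments: file, id (1-4 characters), runlevels, command line.
add_respawn_entry() {
    local file=$1 id=$2 levels=$3 cmd=$4

    # Do nothing if an entry with this id is already present.
    if grep -q "^${id}:" "$file" 2>/dev/null; then
        return 0
    fi

    # inittab format: id:runlevels:action:process
    printf '%s:%s:respawn:%s\n' "$id" "$levels" "$cmd" >> "$file"
}
```

Try it on a scratch copy first, for example add_respawn_entry ./inittab.test wv 2345 /usr/bin/wvdial. Once the line is in the real /etc/inittab (as root), init q makes init(8) re-read the file.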

Milan Babuškov
I don't necessarily want to start the process at boot time. In that case, I think your suggestion won't help. Please correct me if I am wrong.
Amit
I updated my answer to cover this issue as well.
Milan Babuškov
+2  A: 

Have you considered djb's supervise program? It does exactly this: run a program, restart it if it exits, provide a means to control it, etc.
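
For what it's worth, supervise (part of daemontools) expects a service directory containing an executable script named run, and it reruns that script whenever it exits. A minimal sketch of setting one up follows. The helper name make_service_dir and the directory layout are assumptions, and it only handles simple command lines without spaces or quotes in the arguments.

```shell
#!/usr/bin/env bash

# Hypothetical helper: create a daemontools-style service directory.
# Arguments: directory, then the command (and simple arguments) to run.
make_service_dir() {
    local dir=$1
    shift
    mkdir -p "$dir"

    # The run script uses exec so that supervise monitors the program
    # itself rather than an intermediate shell.
    printf '#!/bin/sh\nexec %s\n' "$*" > "$dir/run"
    chmod +x "$dir/run"
}
```

After make_service_dir ./wvdial-service wvdial, running supervise ./wvdial-service keeps the program alive. From another shell, svc -d ./wvdial-service stops it, svc -u brings it back up, and svstat reports its state.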

Jeffrey Hantin
+1  A: 

Instead of curing the symptom you should try to fix the problem. By that I mean find out why the program "is dying" (crashes) and fix it if possible (most Linux programs are open source and enable you to do exactly that).

To find the reason the program (wvdial) is failing you can do this:

Use ulimit -c unlimited in the shell where you start wvdial, so that if it crashes it will generate a core file, then debug it with gdb /path/to/wvdial /path/to/core

You may need to install the debug information for the apps/libs with your package manager first if they are not already installed.
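
Put together, the core-dump steps might look like the sketch below. The wrapper name run_with_core is made up for illustration. It raises the core size limit inside a subshell so your interactive shell's limits are left alone. Where the core file ends up (and what it is called) depends on your system's configuration, so the gdb line is left as a comment.

```shell
#!/usr/bin/env bash

# Hypothetical wrapper: run a command with core dumps enabled.
run_with_core() {
    (
        # Raising the soft limit can fail if the hard limit is lower.
        ulimit -c unlimited 2>/dev/null ||
            echo "warning: could not raise the core size limit" >&2
        "$@"
    )
}

# Usage (not run here):
#   run_with_core wvdial
# After a crash, load the program together with the core file:
#   gdb /path/to/wvdial /path/to/core
```

The wrapper returns the command's own exit status, so it can be combined with other tools that inspect it.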

If you can't (or won't) do that then you may be able to use monit to automatically restart your process. Here's a blog that shows how to use monit for a web server.

Monit is a free open source utility for managing and monitoring processes, files, directories and filesystems on a UNIX system. Monit conducts automatic maintenance and repair and can execute meaningful causal actions in error situations.

lothar
Oh no. Monit seems to me overkill for something like a badly behaving 'wvdial'!
Amit
I will try out your other suggestion.
Amit
+1  A: 

If you want to respawn a process using a bash script, don't make the mistake of relying on broken tools such as pgrep. Moreover, your bash code suffers badly from wordsplitting and unexpected pathname expansion bugs.

Do this:

#!/usr/bin/env bash

until "$@"; do
    echo "$1 exited with exit code: $?.  Respawning .."
    sleep 1
done

The sleep is there to avoid processes that die instantly from causing an infinite loop that'll suck your CPU dry.

Also notice the use of "$@".

The until keyword will keep restarting your process until it exits cleanly (with an exit code of 0), which means it exited without bugging out (probably because you asked it to stop, e.g., when rebooting the system).

Assuming it's called 'respawn' and is in PATH, use it like so:

respawn mycommand --foo=bar
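
To see the until behaviour without a real misbehaving program, here is a small sketch that wraps the same loop in a function and feeds it a stand-in command. The flaky function and its counter file are made up for illustration: it fails on its first two runs and exits cleanly on the third.

```shell
#!/usr/bin/env bash

# The same loop as in the script, wrapped in a function for experimenting.
respawn() {
    until "$@"; do
        echo "$1 exited with exit code: $?.  Respawning .." >&2
        sleep 1
    done
}

# Hypothetical stand-in for a flaky program: counts its runs in a file
# and only exits with status 0 once it has been run three times.
flaky() {
    local counter_file=$1 n
    n=$(cat "$counter_file" 2>/dev/null)
    n=$(( ${n:-0} + 1 ))
    echo "$n" > "$counter_file"
    [ "$n" -ge 3 ]
}
```

Calling respawn flaky /tmp/flaky.count with a fresh counter file should respawn twice and then return, since the third run exits with 0.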
lhunath