This post describes how to keep a child process alive in a Bash script:

http://stackoverflow.com/questions/696839/how-do-i-write-a-bash-script-to-restart-a-process-if-it-dies

This worked well for calling another Bash script.

However, I tried something similar where the child process is a Python script, daemon.py, which forks a child process that runs in the background:

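#!/bin/bash

PYTHON=/usr/bin/python2.6

# Launch the daemon; the exit status is that of the "start" command.
myprocess() {
    "$PYTHON" daemon.py start
}

until myprocess; do
    # Take the timestamp when the crash is noticed, not at script start.
    NOW=$(date +"%b-%d-%y")
    echo "$NOW Prog crashed. Restarting..." >> error.txt
    sleep 1
done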

Now the behaviour is completely different. The Python script no longer seems to be a child of the Bash script; instead it seems to have 'taken over' the Bash script's PID, so there is no longer a Bash wrapper around the called script. Why?

+5  A: 

A daemon process double-forks as the key step of daemonizing itself, so the PID that the parent process holds is of no value: the process it refers to exits very soon after it starts.
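
To make the double fork concrete, here is a minimal sketch in Python 3 (the question uses Python 2.6). The function name `spawn_daemon` and the pipe used to report the daemon's PID are assumptions made for illustration; nothing here is taken from the asker's daemon.py. It shows that the process the launcher started exits almost immediately, while the real work continues under a different PID.

```python
import os


def spawn_daemon(work):
    """Double-fork and run work() in the detached grandchild.

    Returns (first_child_pid, daemon_pid, first_child_status).
    The pipe exists only so this sketch can report the daemon's PID.
    """
    read_fd, write_fd = os.pipe()
    first = os.fork()
    if first == 0:
        # First child: start a new session, then fork again.
        os.close(read_fd)
        os.setsid()
        second = os.fork()
        if second == 0:
            # Grandchild: this is the actual daemon.
            os.close(write_fd)
            work()
            os._exit(0)
        # First child: report the daemon's PID and exit right away.
        os.write(write_fd, str(second).encode())
        os._exit(0)
    # Original process: the PID it knows about is `first`, not the daemon's.
    os.close(write_fd)
    with os.fdopen(read_fd) as pipe:
        daemon_pid = int(pipe.read())
    _, status = os.waitpid(first, 0)
    return first, daemon_pid, status
```

From the launcher's point of view the command it started has finished, typically with status 0. In a shell loop such as `until myprocess; do ... done`, a zero exit status counts as success, so the loop ends even though the daemon is still running.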

Therefore, a daemon process should write its PID to a file in a "well-known location" that the parent process knows, by convention, to read from. With this (traditional) approach, a parent process that wants to act as a restarting watchdog can simply read the daemon's PID from that location, periodically check whether the daemon is still alive, and restart it when needed.
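
The watchdog side of that approach could be sketched as follows in Python 3. The helper names (`read_pid`, `is_alive`, `check_once`) and the `restart` callback are hypothetical, and the PID file path is whatever the daemon and the watchdog agree on; this is one way to do the check, not the only one.

```python
import os


def read_pid(pidfile):
    """Return the PID stored in pidfile, or None if it is missing or malformed."""
    try:
        with open(pidfile) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


def is_alive(pid):
    """Probe for a process with this PID using signal 0 (nothing is delivered)."""
    if pid is None or pid <= 0:
        return False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def check_once(pidfile, restart):
    """Call restart() if the recorded PID is gone. Returns True if restarted."""
    if is_alive(read_pid(pidfile)):
        return False
    restart()
    return True
```

A watchdog would call `check_once` in a loop with a sleep between checks. Note that PIDs can be reused by the operating system, so a successful signal-0 probe only shows that some process has that PID, not necessarily the daemon.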

It takes some care in execution, of course: a "stale" PID will stay in the "well-known location" file for a while, and the parent must take that into account. There are also possible variants: the daemon could emit a "heartbeat" (via UDP broadcast or the like) so that the parent can detect not just dead daemons but also ones that are "stuck forever", e.g. due to a deadlock, since they stop sending their heartbeat. But that's the general idea.
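
The heartbeat variant might look roughly like this in Python 3. To keep the sketch simple it sends a plain UDP datagram to the loopback address rather than a broadcast, and the function names, payload, and timeout handling are all assumptions for illustration.

```python
import socket


def open_listener(host="127.0.0.1", port=0):
    """Watchdog side: bind a UDP socket; port 0 lets the OS pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def send_heartbeat(address, payload=b"alive"):
    """Daemon side: send one datagram; meant to be called periodically."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, address)


def heartbeat_seen(listener, timeout):
    """Return True if a heartbeat arrives within `timeout` seconds."""
    listener.settimeout(timeout)
    try:
        listener.recvfrom(64)
        return True
    except socket.timeout:
        return False
```

The daemon would call `send_heartbeat` from its main loop, so a deadlocked daemon stops sending even though its PID still exists. The watchdog treats a missed heartbeat (or several in a row, to tolerate lost datagrams) as a reason to restart.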

Alex Martelli
A: 

The behaviour is completely different because here your daemon.py is launched in the background as a daemon.

In the question you linked to, the process being watched is not a daemon; it does not go into the background. The launcher simply waits until the child process stops.

There are several ways to overcome this. The classic one is the way @Alex explains, using a PID file in a conventional location.

Another way would be to build the watchdog into the daemon itself and daemonize the watchdog. This would emulate a correct process that does not break at random (something that shouldn't occur anyway).
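
A watchdog built into the daemon could be sketched like this in Python 3: the daemonized process calls a supervisor that runs the real work in a child and restarts it after an abnormal exit. The name `supervise` and the `max_restarts` and `delay` parameters are hypothetical, chosen only for this example.

```python
import os
import time


def supervise(worker, max_restarts=3, delay=1.0):
    """Run worker() in a child process, restarting it after abnormal exits.

    Returns the number of restarts performed.
    """
    restarts = 0
    while True:
        pid = os.fork()
        if pid == 0:
            # Child: run the real work; an uncaught exception counts as a crash.
            try:
                worker()
            except BaseException:
                os._exit(1)
            os._exit(0)
        # Parent (the watchdog): it is the worker's real parent, so waitpid works.
        _, status = os.waitpid(pid, 0)
        if status == 0 or restarts >= max_restarts:
            return restarts
        restarts += 1
        time.sleep(delay)
```

Because the watchdog forks the worker directly, it can wait on the child and needs no PID file. The cap on restarts is a design choice to avoid a tight crash loop; a long-running service might instead back off and keep trying.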

kriss
+2  A: 

You should look at Python Enhancement Proposal (PEP) 3143. In it, Ben Finney proposes adding a daemon library to the Python standard library. It covers a lot of very good information about daemons and is a pretty easy read. The reference implementation is the python-daemon package.

Paul Hildebrandt
Thanks for the reference Paul. Looks useful.
norm