I have some daemons that use PID files to prevent parallel execution of my program. I have set up a signal handler to trap SIGTERM and do the necessary clean-up, including removing the PID file. This works great when I test using "kill -s SIGTERM #PID". However, when I reboot the server, the PID files are still hanging around, preventing start-up of the daemons. It is my understanding that SIGTERM is sent to all processes when a server is shutting down. Should I be trapping another signal (SIGINT, SIGQUIT?) in my daemon?

+3  A: 

Not a direct solution, but it might be a good idea to check at startup whether a process with the PID from the PID file is actually running and, if none exists, to clean up the stale file.

It's possible that your process is getting a SIGKILL before it has a chance to clean up the PID file.
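
The question does not say which language the daemons are written in, so the following is only a minimal sketch in Python of the startup check described above. The function names `pid_is_running` and `remove_if_stale` are hypothetical, and the sketch assumes the PID file contains just the PID as decimal text.

```python
import os


def pid_is_running(pid):
    """Return True if a process with this PID currently exists."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only performs error checking; nothing is delivered
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def remove_if_stale(pidfile):
    """Delete pidfile if it names no live process.

    Returns True if the file is absent or was removed, False if the
    recorded PID still belongs to a running process.
    """
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except FileNotFoundError:
        return True
    except ValueError:
        pid = -1  # assumption: unreadable contents are treated as stale
    if pid_is_running(pid):
        return False
    os.remove(pidfile)
    return True
```

A daemon would call `remove_if_stale(...)` before writing its own PID file and refuse to start if it returns False. This check alone cannot tell whether the live process is really the daemon or an unrelated process that happens to have the same PID.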

Noufal Ibrahim
+1: When the power fails, no process will be sent any signals, and you'll have the same kind of "PID file with no matching process". A better test than simply looking for a PID file is clearly indicated.
S.Lott
If the pid is used by another process, your daemon will think it is already running, so this is not guaranteed.
stinkypyper
stinkypyper: You can always check the name of the process that has the PID and double-check. That's a trivial obstacle.
Noufal Ibrahim
+2  A: 

Remember that, after sending SIGTERM to all processes, the shutdown script waits some time (usually about 2 or 3 seconds) and then sends SIGKILL. You can find that in /etc/rc.d/rc0.d/S01halt or similar (might vary depending on your distribution).

For example, on my Fedora 11 system the script contains:

action $"Sending all processes the TERM signal..." /sbin/killall5 -15
sleep 2
action $"Sending all processes the KILL signal..."  /sbin/killall5 -9

So if you are not fast enough, either increase the delay, or make sure you are faster!
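
One way to be "faster" is to keep the SIGTERM handler down to the bare minimum: remove the PID file and exit, leaving any slow clean-up out of the handler. This is a rough sketch in Python, assuming a Python daemon; the path in `PIDFILE` and the function names are hypothetical.

```python
import os
import signal
import sys

PIDFILE = "/tmp/mydaemon.pid"  # hypothetical path, for illustration only


def remove_pidfile(path=PIDFILE):
    """Remove the PID file; a missing file is not an error."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass


def handle_sigterm(signum, frame):
    """Do only the essential clean-up, then exit immediately."""
    remove_pidfile()
    sys.exit(0)


def install_handler():
    """Register the handler; call this once from the main thread at startup."""
    signal.signal(signal.SIGTERM, handle_sigterm)
```

Even a fast handler is no help if the process is blocked when SIGKILL arrives, or if the machine loses power, so this complements a stale-file check rather than replacing it.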

PierreBdR
+2  A: 

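Use flock (or lockf) on your pidfile. If the lock succeeds, no other instance is running, so you can rewrite the pidfile and continue. The lock is released automatically when the process exits, however it dies, so a stale file left over from a reboot no longer blocks start-up.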

This SO answer has a good example on how this is done.
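
The linked example is not reproduced here. As a rough illustration of the flock idea, here is a minimal sketch in Python using the standard `fcntl` module (Unix only); the function name `acquire_pidfile` is hypothetical.

```python
import fcntl
import os


def acquire_pidfile(path):
    """Try to take an exclusive, non-blocking lock on the PID file.

    Returns the open file object on success, or None if another
    process already holds the lock.
    """
    # "a+" creates the file if needed without truncating it before we own the lock
    f = open(path, "a+")
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        f.close()
        return None
    # We hold the lock, so any old contents are stale: replace them with our PID
    f.seek(0)
    f.truncate()
    f.write(f"{os.getpid()}\n")
    f.flush()
    return f  # keep this object open; closing it releases the lock
```

The daemon would keep the returned file object for its whole lifetime and exit at startup if the function returns None. With this approach the existence of the file no longer matters, only whether it is locked.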

Hasturkun