I wrote a program that needs to run continuously, but since I'm a bad programmer it crashes every so often. Is there a way to have another program watch it and restart it when it crashes?
I was going to give some pseudocode for a process monitor, but I guess someone has already done it: http://www.linuxjunkies.org/html/Process-Monitor-HOWTO.html
Not to be facetious, but if you're a bad programmer, what's to say your watching program won't fail too? ;) You should also work on getting better so that you don't have this issue in the first place. That said, you will probably need the following answer eventually.
However, if getting better isn't possible, just run a cron job at regular intervals that looks for the name of your program in the output of 'ps' and starts it again if it is missing. You can get that answer from superuser.com.
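For what it's worth, here is a rough sketch of the check such a cron job could run. It assumes a Linux-style `ps` where `ps -eo comm=` prints one bare command name per line (note that Linux truncates that name to 15 characters). The names `is_running` and `check_and_restart` are made up for this example.

```python
import subprocess

def is_running(name, ps_output):
    """Return True if some line of `ps -eo comm=` output is exactly `name`."""
    return any(line.strip() == name for line in ps_output.splitlines())

def check_and_restart(name, command):
    """Start `command` (an argv list) if no process called `name` is running.

    Returns True if the program had to be started, False if it was already up.
    """
    ps_output = subprocess.run(
        ["ps", "-eo", "comm="],  # assumption: ps supports -e and -o comm=
        capture_output=True, text=True, check=True,
    ).stdout
    if is_running(name, ps_output):
        return False
    # Detach from the cron job's session so the program outlives this script.
    subprocess.Popen(command, start_new_session=True)
    return True
```

You would call something like `check_and_restart("myprog", ["/usr/local/bin/myprog"])` (hypothetical name and path) from a small script that cron runs every minute. Matching on the exact name avoids false hits on, say, an editor that has the program's source file open.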
No need for third-party programs.
All of this can be accomplished with the Linux inittab.
Look for "respawn".
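As a rough illustration, a line in /etc/inittab has the form `id:runlevels:action:process`. The id `mp` and the program path below are made up, and this only applies on systems whose init actually reads /etc/inittab (SysV-style init).

```
# id:runlevels:action:process
# Restart /usr/local/bin/myprog whenever it exits, in runlevels 2 through 5.
mp:2345:respawn:/usr/local/bin/myprog
```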
Since Stack Overflow is a programming site, let me give you an overview of how such a watcher would be implemented.
The first thing to know is that your watcher will have to start the watched program itself. You do this with fork and exec.
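A minimal sketch of that step, written in Python because its `os` module exposes the same calls under the same names. The helper name `spawn` and the exit code 127 for a failed exec are choices made for this example, not part of any API.

```python
import os
import sys

def spawn(argv):
    """Fork, then exec `argv` in the child. Returns the child's pid to the parent."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the watched program.
        try:
            os.execvp(argv[0], argv)
        except OSError as err:
            # execvp only comes back (by raising) if it failed.
            print(f"exec failed: {err}", file=sys.stderr)
            os._exit(127)
    # Parent: the child is now running; keep its pid so we can wait on it.
    return pid
```

The parent gets the pid back immediately, which is what lets it wait on that specific child afterwards.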
What you can then do is wait for the program to exit. You can use one of the wait system calls (i.e. wait, waitpid or wait4) depending on your specific needs. You can also catch SIGCHLD so that you get informed asynchronously when your child exits (you will then need to call wait to get its status).
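Here is a sketch of the asynchronous variant, again in Python. The `exited` dictionary and the function names are made up for this example. The handler calls `waitpid` with `WNOHANG` in a loop because several children can exit before the handler gets to run.

```python
import os
import signal
import time

exited = {}  # pid -> raw wait status, filled in by the handler

def on_sigchld(signum, frame):
    """Reap every child that has exited so far, without blocking."""
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            return  # no children left at all
        if pid == 0:
            return  # children exist, but none has exited yet
        exited[pid] = status

def install_handler():
    """Ask the kernel to notify us with SIGCHLD when a child exits."""
    signal.signal(signal.SIGCHLD, on_sigchld)
```

After `install_handler()` the watcher can carry on with other work and look in `exited` whenever it wants to know whether the child is gone.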
Now that you have the status, you can tell if the process died due to a signal with the macro WIFSIGNALED. If that macro returns true, then your program crashed and needs to be restarted.
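Putting the pieces together, a watcher loop could look roughly like this. It is only a sketch: `run_once`, `crashed` and `supervise` are names invented here, and the `max_restarts` and `delay` parameters are assumptions added so that a program which crashes immediately doesn't get restarted in a tight loop forever.

```python
import os
import sys
import time

def run_once(argv):
    """Start `argv` as a child, block until it exits, return the raw wait status."""
    pid = os.fork()
    if pid == 0:
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed
    _, status = os.waitpid(pid, 0)
    return status

def crashed(status):
    """True if the child was killed by a signal (SIGSEGV, SIGABRT, ...)."""
    return os.WIFSIGNALED(status)

def supervise(argv, max_restarts=5, delay=1.0):
    """Restart `argv` for as long as it keeps dying from a signal.

    Returns the final wait status: either a normal exit, or the last crash
    once `max_restarts` restarts have been used up.
    """
    restarts = 0
    while True:
        status = run_once(argv)
        if not crashed(status) or restarts >= max_restarts:
            return status
        print(f"killed by signal {os.WTERMSIG(status)}; restarting", file=sys.stderr)
        restarts += 1
        time.sleep(delay)
```

Note that WIFSIGNALED only covers deaths by signal. If your program "crashes" by exiting with a non-zero code (an uncaught exception in many runtimes does this), you would probably also want to check WIFEXITED and WEXITSTATUS.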
It still won't run continuously just because you have another task monitoring it: there will be a short amount of downtime each time it restarts.
Additionally, if you are acting as a network (or local) server process, you'll lose any state about requests in progress. I hope this is OK (of course, your clients may have built-in timeout and retry).
Finally, if your process crashed while it was in the middle of storing any persistent data, I hope it has a mechanism for coping with half-written files, etc.
However, if you intend it to be robust, all of these things should be true anyway, so you can use something like supervisord safely.