views: 511
answers: 3
I have a few servers and other daemons I need to start up in the right sequence.

I have created init.d scripts from the skeleton script, and can install them to start in proper sequence using the numbered naming system, but a few issues remain:

One server ('serverA') needs to initialize a database connection, and then listen on a socket. Another server ('serverB') then needs to connect to that socket, and the connection will fail if the prior process is not yet listening. Is there a way to prevent the init.d script for serverA from terminating until serverA has started listening? The serverB init will not start until the serverA init has terminated.

Right now, the setup works by having serverB just retry the connection until it succeeds, but that approach seems fragile. I would like a more deterministic way to force the sequencing.

+1  A: 

I don't think it is fragile -- at least I can think of scenarios where it won't be. Use a retry interval of 5 seconds and it's not bad at all. It's a KISS approach, and there aren't any corner cases you don't understand.

Getting a distributed environment synchronized is not for the faint of heart, and it's overkill in your example.

To give you some confidence in your approach, I can tell you that I have dozens of hand-written, complex server processes distributed over a web farm, and they have never given me any grief, even when database servers have disappeared or network trunks have gone down. They simply keep running in degraded mode until the databases come back.
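
For illustration, a minimal sketch of that approach as it might look in serverB's init script (the binary path and exit-code behaviour are assumptions, not something from the question): keep retrying every 5 seconds until serverB manages to connect.

# Hypothetical wrapper: assumes serverB exits non-zero when it cannot reach
# serverA's socket, and exits zero once the connection is established.
until /usr/bin/serverB; do
  sleep 5   # wait before the next attempt
done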

Hassan Syed
+1  A: 

If the server listens on a domain socket, you could build a loop that polls for the socket. There might be an easier way to do this in bash, but it could look something like:

# Poll for the socket file, checking once per second for up to 5 seconds.
for i in 1 2 3 4 5; do
  if [ -e '/var/run/myserver.sock' ]; then
    break
  fi
  sleep 1
done

Another solution is to have your server not daemonize until it has opened the listening socket. That way, the init script will pause until the process daemonizes, which guarantees the socket is available.

Of course, this depends on your application doing the daemonization itself, rather than being backgrounded by some other means ("/usr/bin/myserver &" or the like).
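
To make the difference concrete, here is a rough sketch of the two styles of "start" action (the binary path is hypothetical):

# Racy: the script backgrounds the process itself and returns immediately,
# possibly before the listening socket exists.
/usr/bin/myserver &

# Deterministic: assumes myserver forks into the background only after its
# listening socket is bound, so the script blocks exactly until the socket is ready.
/usr/bin/myserver

The second form is what makes the init script's completion a reliable signal for the scripts that follow it.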

Updated:

Also note that what you are doing now is all System V-style init. Ubuntu actually uses Upstart, which is an event-based system rather than a sequence of scripts. You could opt for using Upstart jobs rather than System V init scripts, and for firing a custom Upstart event from your server, which would trigger the launch of your second server.

The Getting Started guide has an example of this at the very bottom. I don't know if there's an API for it, but it could be as simple as a system("/bin/initctl emit myevent"); call at the right point in your first server. Someone else with more Upstart experience may be able to elaborate further.
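
As a rough sketch of that idea (the event name, job file path, and binary path are all made up for illustration): serverA emits an event once its socket is listening, and serverB's Upstart job declares that event as its start condition.

# Emitted from serverA once the listening socket is ready, e.g. via system():
/bin/initctl emit serverA-listening

# /etc/init/serverB.conf -- hypothetical Upstart job for the second server
start on serverA-listening
exec /usr/bin/serverB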

Shtééf
pduel
You can remove the socket before launching the server, so it has to recreate it. (Note that the loop needs a short sleep between checks, so it doesn't just spin.)
Shtééf
A: 

Yes, this is my own question that I am answering, but I found this technique useful and am sharing it for anybody else struggling with similar problems.

I have found socat to be very useful in waiting for a socket or port. An init.d script like:

case "$1" in
  start)
    # Block until the socket accepts a connection, retrying up to 10 times,
    # one second apart.
    echo '--benign phrase' | socat - UNIX-CONNECT:/path/to/socket,retry=10,intervall=1
    ;;
esac

will wait until the socket is accepting connections, then return. There is no daemonization involved, so it blocks execution of the higher-numbered init.d scripts until it finishes.
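
The same pattern should also work for waiting on a TCP port rather than a Unix socket; a hypothetical variant (host and port made up):

echo '--benign phrase' | socat - TCP:localhost:8080,retry=10,intervall=1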

Such waiter scripts slow the boot sequence and are thus non-optimal, but they are a big improvement over the fragile approach of sprinkling 'sleep n' statements throughout the scripts.

pduel