In my program I fork child processes (which run in parallel) in a finite while loop and call exec in each of them. I want the parent process to resume execution (at the point after this while loop) only after all children have terminated. How should I do that?

I have tried several approaches. In one, the parent calls pause() after the while loop, and the SIGCHLD handler sets a flag only when waitpid fails with ECHILD (no children remaining). The problem with this approach is that retStat becomes -1 even before the parent has finished forking all the processes.

    void sigchld_handler(int signo) {
        pid_t pid;
        while((pid= waitpid(-1,NULL,WNOHANG)) > 0);
        if(errno == ECHILD) {
         retStat = -1;
        }
    }

    // parent process code
    retStat = 1;
    while(some condition) {
       do fork(and exec);
    }

    while(retStat > 0)
        pause();
    // This is the point where I want execution to resume, only once all children have finished
A: 

I think you should use the waitpid() call. It allows you to wait for "any child process", so if you do that the proper number of times, you should be golden.
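A minimal sketch of the "wait the proper number of times" idea, assuming the parent counts its successful forks. The helper name `spawn_and_wait_counted` is made up for illustration, and the children simply exit where a real program would call exec.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork n children, then call waitpid(-1, ...)
 * once per child that was actually started.
 * Returns the number of children reaped. */
int spawn_and_wait_counted(int n)
{
    assert(n >= 0);

    int started = 0;
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == 0) {
            _exit(0);       /* child: a real program would exec here */
        }
        if (pid > 0) {
            started++;      /* count only the forks that succeeded */
        }
    }

    int reaped = 0;
    while (reaped < started) {
        if (waitpid(-1, NULL, 0) > 0) {
            reaped++;       /* one child collected */
        } else if (errno != EINTR) {
            break;          /* e.g. ECHILD: nothing left to wait for */
        }
    }
    return reaped;
}
```

Because waitpid blocks with flags set to 0, the parent sleeps inside the loop instead of spinning.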

If that fails (I'm not sure about the guarantees), you could take the brute-force approach: sit in a loop, calling waitpid() with the WNOHANG option on each of your child PIDs, then delay for a while before doing it again.
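The brute-force variant could look roughly like this. It is only a sketch: `spawn_and_poll`, `MAX_CHILDREN`, and the 10 ms delay are arbitrary choices for illustration, and the children exit immediately instead of calling exec.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#define MAX_CHILDREN 16     /* arbitrary cap for this sketch */

/* Hypothetical helper: fork up to MAX_CHILDREN children, remember their
 * PIDs, and poll each one with WNOHANG until none are left.
 * Returns the number of children reaped. */
int spawn_and_poll(int n)
{
    pid_t pids[MAX_CHILDREN];
    int started = 0;

    if (n > MAX_CHILDREN) {
        n = MAX_CHILDREN;
    }
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == 0) {
            _exit(0);                   /* child: exec would go here */
        }
        if (pid > 0) {
            pids[started++] = pid;
        }
    }

    int remaining = started;
    int reaped = 0;
    struct timespec delay = { 0, 10 * 1000 * 1000 };    /* 10 ms */

    while (remaining > 0) {
        for (int i = 0; i < started; i++) {
            if (pids[i] == 0) {
                continue;               /* already handled */
            }
            pid_t r = waitpid(pids[i], NULL, WNOHANG);
            if (r == 0) {
                continue;               /* still running */
            }
            if (r > 0) {
                reaped++;               /* child has terminated */
            }
            pids[i] = 0;                /* reaped, or no longer waitable */
            remaining--;
        }
        if (remaining > 0) {
            nanosleep(&delay, NULL);    /* back off before polling again */
        }
    }
    return reaped;
}
```

Compared with a blocking wait, this wakes up periodically even when nothing has changed, so it is mainly useful when the parent has other work to do between polls.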

unwind
Please see the code; I have done exactly what you are saying.
avd
@aditya: Uh, no, my suggestion means to do as jbourque suggests; I didn't say anything about using a signal handler.
unwind
+4  A: 

Instead of calling waitpid in the signal handler, why not create a loop after you have forked all the processes as follows:

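    pid_t pid;
    /* Block until every child has been reaped. */
    while ((pid = waitpid(-1, NULL, 0)) != 0) {
        /* errno is only meaningful when waitpid actually failed */
        if (pid == -1 && errno == ECHILD) {
            break;  /* no children left */
        }
    }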

The program should hang in the loop until there are no more children. Then it will fall out and the program will continue. As an additional bonus, the loop will block on waitpid while children are running, so you don't need a busy loop while you wait.

You could also use wait(NULL), which is equivalent to waitpid(-1, NULL, 0). If there's nothing else you need to do in SIGCHLD, you can set it to SIG_IGN.
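Putting the fork/exec loop and the wait loop together, a minimal sketch might look like this. The function name `run_all_and_wait` is made up, and running `true` through execlp is just a stand-in for whatever program you actually exec.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start n children that each exec a program,
 * then block until all of them have terminated.
 * Returns the number of children reaped. */
int run_all_and_wait(int n)
{
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == 0) {
            /* Child: "true" is a placeholder program for this sketch. */
            execlp("true", "true", (char *)NULL);
            _exit(127);         /* reached only if exec failed */
        }
        /* pid == -1 means fork failed; this sketch simply moves on. */
    }

    int reaped = 0;
    for (;;) {
        pid_t pid = waitpid(-1, NULL, 0);
        if (pid > 0) {
            reaped++;           /* one child finished */
            continue;
        }
        if (errno == EINTR) {
            continue;           /* interrupted by a signal: retry */
        }
        break;                  /* ECHILD: no children remain */
    }

    /* Execution reaches this point only after all children are done. */
    return reaped;
}
```

No SIGCHLD handler is involved here; the parent just sleeps inside waitpid until there is nothing left to collect.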

jbourque
I need to pass the WNOHANG option: if many children deliver SIGCHLD at almost the same time, then, since signals are not queued in UNIX, using 0 would let me catch only one signal and zombies would be left around.
avd
Just to add a bit, the -1 will check against any child pid (could use WAIT_ANY for clarity also).
amischiefr
@aditya: The zombie processes are just waiting for you to call wait() (or waitpid()) on them. As soon as you do, they'll disappear, whether or not you've caught the signal. So the wait() loop after the fork() loop will mop up all the zombies.
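To illustrate that point, here is a rough sketch in which SIGCHLD is blocked while the children exit, so the signals coalesce, and a plain wait loop still collects every child. The names `count_sigchld`, `sigchld_seen`, and `reap_after_coalesced_signals` are invented for the example, and the one-second sleep is only there to let the children become zombies first.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t sigchld_seen = 0;

/* Handler that only counts deliveries; it does not reap anything. */
static void count_sigchld(int signo)
{
    (void)signo;
    sigchld_seen++;
}

/* Hypothetical demo: returns how many children the wait loop reaped. */
int reap_after_coalesced_signals(int n)
{
    struct sigaction sa, old_sa;
    sigset_t block, old_mask;

    sa.sa_handler = count_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    sigaction(SIGCHLD, &sa, &old_sa);

    /* Block SIGCHLD so the signals from all children pile up. */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old_mask);

    sigchld_seen = 0;
    for (int i = 0; i < n; i++) {
        if (fork() == 0) {
            _exit(0);           /* child exits immediately */
        }
    }

    sleep(1);                   /* give the children time to become zombies */
    sigprocmask(SIG_SETMASK, &old_mask, NULL);  /* pending SIGCHLD arrives */

    int reaped = 0;
    while (waitpid(-1, NULL, 0) > 0) {
        reaped++;               /* each call removes one zombie */
    }

    sigaction(SIGCHLD, &old_sa, NULL);          /* restore old handler */
    return reaped;
}
```

The handler typically runs fewer times than there are children, yet the loop still reaps all of them, because reaping depends on calling waitpid and not on how many signals were delivered.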
jbourque
Agreed -- if you are using `wait/waitpid` then you don't need to handle the `SIGCHLD` yourself.
mobrule