I use posix_spawnp to spawn child processes from my main process.

    int iRet = posix_spawnp(&iPID, zPath, NULL, NULL, argv, environ);

    if (iRet != 0)
    {
        return false;
    }

Sometimes, after a child process is spawned without errors, it suddenly becomes defunct. How could this occur?

I use a signal handler to reap child processes:

    void SigCatcher(int n)
    {
        while (waitpid(-1, NULL, WNOHANG) > 0);
    }

and I register it manually whenever I kill a child process:

    kill(oProcID, SIGKILL);

    signal(SIGCHLD, SigCatcher);

Could this cause spawned children to go defunct (without me calling kill)?

+2  A: 

This:

    kill(oProcID, SIGKILL);

    signal(SIGCHLD, SigCatcher);

looks like a race condition. You need to install the signal handler before killing the child process, otherwise you risk missing the signal.
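
For illustration, here is a minimal sketch of installing the reaping handler once, up front. It swaps in `sigaction` for `signal` (an assumption on my part, not something from the question), because `sigaction` behaves consistently across platforms. The names `reap_children` and `install_sigchld_reaper` are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Reap every child that has already exited, without blocking. */
static void reap_children(int signo)
{
    (void)signo;
    int saved_errno = errno;   /* waitpid may overwrite errno */
    while (waitpid(-1, NULL, WNOHANG) > 0) {
        /* keep looping: one SIGCHLD can stand for several exited children */
    }
    errno = saved_errno;
}

/* Hypothetical helper: call once, before the first child is spawned.
 * Returns 0 on success, -1 on failure (errno is set by sigaction). */
int install_sigchld_reaper(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = reap_children;
    sigemptyset(&sa.sa_mask);
    /* SA_RESTART: restart interrupted system calls where possible.
     * SA_NOCLDSTOP: no SIGCHLD when a child is merely stopped. */
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    return sigaction(SIGCHLD, &sa, NULL);
}
```

With something like this called at startup, a later `kill(oProcID, SIGKILL)` needs no `signal` call next to it, since the handler is already in place when the child dies.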

unwind
Thanks. I've implemented the signal handler in a shared library. It does not get registered if I put "signal (SIGCHLD, SigCatcher)" in the constructor. Does this have to go in main()?
Gayan
@Gayan: it is probably simplest to set up signal handlers as one of the first things the app does in main.
Evan Teran
The fact that it's in a constructor shouldn't matter, but it should be in main anyway (see my answer).
Alnitak
I need to set this up with the fewest changes to the "Main" application. The "Main" app does not have to handle SIGCHLD; the plugin that I'm working on does. Is there a way to register the signal handler purely through the plugin? (As I've said, putting the registration in the constructor didn't work.)
Gayan
There's nothing special about constructors. It should still work fine.
Alnitak
+1  A: 

Have you called:

    signal(SIGCHLD, SigCatcher);

anywhere else?

If you haven't, then you need to do so before any child processes are even spawned to ensure that those children are reaped when they terminate.

As Unwind points out, your current calls to kill and signal are the wrong way around.

Typical use would be:

    signal(SIGCHLD, handler);
    posix_spawnp(...);
    ...
    // do other stuff
    ...
    kill(pid, SIGKILL);
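
The ordering above can be sketched as a self-contained function. This is only a sketch under stated assumptions: `sleep 30` stands in for the real child program, the names `on_sigchld`, `reaped_count` and `spawn_kill_and_reap` are hypothetical, and `sigaction` is used in place of `signal` for predictable handler semantics.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <spawn.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>

extern char **environ;

/* Number of children reaped by the handler so far. */
static volatile sig_atomic_t reaped_count = 0;

static void on_sigchld(int signo)
{
    (void)signo;
    int saved_errno = errno;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped_count++;
    errno = saved_errno;
}

/* Returns true if the killed child was reaped by the handler. */
bool spawn_kill_and_reap(void)
{
    /* 1. Install the handler first, so no SIGCHLD can be missed. */
    struct sigaction sa;
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_NOCLDSTOP;
    if (sigaction(SIGCHLD, &sa, NULL) != 0)
        return false;

    /* 2. Spawn the child ("sleep 30" is a placeholder program). */
    char *argv[] = { "sleep", "30", NULL };
    pid_t pid;
    if (posix_spawnp(&pid, "sleep", NULL, NULL, argv, environ) != 0)
        return false;

    /* 3. Kill it; the already-installed handler will reap it. */
    if (kill(pid, SIGKILL) != 0)
        return false;

    /* 4. Wait up to about 2 seconds for the handler to run.
     *    nanosleep may return early when SIGCHLD arrives. */
    struct timespec tick = { 0, 10 * 1000 * 1000 };
    for (int i = 0; i < 200 && reaped_count == 0; i++)
        nanosleep(&tick, NULL);

    return reaped_count > 0;
}
```

If the handler were installed only after `kill`, the child could exit before the handler exists, the default SIGCHLD action (ignore) would apply, and the child would stay defunct until something else calls `waitpid`.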
Alnitak