As part of a Linux benchmark application, I have a parent process that forks multiple children, each of which performs a task in parallel. I'm using signals to coordinate between them because I want the timing to be as accurate as possible. Each child prepares for the test, then enters a 'barrier' controlled by the parent via signals.

Once all the children have entered the barrier, the parent records the timestamp and signals the child processes to begin. As soon as a child finishes a portion of the test, it signals the parent before entering the next barrier. The parent listens for these signals and, once it has received one from every child process, records the completion time(s).

My problem is that the program terminates non-deterministically; the signals don't always get delivered. The signal handler could not be any simpler:

void sig_child_ready (int sig)
{
    num_child_ready++;
}

num_child_ready is declared as volatile sig_atomic_t. I've tried using sigprocmask without success in a loop like this:

sigprocmask (SIG_BLOCK, &mask, &oldmask);
while (num_child_ready < num_child)
{
    /* waiting for child signals here */
    sigsuspend (&oldmask);
}
sigprocmask (SIG_UNBLOCK, &mask, NULL);

I'm not sure how to proceed from here. Am I correct that sigprocmask is needed to 'queue' the signals so they are processed one by one?

Or, consider this hypothetical scenario: the parent receives a signal, is executing its handler, then receives ANOTHER identical signal. Is the signal handler called recursively? I.e., will it execute the second handler before returning to, and completing, the first handler?

I'm just looking to make sure all of the signals are delivered as synchronously as possible.

+2  A: 

Normal signals are not queued, and that's likely the cause of your problem.

If a signal arrives while an earlier instance of the same signal is still pending (that is, before the handler has run for it), the two get merged into one and there's little you can do about that. You're probably better off using some other form of IPC for this kind of synchronization.
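As one possible "other form of IPC", here is a minimal sketch that uses a pipe as the ready counter: each child writes one byte when it reaches the barrier and the parent reads until it has seen one byte per child. Bytes in a pipe are never merged, so no report is lost. The function name count_ready_via_pipe and the token value are made up for illustration, and the children here do no real work.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork num_child children, have each one report
 * "ready" by writing a single byte to a shared pipe, and return how many
 * reports the parent read (or -1 if setup failed). */
int count_ready_via_pipe(int num_child)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    for (int i = 0; i < num_child; i++) {
        pid_t pid = fork();
        if (pid == -1)
            return -1;
        if (pid == 0) {
            /* Child: (test preparation would go here) then report ready. */
            char token = 'R';
            close(fds[0]);
            ssize_t n = write(fds[1], &token, 1);
            _exit(n == 1 ? 0 : 1);
        }
    }

    /* Parent keeps only the read end, so read() returns 0 once every
     * child has exited and closed its copy of the write end. */
    close(fds[1]);

    int ready = 0;
    while (ready < num_child) {
        char token;
        ssize_t n = read(fds[0], &token, 1);
        if (n == 1)
            ready++;                 /* one byte == one ready child */
        else if (n == 0)
            break;                   /* all writers gone: stop early */
        else if (errno != EINTR)
            break;                   /* real read error */
    }

    close(fds[0]);
    while (wait(NULL) > 0) {
        /* reap every child */
    }
    return ready;
}
```

Calling count_ready_via_pipe(num_child) from the parent and comparing the result with num_child tells you whether every child checked in. A blocking read() adds a little latency compared with a signal, so whether this suits a timing-sensitive benchmark is something to measure.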

You could use "realtime signals", which do get queued. You'd send the signals with sigqueue() and "receive" them either with sigwaitinfo() or by installing a signal handler with the SA_SIGINFO flag set in a struct sigaction.
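A rough sketch of the realtime-signal approach, assuming Linux: the parent blocks a realtime signal before forking, each child reports in with sigqueue(), and the parent collects exactly one sigwaitinfo() return per child. SIG_READY and count_ready_children are names invented for this example, and picking SIGRTMIN is an arbitrary choice.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Assumption: any signal in the range SIGRTMIN..SIGRTMAX would do. */
#define SIG_READY (SIGRTMIN)

/* Hypothetical helper: fork num_child children that each queue one
 * SIG_READY to the parent; return how many were received, or -1 on error. */
int count_ready_children(int num_child)
{
    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, SIG_READY);

    /* Block the signal BEFORE forking so that early reports stay queued
     * instead of being delivered (or lost) before we start waiting. */
    if (sigprocmask(SIG_BLOCK, &mask, NULL) == -1)
        return -1;

    for (int i = 0; i < num_child; i++) {
        pid_t pid = fork();
        if (pid == -1)
            return -1;
        if (pid == 0) {
            /* Child: (test preparation would go here) then report ready. */
            union sigval value;
            value.sival_int = i;     /* payload: which child is reporting */
            int rc = sigqueue(getppid(), SIG_READY, value);
            _exit(rc == 0 ? 0 : 1);
        }
    }

    /* Each queued realtime signal produces one sigwaitinfo() return,
     * so the count is not affected by signals arriving back to back. */
    int ready = 0;
    while (ready < num_child) {
        siginfo_t info;
        if (sigwaitinfo(&mask, &info) == -1) {
            if (errno == EINTR)
                continue;            /* interrupted by another signal */
            return -1;
        }
        ready++;                     /* info.si_pid identifies the sender */
    }

    while (wait(NULL) > 0) {
        /* reap every child */
    }
    return ready;
}
```

The number of realtime signals that can be queued is limited per user (RLIMIT_SIGPENDING on Linux), which should not matter for a handful of children but is worth knowing about.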

nos
Finally got it working... I had another issue with sending to multiple children with sigqueue(). I had assumed that I could use it to send to all children, as I could with kill: kill (0, SIGUSR1). It seems this cannot be done with sigqueue()! This page was quite useful: http://davmac.org/davpage/linux/rtsignals.html
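Since sigqueue() targets a single PID, one way around this is to keep the PIDs returned by fork() and loop over them. The following is only a sketch of that idea: SIG_START, MAX_CHILDREN, release_children and the payload value 42 are all made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define SIG_START (SIGRTMIN + 1)   /* assumption: arbitrary realtime signal */
#define MAX_CHILDREN 64            /* assumption: arbitrary upper bound */

/* Hypothetical helper: fork num_child children that wait at a barrier,
 * release each one with its own sigqueue() call, and return how many
 * children confirmed the release (or -1 on error). */
int release_children(int num_child)
{
    pid_t pids[MAX_CHILDREN];
    sigset_t mask;

    if (num_child < 1 || num_child > MAX_CHILDREN)
        return -1;

    sigemptyset(&mask);
    sigaddset(&mask, SIG_START);
    /* Blocked before fork(): the children inherit the mask, so a start
     * signal sent before a child reaches sigwaitinfo() stays pending. */
    if (sigprocmask(SIG_BLOCK, &mask, NULL) == -1)
        return -1;

    for (int i = 0; i < num_child; i++) {
        pids[i] = fork();
        if (pids[i] == -1)
            return -1;
        if (pids[i] == 0) {
            /* Child: wait at the barrier until the parent releases us. */
            siginfo_t info;
            while (sigwaitinfo(&mask, &info) == -1) {
                if (errno != EINTR)
                    _exit(2);
            }
            /* (the timed portion of the test would run here) */
            _exit(info.si_value.sival_int == 42 ? 0 : 1);
        }
    }

    /* No process-group form exists for sigqueue(), so signal each PID. */
    union sigval value;
    value.sival_int = 42;
    for (int i = 0; i < num_child; i++) {
        if (sigqueue(pids[i], SIG_START, value) == -1)
            return -1;
    }

    /* Count the children that exited cleanly after seeing the payload. */
    int released = 0;
    for (int i = 0; i < num_child; i++) {
        int status;
        if (waitpid(pids[i], &status, 0) == pids[i] &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
            released++;
    }
    return released;
}
```

The children are released one after another rather than at the same instant, so for a benchmark the timestamp is probably best taken just before the loop.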
mitch