Hi, people. For an academic exercise I have to implement a program in C for a *nix platform that synchronizes multiple processes through signal handling, using only the basic functions signal, pause, kill and fork. I searched Google and did not find any clear example. I hope the wisdom of one of you will light my way. Thanks!

+2  A: 

One important thing to be aware of is that Linux (at least, and possibly many other Unices) can collapse multiple signals of the same type into a single instance. So if you send a process one signal of value x, that process is guaranteed to receive it; but if you send 2 or more signals of value x, the process is only guaranteed to receive at least one of those signals.

Also, signals are not guaranteed to be received in the order they are sent.

(Why? Under the hood, Linux keeps a bitmask for each process recording which standard signals are pending, one bit per signal number. Sending a signal whose bit is already set changes nothing, so the extra instance is lost. When the process next gets to run, the signal handlers for all pending signals are run, in some arbitrary order.)

What all this means is that signals are generally inappropriate for synchronising processes. They only work reliably when time intervals between signals are large with respect to the interval between wake-up times of the receiving process. And if a process spends a lot of time blocked, wake-up events can be arbitrarily far apart.

Conclusion: don't use signals for IPC.

j_random_hacker
Maybe the point of this homework is to show all of that.
mouviciel
Hm, the lesson is more complete with this point, thanks.
jneira
+2  A: 

pause doesn't return until a signal is received. The basic design is thus:

  • fork to create the necessary workers
  • catch SIGINT in each worker. The handler sets a flag meaning the process should exit after finishing its current job.
  • each process does work while it can, then pauses. Repeat unless SIGINT is received (test before & after pause).
  • when one process has work available for another process, it signals the other process with SIGCONT
  • when there is no more work for a process, signal it with SIGINT.

This doesn't quite include synchronized access to shared data. To do that, you could add an additional rule:

  • when a process signals another that work is available, it should pause

Of course, this rather defeats the purpose of concurrent programming.

Since most system calls are interrupted by signals (causing them to return -1, with errno set to EINTR), you'll have to handle this contingency, repeating each affected system call until it's successful. For example:

while ((readCount = read(...)) < 0 && errno == EINTR) {}
outis
I got the exercise done with a similar design, thanks!
jneira
Although this seems like a good idea, it can fail if 2 processes both send some other process a SIGCONT within a short time, as that process may only receive a single SIGCONT and thus "lose" the 2nd job (please see my post for details). It's possible to get around this by giving each worker a "queue" of jobs, and changing the interpretation of SIGCONT from "A job is ready" to "At least one job is ready", but that requires the ability to atomically update the queue... Possibly implementable using filesystem renames in a loop?
j_random_hacker
@j: that's the function of `SIGCONT` in my outline: break the `pause`. Note "each process does work while it can." `pause` and `SIGCONT` are there solely to prevent busywait.
outis
@outis, I see what you mean. +1. Still I'd like to see you describe the queue+SIGCONT semantics more precisely in your article, since although it reads like a subtle difference it's actually the difference between a working system and one that will work "most of the time," but occasionally fail.
j_random_hacker