I am trying to create a wrapper on Linux that limits how many concurrent executions of a command are allowed at once. To do so, I am using a system-wide counting semaphore. I create the semaphore, do a sem_wait(), launch the child process, and then do a sem_post() when the child terminates. That is fine.
The problem is how to safely handle signals sent to this wrapper. If it doesn't catch signals, it might terminate without doing a sem_post(), causing the semaphore count to permanently decrease by one. So, I created a signal handler which does the sem_post(). But still, there is a problem.
If the handler is attached before the sem_wait() is performed, a signal could arrive before the sem_wait() completes, causing a sem_post() to occur without a sem_wait(). The reverse is possible if I do the sem_wait() before setting up the signal handler.
The obvious next step was to block signals during the setup of the handler and the sem_wait(). This is pseudocode of what I have now:
    void handler(int sig)
    {
        sem_post(sem);
        exit(1);
    }
    ...
    sigprocmask(...); /* Block signals */
    sigaction(...); /* Set signal handler */
    sem_wait(sem);
    sigprocmask(...); /* Unblock signals */
    RunChild();
    sem_post(sem);
    exit(0);
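For concreteness, the block and unblock steps in the pseudocode could look roughly like the sketch below. The helper names block_term_signals() and restore_signals() are hypothetical, and the choice of SIGINT and SIGTERM is an assumption, since the pseudocode does not say which signals are blocked.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Block SIGINT and SIGTERM and save the previous mask in *old_mask.
 * Which signals to block is an assumption made for this sketch. */
static int block_term_signals(sigset_t *old_mask)
{
    sigset_t set;

    sigemptyset(&set);
    sigaddset(&set, SIGINT);
    sigaddset(&set, SIGTERM);
    return sigprocmask(SIG_BLOCK, &set, old_mask);
}

/* Restore the mask saved by block_term_signals(). A signal that became
 * pending while blocked is delivered once it is unblocked here. */
static int restore_signals(const sigset_t *old_mask)
{
    return sigprocmask(SIG_SETMASK, old_mask, NULL);
}
```

Restoring the saved mask, rather than unblocking a fixed set, keeps any signals that were already blocked by the caller blocked afterwards.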
The problem now is that the sem_wait() can block, and during that time signals are blocked. A user attempting to kill the process may end up resorting to "kill -9", which is behaviour I don't want to encourage since I cannot handle that case no matter what. I could use sem_trywait() for a small time and test sigpending(), but that impacts fairness because there is no longer a guarantee that the process waiting on the semaphore the longest will get to run next.
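A minimal sketch of that polling variant, assuming SIGINT and SIGTERM are the signals of interest and that the caller has already blocked them. The function name acquire_polling() and the poll_ms parameter are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <semaphore.h>
#include <signal.h>
#include <time.h>

/* Poll the semaphore while signals stay blocked.
 * Returns 0 once the semaphore is acquired, 1 if SIGINT or SIGTERM is
 * pending (semaphore NOT acquired), and -1 on any other error. */
static int acquire_polling(sem_t *sem, long poll_ms)
{
    struct timespec delay = { poll_ms / 1000, (poll_ms % 1000) * 1000000L };

    for (;;) {
        sigset_t pending;

        if (sem_trywait(sem) == 0)
            return 0;                 /* got the semaphore */
        if (errno != EAGAIN && errno != EINTR)
            return -1;                /* unexpected failure */

        if (sigpending(&pending) == -1)
            return -1;
        if (sigismember(&pending, SIGINT) == 1 ||
            sigismember(&pending, SIGTERM) == 1)
            return 1;                 /* caller should give up and exit */

        nanosleep(&delay, NULL);      /* not queued on the semaphore here */
    }
}
```

While the process sleeps between attempts it is not waiting on the semaphore at all, so another process can take a free slot first. That is the fairness cost described above.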
Is there a truly safe solution here which allows me to handle signals during semaphore acquisition? I am considering resorting to a "Do I have the semaphore" global and removing the signal blocking. That is not 100% safe, since acquiring the semaphore and setting the global isn't atomic, but it might be better than blocking signals while waiting.
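A rough sketch of the flag idea, to make the non-atomic window visible. All names here (have_sem, g_sem, on_signal, install_handler, acquire, release) are hypothetical, and handling SIGINT and SIGTERM is an assumption. The handler uses _exit() rather than exit() because only the former is async-signal-safe.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <semaphore.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

static sem_t *g_sem;                         /* semaphore used by the handler */
static volatile sig_atomic_t have_sem = 0;   /* "Do I have the semaphore" flag */

/* Post only if the flag says the semaphore is held, then terminate.
 * sem_post() and _exit() are both async-signal-safe. */
static void on_signal(int sig)
{
    (void)sig;
    if (have_sem)
        sem_post(g_sem);
    _exit(1);
}

static int install_handler(void)
{
    struct sigaction sa = {0};

    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGINT, &sa, NULL) == -1)
        return -1;
    return sigaction(SIGTERM, &sa, NULL);
}

static int acquire(sem_t *sem)
{
    g_sem = sem;
    while (sem_wait(sem) == -1) {
        if (errno != EINTR)
            return -1;
    }
    /* WINDOW: a signal arriving here sees have_sem == 0, so the handler
     * exits without posting and one slot is lost. */
    have_sem = 1;
    return 0;
}

static int release(void)
{
    have_sem = 0;
    /* A similar window exists here, between clearing the flag and posting. */
    return sem_post(g_sem);
}
```

Signals stay deliverable for the whole wait, so the process can be killed normally while queued on the semaphore. The cost is the two short windows marked in the comments.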