Hi,

Is there any way to clean up POSIX shared synchronization objects, especially after a process crash? Unblocking locked POSIX semaphores is what I need most, but automatically 'collected' queues and shared memory regions would be nice too. Another thing to keep in mind is that we can't rely on signal handlers in general, because SIGKILL cannot be caught.

I see only one alternative: an external daemon that accepts subscriptions and 'keep-alive' requests and works as a watchdog, so that when it stops receiving notifications about some object it can close or unlock that object according to a registered policy.

Does anyone have a better alternative or proposal? I have never worked seriously with POSIX shared objects before (sockets were enough for all my needs and are much more useful in my opinion) and I have not found any applicable article. I'd gladly use sockets here but can't, for historical reasons.

+2  A: 

The usual way is to work with signal handlers. Just catch the signals and call the cleanup functions.

But your watchdog daemon has some merits, too. It would surely make the system simpler to understand and manage. To make it simpler to administer, your application should start the daemon when it's not running, and the daemon should be able to clean up any residue from the last crash.

Aaron Digulla
The problem is that we have to cope even with SIGKILL. I corrected the question based on your answer. As for the daemon, I'd prefer a system service implementation so that dependencies can be tracked.
Roman Nikitchenko
You shouldn't try to trap SIGKILL. If there is a bug in your code, it would be impossible to terminate it. So you don't want that. What you want is to iterate through all resources and check whether they are still in use. Most apps that use SHM offer a tool to clean up all shared resources. I suggest automating this step.
Aaron Digulla
Not just "you shouldn't" -- you can't. http://linux.die.net/man/2/signal "The signals **SIGKILL** and **SIGSTOP** cannot be caught or ignored."
ephemient
Sorry, mixed that up with `SIGINT` (aka Ctrl-C) and `SIGTERM` (`kill` command).
Aaron Digulla
Yes, I understand that I can't trap SIGKILL and SIGSTOP. I just noted that this solution should remain consistent even if someone kills a process with SIGKILL (for example). Though yes, signal handlers should deallocate shared resources, at least in the cases where we can do it. You are right.
Roman Nikitchenko
You must deallocate the resources *at program startup*, too! Sounds insane, I know. The reason is that it's possible to force your process out of the system before you have a chance to clean up. In this case, allocating the SHM during startup will fail (since they are still there). So you must do it twice: At startup, first clean up everything, then allocate it freshly. At shutdown, just deallocate.
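A rough sketch of the "clean up first, then allocate" startup step for a POSIX shared memory object, in C. The function name `create_fresh_shm` is made up for illustration. It assumes a single owning process per object name: if another live instance is still using the object, unlinking it at startup would pull the name out from under that instance.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Remove any stale object left by a previous run, then create it anew.
 * Returns an open file descriptor, or -1 on error. */
int create_fresh_shm(const char *name, off_t size)
{
    /* Result ignored on purpose: ENOENT just means nothing was stale. */
    shm_unlink(name);

    /* O_EXCL makes the call fail if someone recreated the object
     * between the unlink above and this open. */
    int fd = shm_open(name, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (fd == -1)
        return -1;

    if (ftruncate(fd, size) == -1) {
        close(fd);
        shm_unlink(name);
        return -1;
    }
    return fd;
}
```

At shutdown the same process would simply `close()` the descriptor and call `shm_unlink()` once more.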
Aaron Digulla
A: 

Well, here is another option: have a daemon that periodically checks all shared memory regions, message queues and semaphores and sees whether the creator (or the locker, for semaphores) still exists. The possible actions and trackers are limited to what the Linux "ipcs" utility does.

The solution above has the advantage that we don't need to change anything in the applications, but it also has a number of drawbacks, such as an inflexible tracking policy. For example, we can only decide globally whether a shared memory region may outlive its creator. With a subscription mechanism the creator can set up the tracking rules itself.

So in general I am staying with subscription-based daemon trackers in controlled (embedded) environments where applications can subscribe for tracking. For a limited number of cases it is also possible to base tracking on a simple 'scanner' daemon that periodically scans for unowned shared objects (although this solution has a number of limitations).

I'd appreciate any comments or proposed alternatives.
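A sketch of the check such a scanner could perform, in C. Note that "ipcs" reports System V IPC objects, so this uses `shmctl()` with `IPC_STAT` rather than the POSIX named-object API. The function names are made up for illustration, and the liveness test is only a heuristic: a pid can be reused by an unrelated process after the creator dies.

```c
#include <errno.h>
#include <signal.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if a process with this pid currently exists, 0 otherwise.
 * Signal 0 performs the permission and existence checks without
 * delivering anything. */
int process_exists(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return 1;
    return errno == EPERM; /* exists, but belongs to another user */
}

/* Returns 1 if the System V segment looks orphaned (nobody attached and
 * the creator pid is gone), 0 if it seems in use, -1 on error. */
int shm_segment_orphaned(int shmid)
{
    struct shmid_ds ds;
    if (shmctl(shmid, IPC_STAT, &ds) == -1)
        return -1;
    return ds.shm_nattch == 0 && !process_exists(ds.shm_cpid);
}
```

A scanner daemon would call `shm_segment_orphaned()` for each segment id it knows about and apply `shmctl(shmid, IPC_RMID, NULL)` only when the result is 1.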

Roman Nikitchenko
+2  A: 

Rather than using semaphores you could use file locking to coordinate your processes. The big advantage of file locks is that they are released if the process terminates. You can map each semaphore onto a lock for a byte in a shared file and know that locks will get released on exit; in most versions of Unix the bytes you lock don't even have to exist. There is code for this in Marc Rochkind's book Advanced Unix Programming, 1st edition; I don't know if it's in the latest 2nd edition though.

Jackson
The book you recommended is really good. lockf() covers at least the most dangerous part of the original problem (semaphore deadlock); the prototype I built works fine.
Roman Nikitchenko
By the way, lockf() can be used with the third parameter set to 0 (the whole file). This way it doesn't require any real byte to be present. At least on Linux.
Roman Nikitchenko