views:

452

answers:

3

I'm using named System V semaphores to lock a file across all my apps on OSX and linux. Not prettiest of APIs by any definition.

It seems to work, but I can't quite figure out how to properly destroy the semaphore after everybody is done with it.

General logic is like this:

Creating:

[1] Thread or process tries to open a semaphore set with key_t created for the file by ftok(). Set contains 2 semaphores. [2] If semaphore set doesn't exist, it is created with 666 permissions. [3] "Lock" (one of the semaphores) set into released state (value 1). [4] "Reference count" (another semaphore in the same set) is incremented.

Locking/unlocking:

To lock [5], a thread decrements the value of "Lock" semaphore by 1 (with undo), thus waiting if it is already zero. To unlock [6], thread increments it by one, thus allowing somebody else to lock it.

Destroying:

[7] "Reference count" semaphore is attempted to be decremented (with IPC_NOWAIT flag). [8] Its value is checked to be 0, and if it is [9] semaphore set is destroyed.

(There is also a layer of logic based on thread local storage to make the lock recursive within one thread.)

The questions are:

  • How do I synchronize steps [1] and [2]? (if semaphore set doesn't exist, but while we were counting stars, it was created by somebody else so now creation will fail too)
  • How do I synchronize steps [4] with [8] so that [9] does not kill me prematurely?
  • Are there any other race conditions?

PS: While POSIX semaphores have much nicer API, I don't think I can survive sem_inlink() behavior as described here:

Calls to sem_open() to re-create or re-connect to the semaphore refer to a new semaphore after sem_unlink() is called.

So I will have no way to release them...

+1  A: 

Several approaches:

First, if your goal is to lock a file, then use a file-locking call like flock(2) or fcntl(2)+F_SETLK, not a semaphore. IMHO, this is the best approach.

Second, keep a sem around forever. You're right, the proposal is racy, and your phrasing suggests that new sem clients might appear at any time. You'd need a separate synchronization mechanism, like a separate, long-lived sem, to control the creation/destruction of the sem you actually care about. You could get exotic and combine this with a dedicated "wait-for-zero" (mysembuf.sem_op := 0) destroyer, watching the refcount sem and ready to IPC_RMID. Yuck. Better to just have a single, persistent, binary semaphore, without user-supplied reference counting.

Third, use POSIX named sems. Ignore sem_unlink() and instead simply sem_close() when done (after sem_post()ing to unlock, of course!). This is conceptually like the previous approach -- a small synchronization primitive persists -- but, as you say, a simpler API. Also you don't have to deal with SysV semaphores' fatal flaw.

pilcrow
I'm not sure about fatal flaw: creation of semaphore is atomic (I hope?). Next process will actually get it, or at worst fail a call to create (that should simply try getting it again in a loop). Then untill creating process releases it (sets to 1), everybody else would get already existing sem and if trying to "lock" (with -1) they will wait. Creating process loses control, but that's ok, they all are equal.
Eugene
(Actually that answers one of my questions, I just didn't want the loop :))
Eugene
As for keeping a sem forever, on linux maximum is usually very small, something like 128 sets. I need to lock only few files, but my unit test reaches it quite fast -- that's the main reason I want to kill them when not used.
Eugene
As for file locking, I looked around a bit and it seems threads of one process can't lock out each other, meaning I will have to keep some kind of global state and protect it with pthread syncronization primitives...
Eugene
@Eugene, re: *"fatal flaw"*, that's the gap between `IPC_CREAT` and `SETVAL`. Thread A creates a sem, B attempts to use it before A can initialize it. Re: unit test sem exhaustion, seeing code would help. Unclear why, for instance, `setup()` and `teardown()` don't handle a single sem set for you...
pilcrow
Well, sem is zeroed out on creation, which in my case is "locked" state, so it already is in state I want it in. Other clients would wait for creator to "release". As for unit tests it was creating and deleting files to lock, and ftok() works on node level, thus making new sems each run :)
Eugene
A: 

Here is what I ended up doing (it is a matter of honor at this point, I won't go away untill I have correct code regardless if it is needed for the task at hand :)).

Creating

Try to open [1] existing sem set with 3 sems, if fails try [2] to create one. If fails to create because somebody created already, go back to [1]. This loop will eventually exit, either with opened or created sem, or because of an error I can't handle, in which case I take the ball and go home. (I also have a limit of N iterations, just in case :)).

One of the 3 sems is payload, another one is reference count and third one is a lock for the ref count. After [2] lock is initialized to 0, the locked state.

Retaining

If sem set was created by [2], all 3 sems are semoped [3] from 0 to 1. Payload is released, ref count is 1, lock is released (no undo). If it was opened by [1], lock is acquired [4] (-1), ref count is incremented (+1) and lock is released (+1). This will block if lock is zero at the moment. If this semop fails due to sem set being destroyed at [6] while we are waiting, retaining fails and we go all the way back to [1]. This loop has limited number of iterations too.

Releasing

Lock is acquired [5] (-1 with wait), ref count is decremented (-1 with no wait). If this succeeds then if ref count is now zero, the sem set is destroyed. Otherwise [6] lock is released (+1). If getting lock failed because sem set was destroyed -- doing nothing.

Between retaining and releasing, payload is used as usual.

Besides complexity and overhead of 2 semaphores per set, there is only one problem (now I see the fatal flaw :)) -- when creator crashes between [2] and [3]. This will hang all clients dead. I can use timed waits on linux and kill orphaned semaphores, but OSX, being usual idiotic self, doesn't have timed operations, so I'm kinda screwed...

*...goes away to write own kernel or something...*

Eugene
A: 

very interesting - I am looking for similar information on :release

If process A acquiries and dies, is there a way for kernel to notify all other processes (in read/write) queue via signal or such ? What happens to processes in the queue if (a) process A never comes back or (b) just restarts ?

Kindly email pointers and share your knowledge..

Thanks !

Sk

ccmail111