views: 138
answers: 2
Briefly, a manual reset event is a synchronization construct which is either in the "signaled" or "nonsignaled" state. In the signaled state, any thread which calls a wait function on the event will not block, and execution continues unaffected. Any thread which calls a wait function on a nonsignaled event will block until the event enters the signaled state.

The transition between the signaled and nonsignaled states occurs only as a result of explicit calls to functions such as SetEvent and ResetEvent.

I've built a synchronization mechanism on Windows which uses both these manual reset events and their auto-reset siblings. The auto-reset mechanism can be easily replicated with a semaphore, but I'm struggling to find an equivalent for the manual-reset variety.

In particular, while a condition variable with "notify all" functionality might appear similar at first glance, it behaves considerably differently (perhaps unacceptably so) once you consider that it requires an associated mutex. First, before a thread can wait on a condvar, it must acquire the associated mutex. Beyond the cost of acquiring and releasing the mutex, this unnecessarily serializes all the threads which are about to wait. On wake, even though all threads are notified, only one thread at a time can actually acquire the mutex, incurring additional performance and concurrency penalties, since the mutex serves no purpose in this case.

The release case is especially poor on a multi-CPU system, given that the simultaneous release of all waiters guarantees that the difference between a condvar and a Windows event will be observable: with an event, all N threads become runnable on an N-CPU system and can run in parallel, while with a condvar - even with an implementation that avoids the thundering herd - the threads can only leak out one at a time through the associated mutex.
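For concreteness, the condvar-based gate being criticized might look like the following sketch (the names and structure here are illustrative, not from the question); note that both the wait path and the drain-after-broadcast path funnel through the mutex:

```c
#include <pthread.h>

/* A condvar-based "gate": emulates a manual-reset event, but every
   waiter must pass through the mutex - exactly the serialization the
   question objects to. */
typedef struct {
    pthread_mutex_t mu;
    pthread_cond_t  cv;
    int             open;
} cv_gate_t;

void cv_gate_wait(cv_gate_t *g)
{
    pthread_mutex_lock(&g->mu);        /* every waiter serializes here */
    while (!g->open)
        pthread_cond_wait(&g->cv, &g->mu);
    pthread_mutex_unlock(&g->mu);      /* woken threads drain one at a time */
}

void cv_gate_open(cv_gate_t *g)
{
    pthread_mutex_lock(&g->mu);
    g->open = 1;
    pthread_cond_broadcast(&g->cv);    /* "notify all" */
    pthread_mutex_unlock(&g->mu);
}

void cv_gate_close(cv_gate_t *g)
{
    pthread_mutex_lock(&g->mu);
    g->open = 0;
    pthread_mutex_unlock(&g->mu);
}
```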

Any pointers to a construct that better imitates the behavior of manual reset events would be greatly appreciated. The closest I can find is a barrier - this allows the unsynchronized approach and release of multiple threads to the barrier - but the barrier "breaks" based on waiting thread count rather than an explicit application call, which is what I need.

+2  A: 

It's Linux-only (I only mention it because you have a "linux" tag), but I think you could build something like this on the futex syscall. See Ulrich Drepper's paper on Futexes for the gory details.

Very roughly, I envisage something along the lines of

#include <errno.h>
#include <limits.h>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

/* glibc provides no futex() wrapper, so invoke it via syscall(2);
   errors are reported as -1 with errno set, not as a return value */
static long futex(volatile int *uaddr, int op, int val)
{
    return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

static inline void gate_wait(volatile int *gate)
{
    /* The kernel atomically rechecks *gate == 1 before sleeping, so a
       wake between the test and the syscall is not lost; EINTR and
       EAGAIN simply loop around and re-test the gate. */
    while (*gate)
        futex(gate, FUTEX_WAIT, 1);
}

static inline long gate_open(volatile int *gate)
{
    *gate = 0;
    return futex(gate, FUTEX_WAKE, INT_MAX);  /* wake all waiters */
}

static inline void gate_close(volatile int *gate)
{
    *gate = 1;
}

If you do build this "gate" synchronisation primitive on top of futexes, it might be worth contributing it to Rusty Russell's userspace futex library.

caf
Thanks for this - the thought had crossed my mind that futexes (futices?) might be general enough to be the kernel-side half of a from-scratch implementation. I will check out Russell's library as well (what I'm after is a fast rwlock with specific behavior).
BeeOnRope
(and yeah, I tagged it Linux rather than UNIX 'cause that's where the new stuff in this area is happening anyway)
BeeOnRope
A: 

Would a readers-writers lock work? I suspect it would, but you'd have to check if it was efficient in the cases you're interested in.

See the rwlock stuff in http://www.opengroup.org/onlinepubs/000095399/basedefs/pthread.h.html

Any number of readers can hold the lock concurrently, but a single writer excludes everyone else - so to block other threads, take a write lock; to let them proceed, unlock it. You'd have to see which case is more efficient: taking a read lock when at least one read lock is already held, or taking the first read lock. I'd guess the former, so after letting all the threads go by write-unlocking, you could take a read lock.

If your threads don't read-unlock when they're done, you will have to do something different, or at least mess with the implementation internals. You don't want to decrement something one count at a time until you can take the write lock if 2^31 read locks have accumulated.

Oh yeah, caf's idea addresses that: the gate is read_lock; read_unlock; so you don't build up read locks.

Peter Cordes
So something like: `gate_wait { read_lock; read_unlock }`, `gate_close { write_lock; }`, `gate_open { write_unlock; }`? That should work.
caf
Yeah, good call. I hadn't thought of when to read_unlock. Looks to me like it should work, too. gate_open could take a read_lock if that makes threads going through the gate more efficient. (since then the unlocker doesn't have to check if there were any writers waiting when the read_lock count drops to 0.)
Peter Cordes
Heh, an interestingly recursive solution since my original request was part of moving a high-performance rwlock implementation from Windows to UNIX platforms. Strictly speaking I think the solution you provided is not well-formed since the thread which does the write lock must also do the unlock in POSIX: "Results are undefined if the read-write lock rwlock is not held by the calling thread." Of course, many implementations allow this without complaining, but others may not, or even go out of their way to enforce it via an explicit check.
BeeOnRope
I was under the impression there was one gatekeeper thread which would block and unblock the other threads. You'd better have a look at futex, then, since that's linux's native lock interface that the posix stuff is built on.
Peter Cordes
The gatekeeper functions can be called in any thread - in my case they are in fact called by arbitrary threads, with no relationship between gate opener and gate closer.
BeeOnRope