A: 

You want to use pthread_cond_broadcast() instead of pthread_cond_signal(). The former unblocks all threads waiting on a given condition.

Thomas Pornin
It will still fail if the thread enters the wait after your call.
Tronic
That's the point. It only does anything if there's a thread blocking on the cond var. If there isn't, then this thread won't know to exit. This is the crux of the problem.
ScaryAardvark
@Tronic: the original poster should *still* use `pthread_cond_broadcast()` anyway, rather than the `pthread_cond_signal()` loop. My answer is incomplete (for the problem at hand, one needs to check `finished` before waiting instead of after), but not wrong.
Thomas Pornin
The code was somewhat tl;dr, but you are probably right. In a more complex case using cond_broadcast gets complicated and cancellation might be better.
Tronic
A: 

I have never used pthreads directly (I prefer Boost.Threads), but I think you should be calling pthread_cancel instead of pthread_cond_signal.

Tronic
No. I specifically do not want to call pthread_cancel, which is why I'm trying to implement a nice graceful shutdown. Cancelling a thread is fraught with problems more severe than the one I'm trying to solve.
ScaryAardvark
You can set pthreads to cancel only at cancellation points, which allows for a graceful exit. Thanks for the -1, though :(
Tronic
+4  A: 

Firstly, you have to change your predicate from

if ( msq.empty() ) {
  // no messages so wait for one.
  pthread_cond_wait( &cnd, &lock );
}

to

while ( msq.empty() ) {
  // no messages so wait for one.
  pthread_cond_wait( &cnd, &lock );
}

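That's a pthreads thing: pthread_cond_wait can return without the condition having been signalled, so you have to guard yourself against spurious wakeups by re-checking the predicate in a loop.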

Now you can change that to

while ( msq.empty() && !finished ) {
  // no messages so wait for one.
  pthread_cond_wait( &cnd, &lock );
}

Since you already test whether finished is set after that check, and exit if so, all you have to do is signal all the threads.

So, in your teardown function, replace the loop with:

pthread_cond_broadcast(&cnd);

That should ensure all threads wake up, see finished set to true, and exit.

This is safe even if your threads are not stuck in pthread_cond_wait. If the threads are processing a message, they will not get the wakeup signal. However, they will finish that processing, enter the loop again, see that finished == true, and exit.
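To make that concrete, here is a minimal sketch of the whole shutdown path. The names `lock`, `cnd`, `msq` and `finished` follow the snippets above; `Pool`, `worker`, `teardown`, `run_broadcast_demo` and the `processed` counter are made up for illustration, and messages are plain ints. One assumption of mine: the worker drains any remaining messages before exiting, whereas checking `finished` first would make it exit immediately.

```cpp
#include <pthread.h>
#include <cassert>
#include <queue>
#include <vector>

// Hypothetical shared state for the sketch.
struct Pool {
    pthread_mutex_t lock;
    pthread_cond_t  cnd;
    std::queue<int> msq;
    bool finished = false;
    int  processed = 0;   // stand-in for real message handling
};

static void* worker(void* arg) {
    Pool* p = static_cast<Pool*>(arg);
    for (;;) {
        pthread_mutex_lock(&p->lock);
        // Loop, not "if": the predicate is re-checked after every wakeup,
        // and it is checked BEFORE the first wait, so a broadcast sent
        // while this thread was busy is not missed.
        while (p->msq.empty() && !p->finished) {
            pthread_cond_wait(&p->cnd, &p->lock);
        }
        if (p->msq.empty()) {
            // Only reachable when finished is set: leave the thread.
            pthread_mutex_unlock(&p->lock);
            return nullptr;
        }
        p->msq.pop();
        ++p->processed;
        pthread_mutex_unlock(&p->lock);
    }
}

static void teardown(Pool& p, std::vector<pthread_t>& threads) {
    pthread_mutex_lock(&p.lock);
    p.finished = true;                // set while holding the mutex
    pthread_cond_broadcast(&p.cnd);   // wake every thread that is waiting
    pthread_mutex_unlock(&p.lock);
    for (pthread_t t : threads) pthread_join(t, nullptr);
}

// Starts n_threads workers, posts n_msgs messages, shuts down, and
// returns how many messages were processed.
static int run_broadcast_demo(int n_threads, int n_msgs) {
    Pool p;
    pthread_mutex_init(&p.lock, nullptr);
    pthread_cond_init(&p.cnd, nullptr);

    std::vector<pthread_t> threads(n_threads);
    for (pthread_t& t : threads) {
        int rc = pthread_create(&t, nullptr, worker, &p);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < n_msgs; ++i) {
        pthread_mutex_lock(&p.lock);
        p.msq.push(i);
        pthread_cond_signal(&p.cnd);
        pthread_mutex_unlock(&p.lock);
    }
    teardown(p, threads);

    pthread_cond_destroy(&p.cnd);
    pthread_mutex_destroy(&p.lock);
    return p.processed;
}
```

Because `finished` is only written under the mutex and the workers test it under the same mutex before waiting, a thread is either already waiting (and gets the broadcast) or will see the flag on its next pass through the loop.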

Another common pattern for this is to inject a poison message. A poison message is simply a special message your thread recognises as meaning "STOP". You place as many of those messages in your queue as you have threads, so that each thread consumes exactly one and exits.
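A rough sketch of the poison message idea, assuming int messages and a hypothetical sentinel value `POISON` that never occurs as a real message. `Mailbox`, `post`, `poison_worker` and `run_poison_demo` are names invented for this example. This version uses no separate flag; shutdown is driven purely by the queue contents.

```cpp
#include <pthread.h>
#include <cassert>
#include <queue>
#include <vector>

// Hypothetical sentinel: assumed never to be a legitimate message.
static const int POISON = -1;

struct Mailbox {
    pthread_mutex_t lock;
    pthread_cond_t  cnd;
    std::queue<int> msq;
    int processed = 0;    // stand-in for real message handling
};

static void post(Mailbox& mb, int msg) {
    pthread_mutex_lock(&mb.lock);
    mb.msq.push(msg);
    pthread_cond_signal(&mb.cnd);     // one new message, wake one waiter
    pthread_mutex_unlock(&mb.lock);
}

static void* poison_worker(void* arg) {
    Mailbox* mb = static_cast<Mailbox*>(arg);
    for (;;) {
        pthread_mutex_lock(&mb->lock);
        while (mb->msq.empty()) {
            pthread_cond_wait(&mb->cnd, &mb->lock);
        }
        int msg = mb->msq.front();
        mb->msq.pop();
        if (msg == POISON) {
            // Each thread swallows exactly one pill and stops.
            pthread_mutex_unlock(&mb->lock);
            return nullptr;
        }
        ++mb->processed;
        pthread_mutex_unlock(&mb->lock);
    }
}

// Posts n_msgs real messages followed by one pill per thread, joins the
// threads, and returns how many real messages were processed.
static int run_poison_demo(int n_threads, int n_msgs) {
    Mailbox mb;
    pthread_mutex_init(&mb.lock, nullptr);
    pthread_cond_init(&mb.cnd, nullptr);

    std::vector<pthread_t> threads(n_threads);
    for (pthread_t& t : threads) {
        int rc = pthread_create(&t, nullptr, poison_worker, &mb);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < n_msgs; ++i) post(mb, i);
    for (int i = 0; i < n_threads; ++i) post(mb, POISON);
    for (pthread_t t : threads) pthread_join(t, nullptr);

    pthread_cond_destroy(&mb.cnd);
    pthread_mutex_destroy(&mb.lock);
    return mb.processed;
}
```

Since the queue is FIFO, every real message posted before the pills gets handled before any thread stops, and it does not matter whether a thread is waiting or busy when the pills are posted.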

nos
No. This is just a more elegant repeat of the same earlier answer. pthread_cond_broadcast will only signal threads which are actually waiting on the cond var. If any of the threads are busy processing, then they WILL NOT WAKE UP when they eventually get to the wait call.
ScaryAardvark
If they are busy processing, they are already awake, and they will exit on the next loop because finished is true. Since your teardown() holds the mutex when you set finished, you're safe.
nos
+1 for the poison message approach. I often use that approach and it has proven to be the best solution in many cases.
Tronic
It's the signalling of all threads that forms the basis of the problem. I can't guarantee that all threads are "waiting" and are in a signallable state. I do like the poison message approach. Presumably you have to set "finished" before sending your poison message.
ScaryAardvark
I would mark this as the accepted answer were you to change your post so that it doesn't indicate that pthread_cond_broadcast wakes up all the threads.
ScaryAardvark
As mentioned, the threads don't need to be in a waiting state when you signal them.
nos
You don't need a finished flag at all if you use a poison message approach.
Tronic
You should keep the lock until you finish signalling. My friendly man page says `Unlocking the mutex and suspending on the condition variable is done atomically. Thus, if all threads always acquire the mutex before signaling the condition, this guarantees that the condition cannot be signaled (and thus ignored) between the time a thread locks the mutex and the time it waits on the condition variable.` See here for reference: http://manpage.b0red.de/3thr+pthread_cond_signal
Hasturkun
A: 

I guess you should be unlocking the mutex after the call to pthread_cond_signal. Also, please check the condition of "finished" before you enter into conditional wait after acquiring the mutex. Hope this helps!

Jay