Hi,

I'm doing IPC on Linux using boost::interprocess::shared_memory_object as per the reference (anonymous mutex example).

There's a server process, which creates the shared_memory_object and writes to it, while holding an interprocess_mutex wrapped in a scoped_lock; and a client process which prints whatever the other one has written - in this case, it's an int.

I ran into a problem: if the server sleeps while holding the mutex, the client process is never able to acquire it and waits forever.

Buggy server loop:

using namespace boost::interprocess;
int n = 0;
while (1) {
    std::cerr << "acquiring mutex... ";
    {
        // "data" is a struct on the shared mem. and contains a mutex and an int
        scoped_lock<interprocess_mutex> lock(data->mutex);
        data->a = n++;
        std::cerr << n << std::endl;
        sleep(1);
    } // if this bracket is placed before "sleep", everything works
}

Server output:

acquiring mutex... 1
acquiring mutex... 2
acquiring mutex... 3
acquiring mutex... 4

Client loop:

while(1) {
   std::cerr << "acquiring mutex... ";
   {
      scoped_lock<interprocess_mutex> lock(data->mutex);
      std::cerr << data->a << std::endl;
   }
   sleep(1);
}

Client output (waits forever):

acquiring mutex...

The thing is, if I move the bracket to the line before the sleep call, everything works. Why? I didn't think sleeping with a locked mutex would cause the mutex to be eternally locked.

The only theory I have is that when the kernel wakes up the server process, the scope ends and the mutex is released, but the waiting process isn't given a chance to run. The server then re-acquires the lock... But that doesn't seem to make a lot of sense.

Thanks!

A: 

Sleeping while holding a mutex is wrong. A mutex protects some data (e.g. data->a), and the locked scope should be kept as small as possible around the reads and writes of that data.

Andrey
Yes, I know that, thanks. However, shouldn't it work anyway?
Pedro d'Aquino
I thought it actually should; seems like Steve did a better job understanding the real problem.
Andrey
+6  A: 

Your theory is correct.

If you look at the bottom of the anonymous mutex example in the reference you linked, you'll see

As we can see, a mutex is useful to protect data but not to notify to another process an event.

Releasing the mutex doesn't notify anyone else that might be waiting on it, and since your process just woke up, it almost certainly has plenty of its scheduling quantum left to do more work. It will loop around and re-acquire the mutex before it sleeps again, which is the first opportunity the client has to acquire the mutex itself.

Moving the server sleep() outside of the scope means it goes to sleep while the mutex is free, giving the client a chance to run and acquire the mutex for itself.
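A minimal sketch of that fix follows. To keep it self-contained, it assumes a single process where std::mutex and std::thread stand in for interprocess_mutex and the two processes; SharedData, writer, read_once and run_demo are hypothetical names, not part of Boost or of the original program.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <mutex>
#include <thread>

// Stand-in for the struct that lives in shared memory (assumption:
// std::mutex replaces boost::interprocess::interprocess_mutex).
struct SharedData {
    std::mutex mutex;
    int a = 0;
};

// Server side: hold the lock only while touching data.a,
// then sleep with the mutex free.
void writer(SharedData& data, int iterations) {
    for (int n = 1; n <= iterations; ++n) {
        {
            std::lock_guard<std::mutex> lock(data.mutex);
            data.a = n;
        } // lock released here, before sleeping
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
}

// Client side: take the lock just long enough to copy the value.
int read_once(SharedData& data) {
    std::lock_guard<std::mutex> lock(data.mutex);
    return data.a;
}

// Runs the writer in a second thread and polls until the last value
// has been observed; returns the final value read.
int run_demo(int iterations) {
    assert(iterations >= 0);
    SharedData data;
    std::thread server(writer, std::ref(data), iterations);
    while (read_once(data) < iterations) {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    server.join();
    return read_once(data);
}
```

Calling run_demo(3) from main should return 3 once the writer has finished. The same shape should carry over to the shared-memory version by swapping in scoped_lock<interprocess_mutex>.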

Try calling sched_yield() (a POSIX call declared in <sched.h>, so not limited to Linux) right after the lock is released if you want to give up the processor but still sleep within your scope. sleep(0) may also work.
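The yield variant could look roughly like the sketch below. As before, std::mutex and std::thread are assumed as stand-ins for the interprocess objects, and SharedInt, server_step and run_yield_demo are hypothetical names. Note that sched_yield() only offers the processor to other runnable threads; it does not guarantee that the waiting side wins the mutex on any given iteration.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>
#include <sched.h>   // sched_yield (POSIX)
#include <thread>

// Stand-in for the shared-memory struct (assumption: std::mutex
// replaces boost::interprocess::interprocess_mutex).
struct SharedInt {
    std::mutex mutex;
    int a = 0;
};

// One server iteration: write and sleep while holding the lock,
// then yield once the lock has been released.
void server_step(SharedInt& data, int value, std::chrono::milliseconds hold) {
    {
        std::lock_guard<std::mutex> lock(data.mutex);
        data.a = value;
        std::this_thread::sleep_for(hold); // still sleeping inside the scope
    } // lock released here
    sched_yield(); // give a waiting thread a chance before re-locking
}

// Runs a client thread that polls data.a while the server steps;
// returns the last value the client saw.
int run_yield_demo(int iterations) {
    assert(iterations >= 0);
    SharedInt data;
    int last_seen = 0;
    std::thread client([&] {
        while (last_seen < iterations) {
            {
                std::lock_guard<std::mutex> lock(data.mutex);
                last_seen = data.a;
            }
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    });
    for (int n = 1; n <= iterations; ++n) {
        server_step(data, n, std::chrono::milliseconds(5));
    }
    client.join(); // the client exits once it has seen the final value
    return last_seen;
}
```

run_yield_demo(3) should return 3, although how many intermediate values the client sees depends on the scheduler.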

Steve Madsen
Boost does have its own call to give up the thread (like Linux's sched_yield): boost::this_thread::yield()
teeks99
Thanks. `sleep(0)` does work, which confirms that theory. I was under the impression that the act of releasing a mutex would at the very least cause a system call and thus a rescheduling, but I was wrong.
Pedro d'Aquino