views:

1454

answers:

3

I understand what livelock is but I was wondering if anyone had a good code-based example of it? And by code-based, I do NOT mean "two people trying to get past each other in a corridor". If I read that again, I'll lose my lunch.

+7  A: 

Flippant comments aside, one example which is known to come up is in code which tries to detect and handle deadlock situations. If two threads detect a deadlock, and try to "step aside" for each other, without care they will end up being stuck in a loop always "stepping aside" and never managing to move forwards.

By "step aside" I mean that they would release the lock and attempt to let the other one acquire it. We might imagine the situation with two threads doing this (pseudocode):

// thread 1
getLocks12(lock1, lock2)
{
  lock1.lock();
  while (lock2.locked())
  {
    // attempt to step aside for the other thread
    lock1.unlock();
    wait();
    lock1.lock();
  }
  lock2.lock();
}

// thread 2
getLocks21(lock1, lock2)
{
  lock2.lock();
  while (lock1.locked())
  {
    // attempt to step aside for the other thread
    lock2.unlock();
    wait();
    lock2.lock();
  }
  lock2.lock();
}

Race conditions aside, what we have here is a situation where both threads, if they enter at the same time will end up running in the inner loop without proceeding. Obviously this is a simplified example. A naiive fix would be to put some kind of randomness in the amount of time the threads would wait.

The proper fix is to always respect the lock heirarchy. Pick an order in which you acquire the locks and stick to that. For example if both threads always acquire lock1 before lock2, then there is no possibility of deadlock.

1800 INFORMATION
Yeah, I understand that. I'm looking for an actual code example of such. The question is what does "step aside" mean and how does it produce such a scenario.
Alex Miller
A: 

One example here might be using a timed tryLock to obtain more than one lock and if you can't obtain them all, back off and try again.

boolean tryLockAll(Collection<Lock> locks) {
  boolean grabbedAllLocks = false;
  for(int i=0; i<locks.size(); i++) {
    Lock lock = locks.get(i);
    if(!lock.tryLock(5, TimeUnit.SECONDS)) {
      grabbedAllLocks = false;

      // undo the locks I already took in reverse order
      for(int j=i-1; j >= 0; j--) {
        lock.unlock();
      }
    }
  }
}

I could imagine such code would be problematic as you have lots of threads colliding and waiting to obtain a set of locks. But I'm not sure this is very compelling to me as a simple example.

Alex Miller
A: 

A real (albeit without exact code) example is two competing processes live locking in an attempt to correct for a SQL server deadlock, with each process using the same wait-retry algorithm for retrying. While it's the luck of timing, I have seen this happen on separate machines with similar performance characteristics in response to a message added to an EMS topic (e.g. saving an update of a single object graph more than once), and not being able to control the lock order.

A good solution in this case would be to have competing consumers (prevent duplicate processing as high up in the chain as possible by partitioning the work on unrelated objects).

A less desirable (ok, dirty-hack) solution is to break the timing bad luck (kind of force differences in processing) in advance or break it after deadlock by using different algorithms or some element of randomness. This could still have issues because its possible the lock taking order is "sticky" for each process, and this takes a certain minimum of time not accounted for in the wait-retry.

Yet another solution (at least for SQL Server) is to try a different isolation level (e.g. snapshot).

Kit