In Release It!, Michael Nygard argues that many catastrophic system failures are caused by a chain of things going wrong. For example, two threads deadlock. There are now two fewer threads in the thread pool, so load increases on the remaining threads, which increases their likelihood of deadlocking. Eventually the server stops responding altogether because the thread pool is exhausted. The load balancer then diverts traffic to the other servers (which are all running the same code), which increases their likelihood of deadlocking too. Suddenly the whole farm is offline.

Most RDBMS servers detect deadlocks and decide a "loser" (one transaction is aborted, the other can continue). By contrast, in C#, the lock statement will wait indefinitely for a lock to be acquired.

You can, however, call Monitor.TryEnter(lockObject, TimeSpan) to request a lock with a timeout. It returns false if the timeout expires before the lock could be acquired. Some people have wrapped this in a using statement to keep the syntax tidy.
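For illustration, here is a minimal sketch of what such a using-based wrapper might look like. The TimedLock type and its Acquire method are hypothetical names chosen for this example, not part of the .NET class library; only Monitor.TryEnter and Monitor.Exit are real framework calls. This version throws a TimeoutException when the lock cannot be obtained in time, which is one possible policy among several.

```csharp
using System;
using System.Threading;

// Hypothetical helper: acquires a monitor lock with a timeout and
// releases it when disposed, so it can be used in a using statement.
public sealed class TimedLock : IDisposable
{
    private readonly object _target;

    private TimedLock(object target)
    {
        _target = target;
    }

    // Tries to enter the monitor on 'target'; throws if 'timeout' elapses first.
    public static TimedLock Acquire(object target, TimeSpan timeout)
    {
        bool lockTaken = false;
        Monitor.TryEnter(target, timeout, ref lockTaken);
        if (!lockTaken)
        {
            throw new TimeoutException(
                "Could not acquire the lock within " + timeout + ".");
        }
        return new TimedLock(target);
    }

    // Releases the monitor; must run on the thread that acquired it.
    public void Dispose()
    {
        Monitor.Exit(_target);
    }
}

public static class TimedLockDemo
{
    private static readonly object Gate = new object();

    public static void DoWork()
    {
        // Reads much like lock (Gate) { ... }, but gives up after two seconds.
        using (TimedLock.Acquire(Gate, TimeSpan.FromSeconds(2)))
        {
            Console.WriteLine("Lock held; doing work.");
        }
    }
}
```

Because Acquire throws before any TimedLock instance exists, Dispose is never called for a lock that was not taken.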

So my question is, do you always acquire locks using timeouts, or not? And what issues do timeouts create versus a deadlock scenario?

+4  A: 

I normally do use timeouts. The biggest issue is that if the timeout is reached, the requesting operation is going to be aborted. This is obviously preferable to deadlocking. There is a bigger problem, though: if the operation is critical and you start aborting it because something else is deadlocked, and your design is not sound, you could end up causing the farm-going-down issue you described by this route (though in a softer form: your app will no longer work, but you haven't lost control).

The main difference is that you actually have control here, whereas if threads start deadlocking, there's nothing you can do within your code to fix the problem once the failure has started.

DannySmurf
+1  A: 

As a general rule, I never create locks with infinite timeouts. Doing so simply leads to hard-to-find and hard-to-debug deadlocks. It takes almost no extra work to add a timeout check, even if it just throws an exception, and it notifies you almost immediately if you have a deadlock. More importantly, it helps you find bottlenecks caused by locks that may not cause full deadlocks.

ctacke
A: 

It is possible to prove that deadlocks cannot occur. It is also possible to detect at runtime that a deadlock can occur, without waiting for the race that would actually trigger it (see Linux's lockdep, for example).

Therefore, always acquiring locks with a timeout doesn't make sense. It depends on the circumstances.

Eduard - Gabriel Munteanu
+1  A: 

Can you gracefully recover from the situation where the lock attempt has timed out? If you can, then by all means use timeouts. If you can't, then there's not a lot of point in timing out. What are you going to do when you resume execution? At best you can exit with an error message. Timeouts have their uses, but they can also be used to hide subtle threading bugs, so I think the best advice is to use them with caution.
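As a rough sketch of what "recovering" could mean in code: the method below reports a timed-out lock attempt to its caller by returning false, and the caller retries a few times before giving up. The Counter and Recovery types, the method names, and the retry policy are all assumptions made up for this example; only Monitor.TryEnter, Monitor.Exit, and Volatile.Read are real .NET calls. Whether retrying is appropriate at all depends on the application.

```csharp
using System;
using System.Threading;

// Hypothetical shared resource guarded by a monitor lock.
public sealed class Counter
{
    private readonly object _gate = new object();
    private int _value;

    // Volatile read so this property never blocks on the lock.
    public int Value
    {
        get { return Volatile.Read(ref _value); }
    }

    // Returns false instead of blocking forever if the lock is unavailable.
    public bool TryIncrement(TimeSpan timeout)
    {
        if (!Monitor.TryEnter(_gate, timeout))
        {
            return false; // timed out: nothing was changed
        }
        try
        {
            _value++;
            return true;
        }
        finally
        {
            Monitor.Exit(_gate);
        }
    }
}

public static class Recovery
{
    // Retries a bounded number of times, then lets the caller decide
    // what to do (log, roll back, show an error message).
    public static bool IncrementWithRetry(Counter counter, int attempts, TimeSpan timeout)
    {
        for (int i = 0; i < attempts; i++)
        {
            if (counter.TryIncrement(timeout))
            {
                return true;
            }
        }
        return false;
    }
}
```

If IncrementWithRetry returns false, the operation has failed without modifying the counter, so the caller is left in a known state rather than a hung one.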

Stu Mackellar
I guess the question becomes: if all you can do is show an error message, is that better than waiting indefinitely? Deadlocks don't give you a chance to roll back, and errors are probably a (slightly) better user experience than a frozen application.
Paul Stovell