views:

205

answers:

4

I'm wondering what is the "best" way to make data thread-safe.

Specifically, I need to protect a linked-list across multiple threads -- one thread might try to read from it while another thread adds/removes data from it, or even frees the entire list. I've been reading about locks; they seem to be the most commonly used approach, but apparently they can be problematic (deadlocks). I've also read about atomic-operations as well as thread-local storage.

In your opinion, what would be my best course of action? What's the approach that most programmers use, and for what reason?

A: 

Always remember the most important rule of thread safety. Know all the critical sections of your code inside out. And by that, know them like your ABCs. Only if you can identify them at go once asked will you know which areas to operate your thread safety mechanisms on.

After that, remember the rules of thumb:

  • Look out for all your global variables / variables on the heap.
  • Make sure your subroutines are re-entrant.
  • Make sure access to shared data is serialized.
  • Make sure there are no indirect accesses through pointers.

(I'm sure others can add more.)

Suvesh Pratapa
+5  A: 
Alex Martelli
agree. I'd much rather move to a message passing based interface than support concurrent adds / removes AND traversal on a linked list.
Rick
You do still have to be careful about deadlocks in a pure message passing system though... if process A is waiting on message b from process B before it sends message a to process C -AND- process B will only send message b when it receives message c from process C -AND- process C only emits message c when it receives message a. Same basic problem as a deadlock only contingent on messaging dependencies instead of locked resources.
D.Shawley
In the Windows API, this is the difference between SendMessage (which is synchronous and which blocks the sender until the send has been received and processed) and PostMessage (which is asynchronous). Asynchronous message-passing isn't liable to deadlock (but might instead have might you wondering whether your post was ever actually received and processed).
ChrisW
Thanks for the idea!
beepboopbopbop
A: 

The "best" way, from a safety point of view, is to put a lock on the entire data structure, so that only one thread can touch it at a time.

Once you decide to lock less than the entire structure, presumably for performance reasons, the details of doing this are messy and differ for every data structure, and even variants of the same structure.

My suggestion is to

  1. Start with a global lock on your data structure. Profile your program to see if it's really a problem.

  2. If it is a problem, consider whether there's some other way to distribute the problem. Can you minimize the amount of data in the data structure in question, so that it need not be accessed so often or for so long? If it's a queuing system, for example, perhaps you can keep a local queue per thread, and only move things into or out of a global queue when a local queue becomes over- or under-loaded.

  3. Look at data structures designed to help reduce contention for the particular type of thing you're doing, and implement them carefully and precisely, erring on the side of safety. For the queuing example, work-stealing queues might be what you need.

Curt Sampson
This approach also assumes that an "entire structure" is self contained and that you never need to lock more than one object at a time. Normally the problem with global to the structure locks occurs when data in one structure points to data in another structure. Then deadlocks become common place.
jmucchiello
Indeed, if you need to lock multiple structures and you use multiple locks, rather than one lock to lock all of them at once, you are in the same position as if you use multiple locks to lock different parts of a data structure.
Curt Sampson
+2  A: 

You could consider an immutable collection. Much like how a string in .net has methods such as Replace, Insert, etc. It doesn't modify the string but instead creates a new one, a LinkedList collection can be designed to be immutable as well. In fact, a LinkedList is actually fairly simple to implement this way as compared to some other collection data structures.

Here's a link to a blog post discussing immutable collections and a link to some implementations in .NET.

http://blogs.msdn.com/jaredpar/archive/2009/04/06/immutable-vs-mutable-collection-performance.aspx

Josh Einstein
These sorts of data structures are often known as "persistent" data structures (and there is a `persistent-data-structures` tag on SO for discussions of this). These can often share memory cells with earlier versions of the data structures, reducing copying (and memory usage, if you keep the previous versions of the data structures).
Curt Sampson