views:

130

answers:

5

I've got a List collection and I want to iterate over it in a multi threaded app. I need to protect it every time I iterate it since it could be changed and I don't want "collection was modified" exceptions when I do a foreach.

What is the correct way to do this?

  1. Use lock every time I access or loop. I'm rather terrified of deadlocks. Maybe I'm just paranoid of using lock and shouldn't be. What do I need to know if I go this route to avoid deadlocks? Is lock fairly efficient?

  2. Use List<>.ToArray() to copy to an array each time I do a foreach. This causes a performance hit but is easy to do. I'm worried about memory thrashing as well as the time to copy it. Just seems excessive. Is it thread safe to use ToArray?

  3. Don't use foreach and use for loops instead. Wouldn't I need to do a length check every time I did this to make sure the list didn't shrink? That seems annoying.

+2  A: 

Use lock() unless you have another reason to make copies. Deadlocks can only occur if you are requesting multiple locks in different orders, for example:

Thread 1:

lock(A) {
  // .. stuff
  // Next lock request can potentially deadlock with 2
  lock(B) {
    // ... more stuff
  }
}

Thread 2:

lock(B) {
  // Different stuff
  // next lock request can potentially deadlock with 1
  lock(A) {
    // More crap
  }
}

Here thread 1 and thread 2 have the potential of causing a deadlock since Thread 1 may be holding A while Thread 2 is holding B and neither can continue until the other releases its lock.

If you must take multiple locks, always do it in the same order. If you're only taking one lock, then you won't cause a deadlock ... unless you hold a lock while waiting for user input, but that's not technically a deadlock and leads to another point: never hold a lock for any longer than you absolutely must.

Donnie
+2  A: 

In general, collections are not thread safe for performance reasons, except of Hash Table. You have to use IsSynchronized and SyncRoot to make them thread safe. See here and here

Example from msdn

ICollection myCollection = someCollection;
lock(myCollection.SyncRoot)
{
    foreach (object item in myCollection)
    {
        // Insert your code here.
    }
}

Edit: If you are using .net 4.0, you can use concurrent collections

ram
+5  A: 

If your List data is mostly read-only, you can allow multiple threads to safely access it simultaneously using a ReaderWriterLockSlim

You can find an implementation of a Thread-Safe dictionary here to get you started.

I also wanted to mention that if you are using .Net 4.0 the BlockingCollection class implements this functionality automatically. I wish I would have known about this a few months ago!

Laramie
ReaderWriterLockSlim is an excellent choice. It will allow multiple threads to enumerate the collection concurrently, unlike a simple lock which will only allow one thread. It also doesn't incur the overhead involved with ToArray().
Jack Leitch
+4  A: 

There's little reason to be afraid of deadlocks, they are easy to detect. Your program stops running, dead giveaway. What you really should be terrified of is threading races, the kind of bug you'll get when you don't lock when you should. Very hard to diagnose.

  1. Using lock is fine, just make sure you use the exact same locking object in any code that touches that list. Like the code that adds or removes items from that list. If that code runs on the same thread that iterates the list then you don't need a lock. Generally, the only chance for deadlock here is if you have code that relies on the thread state, like Thread.Join(), while it is also holding that locking object. Which ought to be rare.

  2. Yes, iterating a copy of the list is always thread-safe, as long as you use a lock around the ToArray() method. Note that you still need the lock, no structural improvement. The advantage is that you'll hold the lock for a short amount of time, improving concurrency in your program. The disadvantages are its O(n) storage requirements, only having a safe list but not protecting the elements in the list and the tricky problem of always having a stale view of the list content. Especially the last problem is subtle and hard to analyze. If you cannot reason out the side-effects then you probably shouldn't consider this.

  3. Do make sure to treat the ability of foreach to detect a race as a gift, not a problem. Yes, an explicit for(;;) loop is not going to throw the exception, it is just going to malfunction. Like iterating the same item twice or skipping an item completely. You could avoid having the re-check the number of items by iterating it backwards. As long as other thread(s) are only calling Add() and not Remove() that would behave similarly to ToArray(), you'll get the stale view. Not that this will work in practice, indexing the list is not thread-safe either. List<> will reallocate its internal array if necessary. This just won't work and malfunction in unpredictable ways.

There are two points of view here. You can be terrified and follow common wisdom, you'll get a program that works but might not be optimal. That's wise and keeps the boss happy. Or you can experiment and find out for yourself how skewing the rules gets you in trouble. Which will make you happy, you'll be a much better programmer. But your productivity is going to suffer. I don't know what your schedule looks like.

Hans Passant
+1: Best answer.
Brian Gideon
+2  A: 

You could also consider using an immutable data structure - treat your list like a value type.

If it's possible, using Immutable objects can be an excellent choice for multi-threaded programming because they remove all the clunky locking semantics. Essentially any operations that would change the state of the object creates an entirely new object.

e.g. I whipped up the following to demonstrate the idea. I'll apologize that it's by no means reference code, and it started to get a bit long.

public class ImmutableWidgetList : IEnumerable<Widget>
{
    private List<Widget> _widgets;  // we never modify the list

    // creates an empty list
    public ImmutableWidgetList()
    {
        _widgets = new List<Widget>();
    }

    // creates a list from an enumerator
    public ImmutableWidgetList(IEnumerable<Widget> widgetList)
    {
        _widgets = new List<Widget>(widgetList);
    }

    // add a single item
    public ImmutableWidgetList Add(Widget widget)
    {
        List<Widget> newList = new List<Widget>(_widgets);

        ImmutableWidgetList result = new ImmutableWidgetList();
        result._widgets = newList;
        return result;
    }

    // add a range of items.
    public ImmutableWidgetList AddRange(IEnumerable<Widget> widgets)
    {
        List<Widget> newList = new List<Widget>(_widgets);
        newList.AddRange(widgets);

        ImmutableWidgetList result = new ImmutableWidgetList();
        result._widgets = newList;
        return result;
    }

    // implement IEnumerable<Widget>
    IEnumerator<Widget> IEnumerable<Widget>.GetEnumerator()
    {
        return _widgets.GetEnumerator();
    }


    // implement IEnumerable
    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
    {
        return _widgets.GetEnumerator();
    }
}
  • I included IEnumerable<T> to allow for a foreach implementation.
  • You mentioned you worried about the space/time performance of creating new lists, so perhaps this won't work for you.
  • you might also want to implement IList<T>
Robert Paulson
+1 the best for me.
onof