views:

607

answers:

6

If I have something like this (pseudocode):

class A
{
    List<SomeClass> list;

    private void clearList()
    {
        list = new List<SomeClass>();
    }

    private void addElement()
    {
        list.Add(new SomeClass(...));
    }
}

is it possible that I run into multithreading problems (or any kind of unexpected behavior) when both functions are executed in parallel?

The use case is a list of errors, which could be cleared at any time (by simply assigning a new, empty list).

EDIT: My assumptions are

  • only one thread adds elements
  • forgotten elements are okay (i.e. race condition between clearing and adding a new element), as long as the clear operation succeeds without problems
  • .NET 2.0
+5  A: 

There are two possibilities for problems here:

  • Newly added items could end up being forgotten immediately, because you clear out and create a new list. Is that an issue? Basically, if AddElement and ClearList are called at the same time, you have a race condition: either the element will end up in the new list, or in the old (forgotten) one.
  • List<T> isn't safe for multi-threaded mutation, so if two different threads call AddElement at the same time the results aren't guaranteed

Given that you're accessing a shared resource, I would personally hold a lock while accessing it. You'll still need to consider the possibility of clearing the list immediately before/after adding an item though.

EDIT: My comment about it being okay if you're only adding from one thread was already somewhat dubious, for two reasons:

  • It's possible (I think!) that you could end up trying to add to a List<T> which hadn't been fully constructed yet. I'm not sure, and the .NET 2.0 memory model (as opposed to the one in the ECMA specification) may be strong enough to avoid that, but it's tricky to say.
  • It's possible that the adding thread wouldn't "see" the change to the list variable immediately, and still add to the old list. Indeed, without any synchronization, it could see the old value forever

When you add "iterating in the GUI" into the mix it gets really tricky - because you can't change the list while you're iterating. The simplest solution to this is probably to provide a method which returns a copy of the list, and the UI can safely iterate over that:

class A
{
    private List<SomeClass> list;
    private readonly object listLock = new object();

    private void ClearList()
    {
        lock (listLock)
        {
            list = new List<SomeClass>();
        }
    }

    private void AddElement()
    {
        lock (listLock)
        {
            list.Add(new SomeClass(...));
        }
    }

    private List<SomeClass> CopyList()
    {
        lock (listLock)
        {
            return new List<SomeClass>(list);
        }
    }

}
Jon Skeet
Only one thread is adding elements, so the second point isn't a problem for me. And I was aware of the race condition, but that's not really important for me - but I'm not calling `Clear` at all, instead I'm creating a new list. Would it be a problem to call `Clear` and `Add` at the same time (you said `List` is not thread-safe)?
AndiDog
Sorry, I meant AddElement and ClearList. Just creating a new list should be okay, and if you're only adding from a single thread that should be all right. Don't forget that there can be further complications when you're reading from the list too - would that only happen in the same thread that's doing the Add?
Jon Skeet
No, another GUI thread should be able to iterate over it. That will be implemented in a few weeks. Can this cause trouble if the GUI uses `foreach(Element e in instanceOfC.list) { ... }` - if I assign a new `List` during this iteration, the GUI will still work on the old list, right?
AndiDog
Sorry I forgot about another couple of concerns - will update when I'm back at home in a few hours.
Jon Skeet
Thanks for the edit. I already thought about copying the list for use by the GUI. And another thing: "Indeed, without any synchronization, it could see the old value forever" - do you have a reference for that? Such caching would be awful.
AndiDog
@AndiDog: It's more of a theoretical problem than a practical one in *most* cases - but basically look at the memory model and see what's actually *guaranteed*. Unless you have appropriate memory barriers, your reads can effectively "move back" (e.g. via caching) forever.
Jon Skeet
+2  A: 

Yes - it is possible,. In fact, if these are genuinely being called at the same time, it is highly likely.

In addition, it is also likely to cause problems if two seperate calls to addElement occur at the same time.

For this sort of multithreading, you really need some sort of mutually exclusive lock around the list itself, so only one operation on the underlying list can be called at a time.

A crude locking strategy around this would help. Something like:

class A
{
    static object myLock = new object()
    List<SomeClass> list;

    private void clearList()
    {
        lock(myLock)
        {
          list = new List<SomeClass>();
        }

    }

    private void addElement()
    {
        lock(myLock)
        {
          list.Add(new SomeClass(...));
        }
    }
}
Rob Levine
A: 

It is properly not a good thing to just make a new List when you want to clear it.

I assume you also assigned list in the constructor so you don't run into a null-pointer exception.

If you clear and elements is added, they can be added to the old list which I assume is fine? BUT if two elements is added at the same time, you can run into problems.

Look into .Net 4 new collections to handle multithreading tasks :)

ADDITION: Look into the namespace System.Collections.Concurrent if you use .Net 4. There you will find: System.Collections.Concurrent.ConcurrentBag<T> and many other nice collections :)

You should also note that lock can significantly poll down performance if you dont watch out.

lasseespeholt
Sorry I didn't mention I am using .NET 2.0 - so are there builtin thread-safe collections in .NET 2.0? Or do I have to use `lock`?
AndiDog
Hhm, instead of stating lock etc. explicit I would make my own class ala ThreadSafeList where the methods is locked. So you don't have to write lock multiple times (and maybe forget).
lasseespeholt
I just found this: http://blogs.msdn.com/jaredpar/archive/2009/02/11/why-are-thread-safe-collections-so-hard.aspx but I have not looked at the code so check it before using it :)
lasseespeholt
+1  A: 

If you use one instance of this class in multiple threads, yes. you will run into problems. All collections in the .Net framework (version 3.5 and lower) are NOT thread-safe. Specially when you start changing the collection while another thread is itterating over it.

Use locking and give out ´copies of´ collections in multithreaded environments, or if you can use .Net 4.0, use the new concurrent collections.

Marvin Smit
A: 

It is clear from the edits to your question that you do not really care about the usual culprits here - there are really no simultaneous calls to the methods of the same object.

Essentially you are asking if it is ok to assign the reference to your list while it is being accessed from a parallel thread.

As far as I understand it still can cause trouble. It all depends on how reference assignment is implemented on the hardware level. To be more precise whether this operation is atomic or not.

I think that as slim as it is there is still a chance, especially in multiprocessor environments, that the process will get corrupted reference because it was only partially updated when it was accessing it.

mfeingold
I don't believe that hardware differences / reference assignment could cause trouble here. As far as I understand the C# specs (Section 5.5: http://msdn.microsoft.com/en-us/library/aa691278%28VS.71%29.aspx), reference assignment shall be atomic.
AndiDog
+1  A: 

Collections in .NET (up to 3.5) are not thread-safe or non-blocking (parallel execution). You should implement yours by deriving from IList and use a ReaderWriterLockSlim for performing every action. For example, your Add method should look like this:

    public void Add(T item)
    {
        _readerWriterLockSlim.EnterWriteLock();
        try { _actualList.Add(item); }
        finally { _readerWriterLockSlim.ExitWriteLock(); }
    }

You must be aware of some concurrency tricks here. For example you must have a GetEnumerator which returns a new instance as an IList; not the actual list. Otherwise you will run into problems; which should look like:

    public IEnumerator<T> GetEnumerator()
    {
        List<T> localList;

        _lock.EnterReadLock();
        try { localList= new List<T>(_actualList); }
        finally { _lock.ExitReadLock(); }

        foreach (T item in localList) yield return item;
    }

and:

    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
    {
        return ((IEnumerable<T>)this).GetEnumerator();
    }

Note: When implementing thread-safe or parallel collections (and in fact every other class) DO NOT DERIVE FROM THE CLASS, BUT INTERFACE! Because there will be always problems related to internal structure of that class or some methods that are not virtual and you have to hide them and so on. If you have to do this, do it very carefully!

Kaveh Shahbazian