tags:

views:

1380

answers:

6

I have a situation in C# where I have a list of simple types. This list can be accessed by multiple threads: entries can be added or removed, and the existence of an entry can be checked. I have encapsulated the list in an object exposing just those three operations so far.

I have a few cases to handle (not exactly the same as the methods I just mentioned).
1. A thread can just check for the existence of an entry. (simple)
2. A thread can check for the existence of an entry, and if it doesn't exist, add it.
3. A thread needs to check whether an entry exists, and if it does, wait until it is removed.
4. A combination of 2 and 3, where a thread checks for the existence of an entry, if it does exist, it must wait until it is removed before it can then add it itself.

The whole idea is that the existence of an entry signifies a lock. If an entry exists, the object it identifies cannot be changed and code cannot proceed because it is being modified elsewhere.

These may seem like simple novice situations but I'm refreshing myself on concurrency issues and it's making me a bit paranoid, and I'm also not as familiar with C#'s concurrency mechanisms.

What would be the best way to handle this? Am I totally off? Should check and add (test and set?) be combined into a fourth atomic operation? Would I simply be adding lock blocks to my methods where the list is accessed?

Also, is it possible to unit test this kind of thing (not the simple operations, the concurrency situations)?

A: 

The parent post presents a number of race conditions.

A much simpler (and safer) solution would be to use the lock statement. This will ensure the object that is shared between threads is locked while an individual thread works on it.

lock (SomeObjectInstance)
{
   //Do some stuff with SomeObjectInstance
}

Additionally you can use the Interlocked class to perform simple, thread-safe operations like increment, decrement, add, and subtract on numeric value types.

Interlocked.Increment( ref x );
William Edmondson
'above' is not a very good way to designate an answer here. Better mention a name.
Henk Holterman
+3  A: 

Unit testing will certainly be hard.

This can all be done reasonably simply with the "native" concurrency mechanisms in .NET: lock statements and Monitor.Wait/Monitor.PulseAll. Unless you have a separate monitor per item though, you're going to need to wake all the threads up whenever anything is removed - otherwise you won't be able to tell the "right" thread to wake up.

If it really is just a set of items, you might want to use HashSet<T> instead of List<T> to represent the collection, by the way - nothing you've mentioned is to do with ordering.

Sample code, assuming that a set is okay for you:

using System;
using System.Collections.Generic;
using System.Threading;

public class LockCollection<T>
{
    private readonly HashSet<T> items = new HashSet<T>();
    private readonly object padlock = new object();

    public bool Contains(T item)
    {
        lock (padlock)
        {
            return items.Contains(item);
        }
    }

    public bool Add(T item)
    {
        lock (padlock)
        {
            // HashSet<T>.Add does what you want already :)
            // Note that it will return true if the item
            // *was* added (i.e. !Contains(item))
            return items.Add(item);
        }
    }

    public void WaitForNonExistence(T item)
    {
        lock (padlock)
        {
            while (items.Contains(item))
            {
                Monitor.Wait(padlock);
            }
        }
    }

    public void WaitForAndAdd(T item)
    {
        lock (padlock)
        {
            WaitForNonExistence(item);
            items.Add(item);
        }
    }

    public void Remove(T item)
    {
        lock (padlock)
        {
            if (items.Remove(item))
            {
                Monitor.PulseAll(padlock);
            }
        }
    }
}

(Completely untested, admittedly. You might also want to specify timeouts for the waiting code...)

Jon Skeet
+1, I came to a similar solution but the HashSet makes it more efficient. Nice demo for the Monitor class, using nested Locks etc. Only problem could be a 'stampede' when an item is removed.
Henk Holterman
+6  A: 

While #1 may be the simplest to write, it's essentially a useless method. Unless you are holding onto the same lock after finishing a query for "existence of an entry", you are actually returning "existence of an entry at some point in the past". It doesn't give you any information about the current existence of the entry.

In between the discovery of a value in the list then doing any operation to retrieve, remove the value, another thread could come and remove it for you.

Contains operations on a concurrent list should be combined with the operation you plan on doing in the case of true/false existence of that check. For instance TestAdd() or TestRemove() is much safer than Contains + Add or Contains + Remove

JaredPar
You should link to your blog post on thread safe collection interfaces.
Joel Coehoorn
Yes, but #4 does just that. The usability of the other methods is doubtful, but not necessarily wrong.
Henk Holterman
@Henk, I think #2-#4 are good methods. #1 though is not wrong but it's just not useful.
JaredPar
Well, here's the thing. There are some cases where if an entry exists, I need to simply skip some code and keep going.
Matt
@Matt, the problem is you can't think of the element as existing. You must think of it as "used to have existed"
JaredPar
Alright, so what should I do instead? The scenario I'm talking about is where a user can manually lock an object when editing, but a modification request can come in for the same object, and it should be ignored. Only if the object is not locked should the update request be processed.
Matt
@Matt, I'm not entirely sure I understand the scenario but would a TryUpdate be sufficient?
JaredPar
A: 

You should preferrably make each task into an atomic operation so that you can handle the locking inside the encapsulating object.

For the locking you just use the lock keyword, and an object that is private for the encapsulating object as identifier.

Example:

public class ConcurrentList<T> {

   private object _sync = new object();
   private List<T> _list = new List<T>();

   public bool Contains(T value) {
      lock (_sync) {
         return _list.Contains(value);
      }
   }

   public bool AddIfNotExists(T value) {
      lock (_sync) {
         if (_list.Contains(value)) {
            return false;
         } else {
            _list.Add(value);
            return true;
         }
      }
   }

   public void WaitUntilRemoved(T value) {
      while (Contains(value)) {
         Tread.Sleep(0);
      }
   }

   public void WaitToAdd(T value) {
      do {
         WaitUntilRemoved(value);
      } while (!AddIfNotExists(value));
   }

}

As concurrency problems is impossible to reproduce with certainty, it's impossible to make a unit test for them.

Guffa
Sleep(0) is a terribly inefficient way to wait. And the WaitToAdd should use a Lock so that it doesn't have to try again.
Henk Holterman
@Henk: Why do you think that Sleep(0) would be inefficient? The WaitToAdd can't have a lock, then it would wait forever as no other thread could remove the item that it's waiting for to be removed.
Guffa
Just take a look at Jon's answer, Monitor.Wait and .Pulse solve this effectively. And recursive locks are granted, so WaitAndAdd can use lock and then call WaitUntilRemoved that also locks.
Henk Holterman
@Henk: Yes, that is a different way of solving it. Why are you so afraid of having the code with the retry? And you still haven't explained why you think that Sleep(0) would be inefficient...
Guffa
Sleep(0) means "busy waiting", a lot of wasted CPU time. Sleep() isn't cheap, some sort of spin-wait could be better. But there is no indication items will disappear quickly, so Monitor.Wait is very much superior. The retry isn't all that bad, just unnecessary.
Henk Holterman
@Henk: No, Sleep(0) isn't busy while waiting. It yields the rest of the time slice. If you think that Sleep works like a timer, I understand if you don't understand what the code does.
Guffa
I understand it to be a relatively expensive way (going to unmanaged code) to give up the timeslice, and that it could wake up/give up rather often, wasting a lot of threadswitches. If you do insist on Slepp, use Sleep(1). See http://www.bluebytesoftware.com/blog/PermaLink,guid,1c013d42-c983-4102-9233-ca54b8f3d1a1.aspx
Henk Holterman
What is extremely inefficient here is not really the Sleep(0) or Sleep(1), it's the constant calls to Contains. That's what makes it "busy waiting", it wastes a lot of CPU time to constantly check if the item is present in the list. And that's in addition to the inefficiency of having that thread's context being switched in and out while waiting. There's really no reason to use that kind of busy waiting instead of Monitor.Wait / Monitor.Pulse.
Guillaume Poirier
+1  A: 

There is a product for finding race conditions and suchlike in unit tests. It's called TypeMock Racer. I can't say anything for or against its effectiveness, though. :)

Rytmis
A: 

here is a proper, concurrent, thread-safe, parallelisable concurrent list implementation http://www.deanchalk.me.uk/post/Task-Parallel-Concurrent-List-Implementation.aspx

Dean Chalk