views:

89

answers:

3

I have a class that contains some data and there are many threads use it:

class MyClass
{
    static Dictionary<Key, Value> MyData;

    static IEnumerable<Data> Data
    {
        get
        {
            return MyData.Values;
        }
    }

    static void Reset()
    {
         MyData = GetMyData();
    }
}

Sometime (say once in a day) the Reset method is called. I don't want to add locking because of performance, but not sure if everything will work without it.

The question is: should I use any type of locking in this class?

+4  A: 

If it is going to be called by multiple threads then: yes.

A Monitor (the type that a lock statement uses) has very low overhead if there is little contention. Create a private Object instance to use with the lock and ensure all access paths are protected.

Another possibility (on .NET 4) is a ConcurrentDictionary

Edit Additional: I note that the Data property returns an IEnumerable over the internal content of your Dictionary. This means that an iteration over the dictionary's values could be ongoing when other modifying methods are called. Even with internal locking you would have concurrency problems. There are two approaches:

  1. Move the locking into the callers of this type. If it is an internal or helper type (rather than exposed as part of an API) this could be a workable approach, but it makes the burden of ensuring correct locking more difficult.

  2. Make a copy of the values, and return that:

    static IEnumerable<Data> Data {
      get {
          return MyData.Values.ToArray();
      }
    }
    
Richard
Can you please comment Allon's answer as it's opposite to your one?
Kamarey
@Kamarey: will do.
Richard
+1  A: 

I disagree with Richard. You seem to use the Dictionary<,> in an immutable fashion, never changing the content. It is returned completely populated from the GetMyData() method and simply replaces a reference on MyData, while only exposing the immutable Values property. Even if someone is enumerating over the dictionary while someone else calls Reset(), the person will continue enumerating over the old dictionary while any new reader will get the new dictionary. So your code is completely thread-safe, thanks to the fact that that you use data structures as if they were immutable. You might want to wrap it in a real immutable dictionary (e.g. from here) If, on the other hand, there are different usages in other parts of the code that we can't see, then it would be a different matter.

EDIT: I just noticed that both MyData and Data are private (then why even bother with an extra property?), so you're also might be using MyData to modify the dictionary. In that case, as long as all the dictionary read/writes happen on the same thread, but the Reset() method is called from a different thread, then you're okay. If, on the other hand, you're modifying the collection on one thread while reading from it on another thread, then you'll have to perform manual synchronization as Richard mentioned. The point is that the Reset() method is never the issue, since it replaces the reference of the dictionary with a new instance without affecting the old instance.

Allon Guralnek
You got my point right. I asked this question because of I'm not sure if the property Data is thread safe (how it will work, if while enumerating values the Reset will be called?). But because of no one modifies the MyData except the Reset method, which only replace reference to a new memory, I doubt if it's safe to skip locking. And the Data property and Reset method actually public, just didn't specified it here.
Kamarey
@Kamarey: If no one modifies `MyData` except the `Reset()` method, then **it is** safe to forgo using locks. That is the whole rational behind using immutable objects - that they are intrinsically thread safe.
Allon Guralnek
A quick note on why I answered as I did. As shown the dictionary will never have any data (the only modifying operation is `Clear`, so therefore I *assumed* this was just an excerpt and answered in a general fashion. In general with threading, unless you are very knowledgeable (e.g. difference being acquire and release semantics of partial memory barriers is in your vocabulary) it is far better to be extremely conservative. Loss of performance in hot code can be handled later in the unlikely case it is an issue while concurrency bugs can be extremely difficult to fix.
Richard
@Richard: Actually, there is no modifying operation at all. And again, I don't agree - you don't have to be very knowledgeable to understand the simple concept that immutable objects are thread-safe. It is actually *harder* to get synchronization right than to simply avoid the issue wholesale by using immutable objects. E.g. Most threading newbies still don't understand what the `lock` keyword locks, or rather that it doesn't lock anything at all. Of course you still have to use locks if you want to have an atomic operation that is larger than an assignment, but that is not the case here.
Allon Guralnek
@Richard: agree with your arguments, but in this given case Allon is right. Also did some tests and found that it works without locking as expected.
Kamarey
@Allon: I'm not a guru in threading, but I know that except locking itself, locks (or monitors) also define memory barriers, which is important on multi core/cpu systems. But looks that it works without locking...
Kamarey
@Kamarey: Testing isn't a good yardstick to tell if a piece of code is thread-correct or not. It can sometimes be very hard to locate threading-related bugs via testing, since they depend on the timing of the scheduler, which you can't control (except with certain tools such as [CHESS](http://research.microsoft.com/CHESS/)), and so latent bugs can easily slip through. Instead, a thorough understanding of the issues involved will let you take matters into your own hand. For example, you could read up on what threading problems immutable objects solve and what problems they don't solve.
Allon Guralnek
+1  A: 

Yes. You should make the code safe for multithreaded operations. The reason is because it is standard practice to make all static members thread-safe whether you expect the code to be run in multithreaded environment or not. Your current code has a staleness problem. That is one thread may call Reset, but other threads calling Data may never see the change. This can be fixed easily by marking MyData as volatile.

Update:

The issue of staleness that I am talking about is related to how the C# and JIT compilers optimize code. For example, consider two threads TA and TB. TA calls MyClass.Data and thus reads the MyData reference while TB changes the MyData reference by calling Reset. The compiler is free to reorder the reads and writes of MyData by lifting them outside of loops, caching them in a CPU register, etc. What this means is that TA might miss a change in MyData if it caches the reference in a CPU register or if TB does not commit its write to main memory immediately.

This is not just some theoretical problem that rarely occurs in practice. In fact, it is quite easy to demonstrate with the following program. Make sure you compile the code in the Release configuration and run it without the debugger (both are mandatory to reproduce the problem). A cursory glance suggests that this program should terminate in about 1 second, but alas it does not because m_Stop is being cached by the worker thread and never sees that the main thread changed its value to true. The simple fix is mark m_Stop as volatile. You can see my explanation here for information about the concept of memory barriers and the reordering of instructions.

public class Program
{
    private static bool m_Stop = false;

    public static void Main(string[] args)
    {
        var thread = new Thread(
            () =>
            {
                int i = 0;
                Console.WriteLine("begin");
                while (!m_Stop)
                {
                    i++;
                }
                Console.WriteLine("end");
            });
        thread.Start();
        Thread.Sleep(1000);
        m_Stop = true;
        Console.WriteLine("exit");
    }
}
Brian Gideon
Can you explain why do you say that Data property may never see changes?
Kamarey
@Kamarey: I updated my answer.
Brian Gideon