In a multi-threaded program running on a multi-CPU machine, do I need to access shared state (_data in the example code below) using volatile reads/writes to ensure correctness?

In other words, can heap objects be cached on a CPU?

Using the example below, and assuming multiple threads will access the GetValue and Add methods, I need ThreadA to be able to add data (using the Add method) and ThreadB to be able to see/get that added data immediately (using the GetValue method). So do I need to add volatile reads/writes to _data to ensure this? Basically, I don't want the added data to be cached on ThreadA's CPU.

I am not locking (enforcing exclusive thread access) because the code needs to be ultra-fast, and since I am not removing any data from _data I don't need to lock _data.

Thanks.

** Update **************************

Obviously you guys think going lock-free with this example is a bad idea. But what side effects or exceptions could I actually face here?

Could the Dictionary type throw an exception if one thread is iterating the values for a read while another thread is updating them? Or would I just experience “dirty reads” (which would be fine in my case)?

** End Update **************************

public sealed class Data
{
    private volatile readonly Dictionary<string, double> _data = new Dictionary<string, double>();

    public double GetValue(string key)
    {
        double value;
        if (!_data.TryGetValue(key, out value))
        {
            throw new ArgumentException(string.Format("Key {0} does not exist.", key));
        }
        return value;
    }

    public void Add(string key, double value)
    {
        _data.Add(key, value);
    }

    public void Clear()
    {
        _data.Clear();
    }
}
+1  A: 

I don't think volatile can be a replacement for locking once you start calling methods on the object. You are guaranteeing that thread A and thread B see the same dictionary instance, but both threads can still access that dictionary simultaneously. You can use multi-mode locks to increase concurrency. See ReaderWriterLockSlim, for example:

Represents a lock that is used to manage access to a resource, allowing multiple threads for reading or exclusive access for writing.
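
For example, a rough sketch of what that could look like for the class in the question (untested; note that ReaderWriterLockSlim requires .NET 3.5 or later):

using System;
using System.Collections.Generic;
using System.Threading;

public sealed class Data
{
    private readonly Dictionary<string, double> _data = new Dictionary<string, double>();
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();

    public double GetValue(string key)
    {
        // Many threads can hold the read lock at the same time.
        _lock.EnterReadLock();
        try
        {
            double value;
            if (!_data.TryGetValue(key, out value))
            {
                throw new ArgumentException(string.Format("Key {0} does not exist.", key));
            }
            return value;
        }
        finally
        {
            _lock.ExitReadLock();
        }
    }

    public void Add(string key, double value)
    {
        // Writers get exclusive access, so the dictionary is never
        // re-organised while a reader is looking at it.
        _lock.EnterWriteLock();
        try
        {
            _data.Add(key, value);
        }
        finally
        {
            _lock.ExitWriteLock();
        }
    }

    // Clear() would follow the same EnterWriteLock/ExitWriteLock pattern.
}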

eed3si9n
+8  A: 

The MSDN docs for Dictionary<TKey, TValue> say that it's safe for multiple readers but they don't give the "one writer, multiple readers" guarantee that some other classes do. In short, I wouldn't do this.

You say you're avoiding locking because you need the code to be "ultra-fast" - have you tried locking to see what the overhead is? Uncontested locks are very cheap, and when the lock is contested that's when you're benefiting from the added safety. I'd certainly profile this extensively before deciding to worry about the concurrency issues of a lock-free solution. ReaderWriterLockSlim may be useful if you've actually got multiple readers, but it sounds like you've got a single reader and a single writer, at least at the moment - simple locking will be easier in this case.
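
For comparison, a minimal sketch of the simple-lock version (untested, and needing nothing beyond .NET 2.0):

using System;
using System.Collections.Generic;

public sealed class Data
{
    private readonly Dictionary<string, double> _data = new Dictionary<string, double>();
    private readonly object _sync = new object();

    public double GetValue(string key)
    {
        lock (_sync)   // uncontested, this costs very little
        {
            double value;
            if (!_data.TryGetValue(key, out value))
            {
                throw new ArgumentException(string.Format("Key {0} does not exist.", key));
            }
            return value;
        }
    }

    public void Add(string key, double value)
    {
        lock (_sync)
        {
            _data.Add(key, value);
        }
    }

    public void Clear()
    {
        lock (_sync)
        {
            _data.Clear();
        }
    }
}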

Jon Skeet
A: 

Thanks for the replies. Regarding the locks: the methods are pretty much constantly called by multiple threads, so my concern is with contested locks, not the cost of the lock operation itself.

So my question is about CPU caching: can heap objects (the _data instance field) be cached on a CPU? Do I need to access the _data field using volatile reads/writes?

Also, I am stuck with .NET 2.0.

Thanks for your help.

It's generally better to update the question rather than add "pseudo-answers" like this. Having said that, I stand by my suggestion that you should profile your app and check that it really is an issue when you lock before you try to go lock-free - especially given MSDN's lack of promised safety.
Jon Skeet
Writing lock free code _is_ rocket science. Even if you test it extensively, you'll not have proven its correctness. I agree with Jon, prove that you need lock free code with profiling before you open that can of worms.
Greg D
Yes, all memory can (and will) be cached by a CPU. But if you're asking questions like that, you should either not worry about it or do some serious reading on how computers work. No, you don't need volatile reads/writes - you need locking; see my answer.
Eloff
+2  A: 

The problem is that your Add method:

public void Add(string key, double value)
{
    _data.Add(key, value);
}

could cause _data to decide to completely re-organise the data it's holding - and at that point a concurrent GetValue request could fail in any possible way.

You need a lock or a different data structure / data structure implementation.
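
To make that concrete, here is a small illustration-only console sketch that hammers an unsynchronised Dictionary<string, double> from one writer thread and one reader thread. On a multi-core machine it will typically throw, return garbage, or hang at some point, although any single run may happen to get lucky:

using System;
using System.Collections.Generic;
using System.Threading;

class UnsafeDictionaryDemo
{
    static readonly Dictionary<string, double> _data = new Dictionary<string, double>();

    static void Writer()
    {
        // Keep adding entries; each Add may trigger an internal resize/re-hash.
        for (int i = 0; i < 1000000; i++)
        {
            _data.Add("key" + i, i);
        }
    }

    static void Reader()
    {
        // Read concurrently with the writer, with no lock, so a read can
        // observe the dictionary mid-reorganisation.
        for (int i = 0; i < 1000000; i++)
        {
            double value;
            _data.TryGetValue("key" + (i % 1000), out value);
        }
    }

    static void Main()
    {
        Thread writer = new Thread(Writer);
        Thread reader = new Thread(Reader);
        writer.Start();
        reader.Start();
        writer.Join();
        reader.Join();
        Console.WriteLine("Finished without an exception this time - but that proves nothing.");
    }
}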

Douglas Leeder
+1  A: 

I think you may be misunderstanding the use of the volatile keyword (either that or I am, and someone please feel free to correct me). The volatile keyword guarantees that get and set operations on the value of the variable itself from multiple threads will always deal with the same copy. For instance, if I have a bool that indicates a state then setting it in one thread will make the new value immediately available to the other.
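
For instance, a sketch of that flag idea (hypothetical names, untested):

using System;
using System.Threading;

class VolatileFlagDemo
{
    // volatile guarantees the reader sees the latest written value of the
    // field itself; it says nothing about objects the field might point to.
    private static volatile bool _stop;

    static void Main()
    {
        Thread worker = new Thread(delegate()
        {
            while (!_stop)
            {
                // busy work; without volatile the JIT could legally hoist
                // the _stop read out of the loop and spin forever
            }
            Console.WriteLine("Worker observed the flag and exited.");
        });

        worker.Start();
        Thread.Sleep(100);
        _stop = true;   // visible to the worker on its next read of the field
        worker.Join();
    }
}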

However, you never change the value of your variable (in this case, a reference). All that you do is manipulate the area of memory that the reference points to. Declaring it as volatile readonly (which, if my understanding is sound, defeats the purpose of volatile by never allowing it to be set) won't have any effect on the actual data that's being manipulated (the back-end store for the Dictionary<>).

All that being said, you really need to use a lock in this case. Your danger extends beyond the prospect of "dirty reads" (meaning that what you read would have been, at some point, valid) into truly unknown territory. As Jon said, you really need proof that locking produces unacceptable performance before you try to go down the road of lockless coding. Otherwise that's the epitome of premature optimization.

Adam Robinson
A: 

Volatile is not locking; it has nothing to do with synchronization. It's generally safe to do lock-free reads on read-only data - but note that although you never remove anything from _data, you do call _data.Add(). That is NOT read-only. So yes, this code will blow up in your face in a variety of exciting and difficult-to-predict ways.

Use locks, it's simple, it's safer. If you're a lock-free guru (you're not!), AND profiling shows a bottleneck related to contention for the lock, AND you cannot solve the contention issues via partitioning or switching to spin-locks THEN AND ONLY THEN can you investigate a solution to get lock-free reads, which WILL involve writing your own Dictionary from scratch and MAY be faster than the locking solution.

Are you starting to see how far off base you are in your thinking here? Just use a damn lock!
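
For what it's worth, a rough sketch of the partitioning idea (a hypothetical StripedData class, untested): split the keyspace across several stripes, each with its own lock and dictionary, so threads touching different stripes don't contend.

using System;
using System.Collections.Generic;

public sealed class StripedData
{
    private const int StripeCount = 16;
    private readonly Dictionary<string, double>[] _stripes;
    private readonly object[] _locks;

    public StripedData()
    {
        _stripes = new Dictionary<string, double>[StripeCount];
        _locks = new object[StripeCount];
        for (int i = 0; i < StripeCount; i++)
        {
            _stripes[i] = new Dictionary<string, double>();
            _locks[i] = new object();
        }
    }

    // Map a key to one of the stripes using its hash code.
    private int StripeFor(string key)
    {
        return (key.GetHashCode() & 0x7fffffff) % StripeCount;
    }

    public void Add(string key, double value)
    {
        int i = StripeFor(key);
        lock (_locks[i])
        {
            _stripes[i].Add(key, value);
        }
    }

    public bool TryGetValue(string key, out double value)
    {
        int i = StripeFor(key);
        lock (_locks[i])
        {
            return _stripes[i].TryGetValue(key, out value);
        }
    }
}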

Eloff
A: 

The volatile keyword is not about locking; it is used to indicate that the value of the specified field might be changed or read by a different thread, or by something else running concurrently with your code. This is crucial for the compiler to know, because many optimizations involve caching a variable's value in a register and rearranging instructions. The volatile keyword tells the compiler to be "cautious" when optimizing instructions that reference the volatile variable.

For multi-threaded use of a dictionary there are several options. The simplest is the lock keyword, which has adequate performance. If you need higher performance, you might need to implement your own dictionary tailored to your specific task.

tia