Let's say you have a simple operation that runs on a background thread. You want to provide a way to cancel this operation, so you create a boolean flag that you set to true from the click event handler of a cancel button.

private bool _cancelled;

private void CancelButton_Click(object sender, EventArgs e)
{
    _cancelled = true;
}

Now you're setting the cancel flag from the GUI thread, but you're reading it from the background thread. Do you need to lock before accessing the bool?

Would you need to do this (and obviously lock in the button click event handler too):

while(operationNotComplete)
{
    // Do complex operation

    lock(_lockObject)
    {
        if(_cancelled)
        {
            break;
        }
    }
}

Or is it acceptable to do this (with no lock):

while(!_cancelled && operationNotComplete)
{
    // Do complex operation
}

Or what about marking the _cancelled variable as volatile? Is that necessary?
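
For concreteness, a minimal sketch of what that third option might look like (same field and loop names as the snippets above; the DoWork method name is just illustrative):

// The field is marked volatile so each check in the loop reads the field
// itself rather than a value cached in a register.
private volatile bool _cancelled;

private void CancelButton_Click(object sender, EventArgs e)
{
    _cancelled = true;   // written from the GUI thread
}

private void DoWork()
{
    while(!_cancelled && operationNotComplete)
    {
        // Do complex operation
    }
}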

[I know there is the BackgroundWorker class with its built-in CancelAsync() method, but I'm interested in the semantics and use of locking and threaded variable access here, not the specific implementation; the code is just an example.]

There seem to be two theories.

1) Because it is a simple built-in type (and access to built-in types is atomic in .NET), and because we are only writing to it in one place and only reading it on the background thread, there is no need to lock or mark it as volatile.
2) You should mark it as volatile because if you don't, the compiler may optimise out the read in the while loop because it thinks nothing is capable of modifying the value.

Which is the correct technique? (And why?)

[Edit: There seem to be two clearly defined and opposing schools of thought on this. I am looking for a definitive answer on this so please if possible post your reasons and cite your sources along with your answer.]

+5  A: 

_cancelled must be volatile (if you don't choose to lock).

If one thread changes the value of _cancelled, other threads might not see the updated result.

Also, I think the read/write operations of _cancelled are atomic:

Section 12.6.6 of the CLI spec states: "A conforming CLI shall guarantee that read and write access to properly aligned memory locations no larger than the native word size is atomic when all the write accesses to a location are the same size."

bruno conde
Essentially the .NET memory model allows non-volatile reads/writes to be reordered as long as that change cannot be noticed from the point of view of a single thread.
Mitch Wheat
@Mitch: but in this case this means it can screw things up.
EFraim
Thanks to those who pointed out my mistake! Sorry for any confusion spread....
Mitch Wheat
Just as a point of correction, locking will have no effect upon whether a read or write to a variable is cached. Locking is NOT a replacement for a volatile variable.
Adam Robinson
atomicity has little to do with it... if the other thread is still chewing on its own registers then it makes zero difference whether you write the value in 1 chunk or in 26.
Marc Gravell
@Adam - the lock acts, I believe, as a memory barrier - serving the same purpose.
Marc Gravell
@Marc: Monitor.Enter (what lock() does, as you know but others may not) only acquires an exclusive lock, at least according to the documentation. I'm not sure how it could serve as a memory barrier, at least certainly on anything other than what is being locked. The logic required to do that would have to be forward-looking (to see WHAT needs to be protected, since "flushing" all cache and writes would seem to be inefficient) and at a casual glance doesn't really seem possible to my admittedly small mind ;) Is there information somewhere that confirms this?
Adam Robinson
Simon P Stevens
@Simon: Marc may have more information on this, but from my limited knowledge on the subject, a lock (which is only to be applied to a reference type) is just a means by which you can easily ensure that only one thread is accessing a particular reference at a time. A volatile field is used to ensure that reads and writes to that field (assigning the value directly) go against the actual memory location and not a thread-specific cache.
Adam Robinson
Using lock() yields the same guarantees as making a field volatile.
Daniel Brückner
@Marc, Daniel: Thanks for educating me on this. After doing some reading, I realize now that the semantics of the lock (in that it performs a volatile read on the locked object at the start, then a volatile write at the end) mean that the value of the referenced variable MUST be reread at least once after the lock is initiated. I still think it's probably a good idea to mark a variable that's used in this manner as volatile so that it's painfully clear (and some subtlety of the jitter doesn't bite you, as I'm sure is bound to happen someday). Thanks!
Adam Robinson
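
As a sketch of the lock-based alternative these comments discuss, one common pattern is to wrap the flag in a property that always reads and writes it under a lock (names are illustrative, not taken from any answer above):

private readonly object _lockObject = new object();
private bool _cancelled;

private bool Cancelled
{
    // Monitor.Enter/Exit act as memory barriers, so the value cannot be
    // cached across the lock; this gives the same visibility as volatile.
    get { lock(_lockObject) { return _cancelled; } }
    set { lock(_lockObject) { _cancelled = value; } }
}
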
+2  A: 

For thread synchronization, it's recommended that you use one of the EventWaitHandle classes, such as ManualResetEvent. While it's marginally simpler to employ a simple boolean flag as you do here (and yes, you'd want to mark it as volatile), IMO it's better to get into the practice of using the threading tools. For your purposes, you'd do something like this...

private System.Threading.ManualResetEvent threadStop;

void StartThread()
{
    // do your setup

    // instantiate it unset
    threadStop = new System.Threading.ManualResetEvent(false); 

    // start the thread
}

In your thread..

while(!threadStop.WaitOne(0) && !operationComplete)
{
    // work
}

Then in the GUI to cancel...

threadStop.Set();
Adam Robinson
An OS primitive is usually overkill for a single app, IMO. Most things can be done with just a `Monitor`.
Marc Gravell
@Marc: While a *ResetEvent may not be explicitly required here, I don't think I'd be willing to say that it's overkill for a single application. There are many instances where multiple threads within one application may need to use the functionality of a EventWaitHandle like ManualResetEvent.
Adam Robinson
+1  A: 

Look up Interlocked.Exchange(). It atomically writes a new value and returns the previous value into a local variable, which can be used for comparison. It is faster than lock().

hughdbrown
It's not the speed considerations; it's the correctness....
Mitch Wheat
+1, not always the best but it is one of the three (lock/interlocked/volatile) possibilities.
Henk Holterman
Interlocked.Exchange() would not be my first choice for a variable used to control a loop. I'd use an Event/WaitHandle. If you need to get a single integer/boolean variable atomically, it is a good choice.
hughdbrown
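
For reference, a rough sketch of how the Interlocked-based approach mentioned in this answer could be applied to a cancel flag (an illustrative Worker class, not code from any of the answers; Interlocked.Exchange has no bool overload, so an int is used as the flag):

using System.Threading;

class Worker
{
    // 0 = running, 1 = cancelled; Interlocked works on int, not bool.
    private int _cancelled;

    // Placeholder for the real completion condition from the question.
    private bool operationNotComplete = true;

    public void Cancel()
    {
        // Atomic write with a full memory barrier.
        Interlocked.Exchange(ref _cancelled, 1);
    }

    public void Run()
    {
        while(operationNotComplete)
        {
            // Do complex operation

            // CompareExchange(ref x, 0, 0) returns the current value without
            // changing it (if it is 0 it is "replaced" with 0, otherwise the
            // compare fails), which amounts to an atomic, non-cached read.
            if(Interlocked.CompareExchange(ref _cancelled, 0, 0) == 1)
            {
                break;
            }
        }
    }
}
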
+3  A: 

Locking is not required because you have a single-writer scenario, and a boolean field is a simple type with no risk of corrupting state (it is not possible to end up with a boolean value that is neither true nor false). But you do have to mark the field as volatile to prevent the compiler from performing certain optimizations. Without the volatile modifier the compiler could cache the value in a register during the execution of your loop on the worker thread, and in turn the loop would never notice the changed value. This MSDN article addresses the issue. While there is no need for locking here, a lock would have the same effect as marking the field volatile.

Daniel Brückner
That's right. Since you only ever read or set the value of this flag, all you need is the `volatile` modifier.
jpbochi
But in theory you could have quite a weird scenario if the update of the boolean field were not atomic and a partial update resulted in a value that is neither true nor false, though this would cause no problems in the given usage scenario.
Daniel Brückner
@Daniel - a bool update is guaranteed to be atomic by the spec.
Marc Gravell
ECMA-334 v4 §12.5 Atomicity of variable references: "Reads and writes of the following data types shall be atomic: bool, char, byte, sbyte, short, ushort, uint, int, float, and reference types. In addition, reads and writes of enum types with an underlying type in the previous list shall also be atomic."
Marc Gravell
Thanks for that. I just had a look at RFC 2119 and noticed that I tended to interpret "shall" like "should" because it is similar to the German word for "should".
Daniel Brückner
+9  A: 

Firstly, threading is tricky ;-p

Yes, despite all the rumours to the contrary, it is required to either use lock or volatile (but not both) when accessing a bool from multiple threads.

For simple types and access such as an exit flag (bool), volatile is sufficient - this ensures that threads don't cache the value in their registers (which would mean one thread never sees another thread's updates).

For larger values (where atomicity is an issue), or where you want to synchronize a sequence of operations (a typical example being "if not exists and add" dictionary access), a lock is more versatile. This acts as a memory-barrier, so still gives you the thread safety, but provides other features such as pulse/wait. Note that you shouldn't use a lock on a value-type or a string; nor Type or this; the best option is to have your own locking object as a field (readonly object syncLock = new object();) and lock on this.

For an example of how badly it breaks (i.e. looping forever) if you don't synchronize - see here.
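
(The example behind that link isn't reproduced on this page; a rough stand-alone sketch of the same failure mode, using a hypothetical console Program rather than the linked code, would be something like this:)

using System;
using System.Threading;

class Program
{
    // Without volatile here, a Release build run outside the debugger may never
    // exit the loop: the JIT is free to hoist the read of _stop out of the loop
    // because nothing in the loop body appears to change it. Adding the volatile
    // modifier (or reading the flag under a lock) fixes that.
    static bool _stop;

    static void Main()
    {
        var worker = new Thread(() =>
        {
            while(!_stop)
            {
                // spin
            }
            Console.WriteLine("Worker saw _stop and exited.");
        });
        worker.Start();

        Thread.Sleep(500);
        _stop = true;        // set from the "GUI" thread
        worker.Join();       // may hang forever without volatile
        Console.WriteLine("Done.");
    }
}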

To span multiple programs, an OS primitive like a Mutex or *ResetEvent may also be useful, but this is overkill for a single exe.

Marc Gravell
I think this settles the issue. A reproducible example there of how it can go wrong if you don't use lock or volatile.
Simon P Stevens