views: 446

answers: 3

It is generally accepted (I believe!) that a lock will force any values from fields to be reloaded (essentially acting as a memory-barrier or fence - my terminology in this area gets a bit loose, I'm afraid), with the consequence that fields that are only ever accessed inside a lock do not themselves need to be volatile.

(If I'm wrong already, just say!)

A good comment was raised here, questioning whether the same is true if code does a Wait() - i.e. once it has been Pulse()d, will it reload fields from memory, or could they be in a register (etc).

Or more simply: does the field need to be volatile to ensure that the current value is obtained when resuming after a Wait()?

Looking at Reflector, Wait calls down into ObjWait, which is a managed internalcall (the same as Enter).

The scenario in question was:

bool closing;
public bool TryDequeue(out T value) {
    lock (queue) { // arbitrary lock-object (a private readonly ref-type)
        while (queue.Count == 0) {
            if (closing) {       // <==== (2) access field here
                value = default(T);
                return false;
            }
            Monitor.Wait(queue); // <==== (1) waits here
        }
        ...blah do something with the head of the queue
    }
}

Obviously I could just make it volatile, or I could move this out so that I exit and re-enter the Monitor every time it gets pulsed, but I'm intrigued to know if either is necessary.
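
For concreteness, a rough sketch of that second alternative - exiting and re-entering the Monitor around every wake-up - assuming the same `queue` (a `Queue<T>`) and `closing` fields as above; the method name is invented just for illustration:

public bool TryDequeueReacquiring(out T value) {
    while (true) {
        lock (queue) {
            if (queue.Count > 0) {
                value = queue.Dequeue();      // take the head under the lock
                return true;
            }
            if (closing) {
                value = default(T);
                return false;
            }
            Monitor.Wait(queue);              // Wait still has to be called inside the lock...
        }
        // ...but we drop the lock here and re-enter it at the top of the loop,
        // so every check of closing / queue.Count happens just after a fresh Monitor.Enter
    }
}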

+13  A: 

Since the Wait() method releases and reacquires the Monitor lock, if acquiring and releasing a lock provides memory-fence semantics, then Monitor.Wait() will as well.

To hopefully address your comment:

The locking behavior of Monitor.Wait() is spelled out in the docs (http://msdn.microsoft.com/en-us/library/aa332339.aspx):

When a thread calls Wait, it releases the lock on the object and enters the object's waiting queue. The next thread in the object's ready queue (if there is one) acquires the lock and has exclusive use of the object. All threads that call Wait remain in the waiting queue until they receive a signal from Pulse or PulseAll, sent by the owner of the lock. If Pulse is sent, only the thread at the head of the waiting queue is affected. If PulseAll is sent, all threads that are waiting for the object are affected. When the signal is received, one or more threads leave the waiting queue and enter the ready queue. A thread in the ready queue is permitted to reacquire the lock.

This method returns when the calling thread reacquires the lock on the object.

If you're asking for a reference on whether a lock (an acquired Monitor) implies a memory barrier, the ECMA CLI spec says the following:

12.6.5 Locks and Threads:

Acquiring a lock (System.Threading.Monitor.Enter or entering a synchronized method) shall implicitly perform a volatile read operation, and releasing a lock (System.Threading.Monitor.Exit or leaving a synchronized method) shall implicitly perform a volatile write operation. See §12.6.7.

12.6.7 Volatile Reads and Writes:

A volatile read has "acquire semantics" meaning that the read is guaranteed to occur prior to any references to memory that occur after the read instruction in the CIL instruction sequence. A volatile write has "release semantics" meaning that the write is guaranteed to happen after any memory references prior to the write instruction in the CIL instruction sequence.
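
To make that concrete, a minimal sketch (the type and field names are invented for illustration) of what those acquire/release guarantees buy you: writes made inside one lock block are visible to reads made inside a later lock block on the same lock object, with no volatile anywhere:

class AcquireReleaseDemo {
    private readonly object gate = new object();
    private int sharedField;        // deliberately NOT volatile

    public void Writer() {
        lock (gate) {               // Monitor.Enter = implicit volatile read (acquire)
            sharedField = 42;
        }                           // Monitor.Exit = implicit volatile write (release)
    }

    public int Reader() {
        lock (gate) {               // this acquire pairs with the writer's release, so if it
            return sharedField;     // runs after Writer() has exited, 42 is guaranteed visible
        }
    }
}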

Also, these blog entries have some details that might be of interest:

Michael Burr
That was my implicit assumption, but I was hoping for some kind of citation / reference...?
Marc Gravell
+1 as it's basically what I was going to say (though I've added some additional reasoning).
Daniel Earwicker
This doesn't address the issue. The issue is JIT code generation, not cache/memory behavior. How does a method call prevent the JITter from generating code that stores a variable in a register?
Hans Passant
@nobugz - A method call can do that. The JIT can easily recognise the `Monitor.Enter` call and treat it as a special indicator. It's the same in C++: `MemoryBarrier` looks like a function call but is actually just a macro that inlines some assembly (`xchg ...`), and the compiler knows to tread carefully around it when it sees it.
Daniel Earwicker
@Earwicker: maybe it can. Does it? And does it do so for every architecture? Have you seen this documented anywhere so it is something we can count on? Or does the lack of such documentation require the use of volatile?
Hans Passant
My reference for threading in .NET and Windows generally is Joe Duffy's book *Concurrent Programming on Windows*. He was the PM for concurrency in the CLR, the developer of Parallel LINQ, and the lead architect of Parallel Extensions for .NET, so I'm inclined to take his word for it... Page 484: "Critical region primitives, such as Win32's critical section and the CLR's monitor, *work with the compiler*, CPU and memory system... All correctly written synchronisation primitives do this."
Daniel Earwicker
@nobugz - while the JIT might have some special recognition of `Monitor`, that shouldn't be necessary for proper operation. If the JIT inlines the behavior of `Monitor` then it will see the volatile access behavior of releasing/acquiring the lock and have to deal with the memory barrier. If the JIT doesn't see 'inside' `Monitor` then it must ensure that `queue.Count` is reloaded (or called again) anyway since the object might have been modified - even on the same thread.
Michael Burr
@Michael: this is about the "closing" variable, not the lock and not queue.Count. I'd love to see a direct reference that states that registers are reloaded after a Wait call (or any call), but ECMA-335 is completely silent about it. This is a JIT implementation detail, and JIT details are undocumented. I'll personally have to stick with volatile to be safe rather than sorry; I can't see a good reason to intentionally omit it. Well, Interlocked actually.
Hans Passant
@nobugz - same would apply to the `closing` variable. The fact that it is accessible from something other than `TryDequeue()`, whether on the same thread or not, means the JIT would have to behave as outlined in my comment that discussed `queue.Count`. The key is that if the JIT inlines enough to see the memory barrier, then it needs to deal with that. If the JIT doesn't see the memory barrier at the point that it's JIT-ing `TryDequeue()` then it needs to reload `closing` because it has to assume that whatever method it called may have caused `closing` to change.
Michael Burr
@nobugz: and ECMA-335 is not silent about this. The sections of ECMA that I quoted indicate that the lock used in the `Wait()` call requires the memory-barrier acquire and release semantics. While details of the JIT implementation might be undocumented, it would have to do whatever was necessary to follow that specification.
Michael Burr
@nobugz see also the reference that Reed Copsey dug out. I think between this and that, I'm satisfied that the `volatile` is *not* required (although I've left a question on Igor's blog too ;-p).
Marc Gravell
@Marc: that also talks about the CPU cache, and the link in your first comment already showed that the cache has nothing to do with it. The thread could only block permanently if the cache were *never* refreshed, and that's not possible. I don't think I managed to get my point across, but that's okay.
Hans Passant
+4  A: 

Further to Michael Burr's answer, not only does Wait release and re-acquire the lock, it does so precisely so that another thread can take out the lock in order to examine the shared state and call Pulse. If the second thread doesn't take out the lock then Pulse will throw, and if it never Pulses then the first thread's Wait won't return. Hence any other thread's access to the shared state necessarily happens within a properly memory-fenced region.

So as long as the Monitor methods are used according to these locally-checkable rules, all accesses to the shared state happen inside a lock, and hence only the automatic memory-barrier support of lock is relevant/necessary.
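
As a sketch of what that looks like for the scenario in the question (assuming the same `queue` - a `Queue<T>` - and `closing` fields; the method names are invented), the "other thread" side might be:

// The shutdown/producer side: the write to closing and the enqueue both happen
// inside lock(queue), and Pulse/PulseAll also require the lock, so the waiting
// thread re-reads the shared state under the same lock when its Wait returns.
public void Close() {
    lock (queue) {
        closing = true;
        Monitor.PulseAll(queue);    // wake every waiter; each reacquires the lock before Wait returns
    }
}

public void Enqueue(T item) {
    lock (queue) {
        queue.Enqueue(item);
        Monitor.Pulse(queue);       // wake one waiter to consume the new item
    }
}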

Daniel Earwicker
+1  A: 

Maybe I can help you this time... instead of making the field volatile, you can declare `closing` as an `int` and use Interlocked.Exchange:

if (closing == 1) {     // <==== (2) access field here
    value = default(T);
    return false;
}

// somewhere else in your code:
Interlocked.Exchange(ref closing, 1);

Interlocked.Exchange is a synchronization mechanism; volatile isn't... I hope that's worth something (but you probably already thought about this).
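
A rough sketch of how that might be wired up end to end, assuming `closing` is declared as an `int` and re-using the `queue` lock object from the question (the `Close` and `IsClosing` helpers are invented for illustration; the reading side goes through `Interlocked.CompareExchange` so that both sides use an interlocked operation):

int closing;                                // 0 = open, 1 = closing (int so Interlocked can be used)

public void Close() {
    Interlocked.Exchange(ref closing, 1);   // fenced write of the flag
    lock (queue) {
        Monitor.PulseAll(queue);            // still need the lock to wake the waiters
    }
}

private bool IsClosing() {
    // CompareExchange with identical comparand/value is a common way to get a fenced read of an int
    return Interlocked.CompareExchange(ref closing, 0, 0) == 1;
}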

Lirik
Indeed, but `Monitor` is also a synchronization mechanism ;-p (besides: I would expect `volatile` to be more direct in this case)
Marc Gravell
It's simpler anyway to just use the Monitor Wait/Pulse pattern consistently. A Wait loop is a wait for a change in the shared mutable state. So anything that affects the outcome of that wait must be modified inside a lock, and must call `Pulse`. Same goes for modifications to `closing`.
Daniel Earwicker
It's 3:00 a.m. and I'm wondering: http://meta.stackoverflow.com/questions/11652/how-addicted-to-stack-overflow-are-you
Lirik
@Lirik - only for you. The world is a big place.
Marc Gravell
It was 9am GMT - which is the most *important* time zone. :p
Daniel Earwicker