The current C++0x draft states, in sections 29.3.9 and 29.3.10 (pages 1111-1112), that in the following example:

// Thread 1
r1 = y.load(memory_order_relaxed);
x.store(1, memory_order_relaxed);

// Thread 2
r2 = x.load(memory_order_relaxed);
y.store(1, memory_order_relaxed);

The outcome r1 = r2 = 1 is possible, since the operations in each thread are relaxed and access unrelated addresses. Now my question is about the possible outcomes of the following (similar) example:

// Thread 1
r1 = y.load(memory_order_acquire);
x.store(1, memory_order_release);

// Thread 2
r2 = x.load(memory_order_acquire);
y.store(1, memory_order_release);

I think that in this case the outcome r1 = r2 = 1 is not possible. If it were possible, the load of y would read the value written by the store to y, so that store would synchronize-with (and thus happen-before) the load of y. Similarly, the store to x would happen-before the load of x. But the load of y is sequenced before (and thus also happens-before) the store to x, and likewise the load of x is sequenced before the store to y. This creates a cycle in the happens-before relation, which I think is not allowed.

+2  A: 

If we take time (or, instruction sequences if you like) to flow downward, just like reading code, then my understanding is that

  • An acquire allows memory accesses that come before it to move downwards past it, but accesses that come after it may not move upwards past it
  • A release allows memory accesses that come after it to move upwards past it, but accesses that come before it may not move downwards past it

In other words, if you have code like

acquire
// other stuff
release

then memory accesses may move from outside the acquire/release pair to the inside, but not the other way around (and they may not skip the acquire/release pair completely either).
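To make the acquire / other stuff / release pattern concrete, here is a minimal sketch of a spinlock, assuming C++11 `std::atomic`. The names `SpinLock` and `locked_increments` are made up for illustration and are not from the question. Taking the lock is the acquire, releasing it is the release, and the plain increment in between is the "other stuff" that must not leak out of the pair.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical minimal spinlock: lock() is the acquire, unlock() the release.
class SpinLock {
    std::atomic<bool> locked_{false};
public:
    void lock() {
        // Spin until we are the thread that flips false -> true.
        while (locked_.exchange(true, std::memory_order_acquire)) {
        }
    }
    void unlock() {
        locked_.store(false, std::memory_order_release);
    }
};

// Two threads each increment a plain (non-atomic) counter per_thread times,
// always between lock() and unlock(). Returns the final counter value.
long locked_increments(int per_thread) {
    assert(per_thread >= 0);
    SpinLock lock;
    long counter = 0;
    auto work = [&] {
        for (int i = 0; i < per_thread; ++i) {
            lock.lock();    // acquire
            ++counter;      // "other stuff": may not move outside the pair
            lock.unlock();  // release
        }
    };
    std::thread a(work);
    std::thread b(work);
    a.join();
    b.join();
    return counter;
}
```

Because each unlock() synchronizes with the next successful lock(), every increment happens-before the following one, so no updates should be lost even though the counter itself is not atomic.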

With the relaxed consistency semantics in the first example in your question, the hardware (or the compiler) can reorder memory accesses such that the stores enter the memory system before the loads, thus allowing r1=r2=1. With the acquire/release semantics in the second example, that reordering is prevented, and thus r1=r2=1 is not possible.
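If you want to try this empirically, a rough harness could look like the sketch below. The function name `count_both_one` is my own invention, and the memory orders are passed in so the same code covers both of your examples. Keep in mind that a test like this can only fail to observe an outcome; it cannot prove the outcome impossible. Starting a fresh pair of threads for each iteration also makes the relaxed reordering very unlikely to show up in practice.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Runs the two-thread example from the question `iterations` times with the
// given memory orders and returns how often r1 == 1 && r2 == 1 was observed.
int count_both_one(std::memory_order load_order,
                   std::memory_order store_order,
                   int iterations) {
    assert(iterations >= 0);
    int hits = 0;
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x{0};
        std::atomic<int> y{0};
        int r1 = 0;
        int r2 = 0;
        std::thread t1([&] {
            r1 = y.load(load_order);
            x.store(1, store_order);
        });
        std::thread t2([&] {
            r2 = x.load(load_order);
            y.store(1, store_order);
        });
        // join() makes the writes to r1 and r2 visible to this thread.
        t1.join();
        t2.join();
        if (r1 == 1 && r2 == 1) {
            ++hits;
        }
    }
    return hits;
}
```

Calling `count_both_one(std::memory_order_acquire, std::memory_order_release, n)` should always return 0, whereas with `std::memory_order_relaxed` for both arguments a nonzero count is permitted, even if you never happen to see one on your machine.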

janneb
I'm not sure I understand your answer.
Helltone
Hmm, does my clarification help? If not, what specifically do you not understand?
janneb
Take my second example and replace the releases with relaxed, keeping the acquires as they are. Is r1=r2=1 possible? Now start again from the original version of the second example, but this time replace the acquires with relaxed and keep the releases as they are. Is r1=r2=1 possible?
Helltone
Hmm, I think in the second example, only one of the operations in each thread needs to be restricted in order to prevent r1=r2=1. In a more complicated example you probably want both, because you have some stuff between the acquire/release pair that is not allowed to "leak" out. More generally, however, anything beyond the default sequential consistency semantics is best left to experts who are developing high-performance locking mechanisms or lock-free algorithms, which can then be used by mere mortals.
janneb
I'm not a mere mortal :-) Thanks for the clarifications.
Helltone