The current C++0x draft states in sections 29.3.9 and 29.3.10 (pages 1111-1112) that in the following example:
// Thread 1
r1 = y.load(memory_order_relaxed);
x.store(1, memory_order_relaxed);
// Thread 2
r2 = x.load(memory_order_relaxed);
y.store(1, memory_order_relaxed);
the outcome r1 = r2 = 1 is possible, since each thread's operations are relaxed and access unrelated addresses. My question is about the possible outcomes of the following (similar) example:
// Thread 1
r1 = y.load(memory_order_acquire);
x.store(1, memory_order_release);
// Thread 2
r2 = x.load(memory_order_acquire);
y.store(1, memory_order_release);
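I think that in this case the outcome r1 = r2 = 1 is not possible. If it were possible, the load of y would have read the value written by the store to y, so the store to y would synchronize-with (and thus happen-before) the load of y. Similarly, the store to x would happen-before the load of x. But within each thread the load is sequenced before (and thus also happens-before) the store: the load of y before the store to x in thread 1, and the load of x before the store to y in thread 2. Together this creates a cyclic happens-before relation, which I think is not allowed.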
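For anyone who wants to try both variants, here is a minimal sketch of a test harness. The names run_once and count_both_one are my own (hypothetical) helpers, not anything from the draft, and the code assumes a C++17 compiler for the structured binding. Note that a stress test like this can only fail to observe an outcome; it cannot prove the outcome is forbidden, and the relaxed variant may never show r1 = r2 = 1 on a given machine even though the draft allows it.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <utility>

// Runs the two-thread example once with the given memory orders
// and returns {r1, r2}. (Hypothetical helper, not from the draft.)
std::pair<int, int> run_once(std::memory_order load_order,
                             std::memory_order store_order) {
    std::atomic<int> x{0}, y{0};
    int r1 = 0, r2 = 0;

    // Thread 1: load y, then store x.
    std::thread t1([&] {
        r1 = y.load(load_order);
        x.store(1, store_order);
    });
    // Thread 2: load x, then store y.
    std::thread t2([&] {
        r2 = x.load(load_order);
        y.store(1, store_order);
    });

    t1.join();
    t2.join();
    return {r1, r2};
}

// Counts how many of `iterations` runs end with r1 == 1 && r2 == 1.
int count_both_one(int iterations,
                   std::memory_order load_order,
                   std::memory_order store_order) {
    assert(iterations >= 0);
    int hits = 0;
    for (int i = 0; i < iterations; ++i) {
        auto [r1, r2] = run_once(load_order, store_order);
        if (r1 == 1 && r2 == 1) ++hits;
    }
    return hits;
}
```

Calling count_both_one(n, std::memory_order_acquire, std::memory_order_release) should return 0 if the reasoning above is right, while the same call with std::memory_order_relaxed for both orders is allowed to return a nonzero count.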