+1  A: 

C/C++ sequence points occur, for example, when ';' is encountered, at which point all side effects of the operations that preceded it must be complete. However, I'm fairly certain that by "side effect" what's meant is operations that are part of the language itself (like z being incremented in 'z++') and not effects at lower/higher levels (like what the OS actually does with regards to memory management, thread management, etc. after an operation completes).
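
To illustrate at the language level (a minimal sketch; the names are made up):

void example(void) {
    int z = 0;
    int y = z++;   /* the ';' ends a full expression -- a sequence point: */
                   /* here the increment of z is guaranteed complete,     */
                   /* so z == 1 and y == 0                                */
    (void)y;
}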

Does that at least partly answer your question? My point is really just that, AFAIK, the concept of sequence points doesn't have anything to do with the side effects you're referring to.

hth

Assaf Lavie
A: 

See [linux-kernel]/Documentation/memory-barriers.txt.

EffoStaff Effo
I don't find that file (at least in 2.6.16).
Peeter Joot
Oops, the kernels in my distros are 2.6.23.1 and above.
EffoStaff Effo
+4  A: 

No, sequence points do not prevent rearranging of operations. The main, broadest rule that governs optimizations is the requirement imposed on so-called observable behavior. Observable behavior, by definition, is read/write accesses to volatile variables and calls to library I/O functions. These events must occur in the same order and produce the same results as they would in the "canonically" executed program. Everything else can be rearranged and optimized absolutely freely by the compiler, in any way it sees fit, completely ignoring any orderings imposed by sequence points.
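
As a rough illustration (a hypothetical sketch; the names are made up):

int a;            /* ordinary variable                          */
volatile int v;   /* volatile: accesses are observable behavior */

void f(void) {
    a = 1;        /* no observable behavior: the compiler may merge  */
    a = 2;        /* these into a single 'a = 2' and move it freely  */
                  /* relative to the volatile stores below           */
    v = 1;        /* observable: both stores must occur, */
    v = 2;        /* in this order, with these values    */
}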

Of course, most compilers try not to do any excessively wild rearrangements. However, the issue you are mentioning has become a real practical issue with modern compilers in recent years. Many implementations offer additional implementation-specific mechanisms that allow the user to ask the compiler not to move accesses across certain boundaries when doing optimizing rearrangements.

Since, as you say, the protected data is not declared volatile, formally speaking the access can be moved outside of the protected region. If you declare the data volatile, that should prevent this from happening (assuming that the mutex access is also volatile).

AndreyT
We have some very scary optimizations enabled on our product (like the xlC -ipa optimizations that allow global optimization and inlining across function and even file boundaries, regardless of whether the function is inlined). We have mechanisms available that restrict this inlining, but I don't believe they restrict the compiler from knowing what side effects occur in the functions when the intra-procedural-analysis source graph happens to include them. Are library I/O functions given special status in the C/C++ standards, or does this extend generally to functions with unknown side effects?
Peeter Joot
Yes, standard library I/O functions have special status. The language specification also says that any implementation that adds its own I/O functions should treat them as producing "observable behavior".
AndreyT
+1. Although mutexes aren't, of course, specified by C++, so we cannot really formally say what happens here beyond assuming volatility of both the mutex and the data. In POSIX at least (if I remember correctly), the compiler is allowed to move statements from outside the critical section into it. It is not, however, allowed to move statements from within the critical section outwards (obviously). I like this video: http://www.youtube.com/watch?v=mrvAqvtWYb4
Johannes Schaub - litb
+7  A: 

In short, the compiler is allowed to reorder or transform the program as it likes, as long as the observable behavior on the C++ virtual machine does not change. The C++ standard has no concept of threads, so this fictional VM only runs a single thread. And on such an imaginary machine, we don't have to worry about what other threads see. As long as the changes don't alter the outcome for the current thread, all code transformations are valid, including reordering memory accesses across sequence points.
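
For instance, a transformation like this register promotion is perfectly legal under the single-threaded model (a hypothetical sketch; the names are made up):

int total;           /* shared, non-volatile global */
int a[100], n;

void sum(void) {
    /* what the programmer wrote */
    for (int i = 0; i < n; i++)
        total += a[i];
}

void sum_as_compiled(void) {
    /* what the compiler may effectively generate */
    int tmp = total;   /* total is read once...            */
    for (int i = 0; i < n; i++)
        tmp += a[i];
    total = tmp;       /* ...and written once, at the end  */
}

The single thread can't tell the difference, but a second thread reading total mid-loop certainly could.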

"understanding why volatile does not appear to be a requirement for the mutex protected data is really part of this question"

Volatile ensures one thing, and one thing only: reads from a volatile variable will be read from memory every time -- the compiler won't assume that the value can be cached in a register. And likewise, writes will be written through to memory. The compiler won't keep it around in a register "for a while, before writing it out to memory".

But that's all. When the write occurs, a write will be performed, and when the read occurs, a read will be performed. But it doesn't guarantee anything about when this read/write will take place. The compiler may, as it usually does, reorder operations as it sees fit (as long as it doesn't change the observable behavior in the current thread, the one that the imaginary C++ CPU knows about). So volatile doesn't really solve the problem. On the other hand, it offers a guarantee that we don't really need. We don't need every write to the variable to be written out immediately; we just want to ensure that writes get written out before crossing this boundary. It's fine if they're cached until then -- and likewise, once we've crossed the critical section boundary, subsequent writes can be cached again for all we care -- until we cross the boundary the next time. So volatile offers a too-strong guarantee which we don't need, but doesn't offer the one we do need (that reads/writes won't get reordered).
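
The classic illustration of what volatile does and doesn't buy you (a sketch; the names are made up):

int flag = 0;                /* meant to be set to 1 by another thread */

void spin(void) {
    /* legal optimization: the compiler may load flag into a register
       once and spin on the register forever -- flag is not volatile */
    while (flag == 0) { }
}

volatile int vflag = 0;

void vspin(void) {
    /* every iteration must re-read vflag from memory */
    while (vflag == 0) { }
}

But even vspin() says nothing about the ordering of vflag relative to the non-volatile data it is supposed to guard -- which is exactly the guarantee we actually need.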

So to implement critical sections, we need to rely on compiler magic. We have to tell it: "OK, forget about the C++ standard for a moment; I don't care what optimizations it would have allowed if you'd followed it strictly. You must NOT reorder any memory accesses across this boundary."

Critical sections are typically implemented via special compiler intrinsics (essentially special functions that are understood by the compiler), which 1) force the compiler to avoid reordering memory accesses across that intrinsic, and 2) make it emit the necessary instructions to get the CPU to respect the same boundary (because the CPU reorders instructions too, and without issuing a memory barrier instruction, we'd risk the CPU doing the same reordering that we just prevented the compiler from doing).
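
For a concrete feel of the two halves, here is roughly what such primitives look like with GCC-style inline assembly on x86 (a sketch of the mechanism, not a complete or portable lock implementation):

/* 1) compiler barrier: forbids the compiler from moving memory */
/*    accesses across this point; emits no machine instruction  */
#define COMPILER_BARRIER()  __asm__ __volatile__("" ::: "memory")

/* 2) hardware barrier: also stops the CPU from reordering loads */
/*    and stores across this point (mfence is the x86 full fence) */
#define HARDWARE_BARRIER()  __asm__ __volatile__("mfence" ::: "memory")

A real lock/unlock pair has both effects at once: it is opaque to the compiler (or marked as a barrier), and it executes barrier or implicitly serializing instructions on the CPU.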

jalf
A: 

Hi, let's look at the following example:

my_pthread_mutex_lock(&m);
someNonVolatileGlobalVar++;
my_pthread_mutex_unlock(&m);

The function my_pthread_mutex_lock() just calls pthread_mutex_lock(). By using my_pthread_mutex_lock(), I'm sure the compiler doesn't know that it's a synchronization function: for the compiler it's just a function, and for me it's a synchronization function that I can easily reimplement.

Because someNonVolatileGlobalVar is global, I expect the compiler not to move someNonVolatileGlobalVar++ outside the critical section. In fact, because of observable behavior, even in a single-threaded situation the compiler doesn't know whether the functions called before and after this statement modify the global variable, so to keep the observable behavior correct it has to keep the execution order as written. I hope pthread_mutex_lock() and pthread_mutex_unlock() also perform hardware memory barriers, in order to prevent the hardware from moving this instruction outside the critical section.

Am I right?
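
For reference, such a wrapper could be as simple as this, compiled in a separate translation unit so the compiler cannot inline it or analyze its body (a sketch):

/* my_sync.c -- compiled separately from the calling code */
#include <pthread.h>

void my_pthread_mutex_lock(pthread_mutex_t *m) {
    pthread_mutex_lock(m);     /* an opaque call, from the caller's view */
}

void my_pthread_mutex_unlock(pthread_mutex_t *m) {
    pthread_mutex_unlock(m);
}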

If I write:

my_pthread_mutex_lock(&m);
someNonVolatileGlobalVar1++;
someNonVolatileGlobalVar2++;
my_pthread_mutex_unlock(&m);

I cannot know which one of the two variables is incremented first, but this is normally not an issue.

Now, if I write:

someGlobalPointer = &someNonVolatileLocalVar;
my_pthread_mutex_lock(&m);
someNonVolatileLocalVar++;
my_pthread_mutex_unlock(&m);

or

someLocalPointer = &someNonVolatileGlobalVar;
my_pthread_mutex_lock(&m);
(*someLocalPointer)++;
my_pthread_mutex_unlock(&m);

Is the compiler still doing what a naive developer expects?

Gabriele