Hello,

I was looking up the keyword volatile and what it's for, and the answer I got was pretty much:

It's used to prevent the compiler from optimizing away code.

There were some examples, such as when polling memory-mapped hardware: without volatile the polling loop would be removed, as the compiler might recognize that the condition value is never changed. But since there were only one or two examples, it got me thinking: are there other situations where we need to use volatile to avoid unwanted optimization? Are condition variables the only place where volatile is needed?

I imagine that optimization is compiler-specific and therefore is not specified in the C++ specification. Does that mean we have to go by gut feeling, saying "Hm, I suspect my compiler will do away with this if I don't declare that variable as volatile", or are there any clear rules to go by?

+11  A: 

Basically, volatile announces that a value might change behind your program's back. That prevents compilers from caching the value (in a CPU register) and from optimizing away accesses to that value when they seem unnecessary from the POV of your program.

What should trigger usage of volatile is when a value changes despite the fact that your program hasn't written to it, and when no other memory barriers (like mutexes as used for multi-threaded programs) are present.
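As a minimal, portable sketch of a value changing "behind the program's back", the flag below is written by a signal handler while a loop polls it. The names `g_stop`, `on_interrupt` and `poll_until_stopped` are made up for this illustration, and `std::raise` merely stands in for a real asynchronous event; without `volatile`, a compiler would be allowed to read the flag once and keep it in a register.

```cpp
#include <csignal>

// Written by the signal handler, read by the polling loop.
// volatile std::sig_atomic_t is the type the standard allows a handler to write.
static volatile std::sig_atomic_t g_stop = 0;

extern "C" void on_interrupt(int) { g_stop = 1; }

// Polls the flag until the handler sets it; returns the number of polls,
// or -1 if the handler could not be installed.
int poll_until_stopped() {
    g_stop = 0;
    if (std::signal(SIGINT, on_interrupt) == SIG_ERR) {
        return -1;
    }
    int polls = 0;
    while (g_stop == 0) {          // each test is a real read of g_stop
        if (++polls == 3) {
            std::raise(SIGINT);    // simulate the external event
        }
    }
    return polls;
}
```

Calling `poll_until_stopped()` runs the loop until the simulated signal arrives; in a real program the signal would come from outside rather than from `std::raise`.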

sbi
While I think almost all answers to this question are more or less useful (and I would like to have bundled them all up in one to accept, hehe), I would say that this one sums it up.
gablin
+3  A: 

With volatile, the compiler doesn't try to keep the data in a CPU register (hundreds of times faster than memory). It has to read it from memory every time it is used.

Byron Whitlock
`volatile` doesn't exclude the value from L1 cache, which costs only a few cycles to access. It is associated with other mechanisms that do, though. Device registers will always be volatile, and will often be even slower than DRAM.
Potatoswatter
@Potatoswatter Isn't the L1 cache controlled by hardware? I wasn't aware that software could affect anything in there.
Byron Whitlock
@Potatoswatter: While it is true that there is no real need for a volatile variable to make it all the way to real memory (might depend on the architecture), the fact is that it can have an impact much greater than a few cycles. If the variable is in the same cache line as any variable in use by another CPU, each operation on the `volatile` variable will trigger cache synchronization to the other CPUs, and that can be costly both for the `volatile` itself and for the non-volatile vars in the same cache line.
David Rodríguez - dribeas
@Byron: Hardware configuration settings are set by software. The OS can call up the MMU and turn off caching for given pages. There might even be a user-space facility to let any program do so.
Potatoswatter
@David: Yes, but that also applies to non-volatile variables that didn't happen to be subject to optimization.
Potatoswatter
@Potatoswatter: I agree, but the thing is that multiple reads of the same non-volatile variable inside the same function *can* be optimized into a single read into a register. In that scenario, the `volatile` keyword might trigger many synchs that in the non-volatile case would not be performed. That is, it will not just turn a register operation into a L1 cache read, but can cascade and have a much greater impact.
David Rodríguez - dribeas
+6  A: 

Condition variables are not where volatile is needed; strictly it is only needed in device drivers.

volatile guarantees that reads and writes to the object are not optimized away, or reordered with respect to another volatile. If you are busy-looping on a variable modified by another thread, it should be declared volatile. However, you shouldn't busy-loop. Because the language wasn't really designed for multithreading, this isn't very well supported. For example, the compiler may move a write to a non-volatile variable from after to before the loop, violating the lock. (For indefinite spinloops, this might only happen under C++0x.)

When you call a thread-library function, it acts as a memory fence, and the compiler will assume that any and all values have changed — essentially everything is volatile. This is either specified or tacitly implemented by any threading library to keep the wheels turning smoothly.

C++0x might not have this shortcoming, as it introduces formal multithreading semantics. I'm not really familiar with the changes, but for the sake of backward compatibility, it doesn't require declaring anything volatile that didn't need it before.
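A hedged sketch of what those formal semantics look like in C++11 and later: a flag shared between threads is declared `std::atomic` rather than `volatile`, and the release/acquire pair orders the ordinary write to `payload` before the flag becomes visible. The function name and the value 42 are invented for this example.

```cpp
#include <atomic>
#include <thread>

// One thread publishes a value, the caller waits for it.
// No volatile is involved; std::atomic provides both the
// "do not optimize the read away" and the ordering guarantees.
int produce_and_consume() {
    int payload = 0;                    // plain, non-atomic data
    std::atomic<bool> ready{false};     // the cross-thread flag

    std::thread producer([&] {
        payload = 42;                                   // ordinary write
        ready.store(true, std::memory_order_release);   // publish it
    });

    // Spin until the flag is set; yield to avoid a hard busy-loop.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    int seen = payload;                 // guaranteed to observe 42

    producer.join();
    return seen;
}
```

In real code a mutex with a condition variable is usually a better fit than spinning; the loop is kept here only to mirror the busy-wait discussed above.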

Potatoswatter
Johannes Schaub - litb
Johannes Schaub - litb
It interprets that such that actually the *access path* volatileness is enough to make an access observable, but C++03 also said "The least requirements on a conforming implementation are: [..] At sequence points, volatile objects are stable in the sense that previous evaluations are complete and subsequent evaluations have not yet occurred.". Note that this text is clear that only access to *volatile objects* and not the *access path* alone determine whether the access is observable behavior or not.
Johannes Schaub - litb
Finally, C++0x is the most clear and just says "Access to volatile objects are evaluated strictly according to the rules of the abstract machine.". It does not try to define the observable behavior multiple times anymore, thanks to [DR #612](http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#612). So trying to add observable behavior by using hideous casts actually doesn't work.
Johannes Schaub - litb
@Johannes: I can't find it in the standard or by Googling SO with my own name. Removed from the answer. I swear I saw a bulletproof argument… However 12.8/15 seems to imply the existence of such. I'm not sure it's necessary to allow local variables to independently constitute observable behavior. And that article seems to be using "everything" being volatile as a reductio ad absurdum.
Potatoswatter
Fair nuff about the casting. What I thought I saw was a rule against *declaring* a volatile variable with automatic storage.
Potatoswatter
@Johannes: Ah, I remembered part of it: §1.9/10 requires that local `volatile` objects not be modified except locally. So it's OK to *send* data through a local volatile, but *receiving* any is forbidden. (Indeed, passing a non-const pointer to a local to *any* function is just asking to violate that paragraph!) I guess two threads could attempt to double-lock so long as each lock was split between two locals on either side.
Potatoswatter
@Potatoswatter i'm confused. 1.9/10 reads "An instance of each object with automatic storage duration (3.7.2) is associated with each entry into its block. Such an object exists and retains its last-stored value during the execution of the block and while the block is suspended (by a call of a function or receipt of a signal).". From what do you conclude that such things violate it? Do you say that `void f(int volatile *a) { *a = 0; } int main() { volatile int a = 0; f( }` is invalid? I'm not following.
Johannes Schaub - litb
@Johannes: It depends how you define "last-stored value." By definition, the value something contains is the last stored into it… and vice versa. So, if called functions are allowed to store values into the caller's frame (presumably so), your example is valid (with or without `volatile`). However, it's a bit more of a stretch that other threads, the OS, or external devices should be allowed to modify an object; that would render the paragraph meaningless. (Maybe, with its circular logic, it is.) The point, anyway, is that it applies to all locals regardless of `volatile` qualification.
Potatoswatter
@Potatoswatter oh i see now. I'm not sure how this is resolved, but I heard someone say that for volatiles, each read can potentially be a store (which is why just reading a volatile is listed as a side-effect). I think the word "store" is not defined by C++ (maybe by some of the technical Standards referenced?). C99 has a footnote for that statement which says "In the case of a volatile object, the last store need not be explicit in the program.".
Johannes Schaub - litb
+7  A: 

The observable behavior of a C++ program is determined by reads and writes to volatile variables, and any calls to input/output functions.

What this entails is that all reads and writes to volatile variables must happen in the order they appear in code, and they must happen. (If a compiler broke one of those rules, it would be breaking the as-if rule.)

That's all. It's used when you need to indicate that reading or writing a variable is to be seen as an observable effect. (Note, the "C++ and the Perils of Double-Checked Locking" article touches on this quite a bit.)


So to answer the title question, it prevents any optimization that might re-order the evaluation of volatile variables relative to other volatile variables.

That means a compiler that changes:

int x = 2;
volatile int y = 5;
x = 5;
y = 7;

To

int x = 5;
volatile int y = 5;
y = 7;

Is fine, since the value of x is not part of the observable behavior (it's not volatile). What wouldn't be fine is changing the initialization of y from 5 to 7 and dropping the later assignment, because that write of 5 is an observable effect.
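To make the ordering rule concrete, here is a small sketch of the kind of code that relies on it: a device where writing the command register triggers transmission of whatever is in the data register. `FakeUart`, `kCmdSend` and `send_byte` are hypothetical names for this illustration; a real device would live at a fixed, platform-specific address rather than in an ordinary struct.

```cpp
#include <cstdint>

// Hypothetical register block; both members are volatile, so the
// compiler must emit both writes, in source order.
struct FakeUart {
    volatile std::uint32_t data;
    volatile std::uint32_t command;
};

constexpr std::uint32_t kCmdSend = 1;

// The data write has to reach the device before the command write,
// otherwise the device would transmit a stale value.
void send_byte(FakeUart& uart, std::uint8_t byte) {
    uart.data = byte;          // first volatile write
    uart.command = kCmdSend;   // second volatile write, never moved above the first
}
```

If `data` and `command` were plain members, a compiler would be free to reorder or merge the two stores as long as the final values came out the same.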

GMan
+1: Good answer!
Chubsdad
A: 

Usually the compiler assumes that a program is single-threaded, so it has complete knowledge of what is happening to variable values. A smart compiler can then prove that the program can be transformed into another program with equivalent semantics but better performance. For example:

x = y+y+y+y+y;

can be transformed to

x = y*5;

However, if a variable can be changed outside the thread, the compiler doesn't have complete knowledge of what's going on just by examining this piece of code. It can no longer make optimizations like the one above.

By default, for the sake of optimization, single-thread access is assumed. This assumption is usually true, unless the programmer explicitly instructs otherwise with the volatile keyword.

irreputable
Actually, I believe constant folding is still valid on `volatile` items, which is essentially what you've shown here.
Billy ONeal
I'm not a C++ expert. In Java, the volatile y must be fetched 5 times.
irreputable
@Billy: I agree the answer is a bit unclear, but to clarify: If `y` is volatile, then changing `x = y + y` into `x = 2 * y` is *not* okay. But changing `y = 2 + 2` to `y = 4` is fine.
GMan
@Billy ONeal: If the code reads a volatile variable five times, the variable must be read five times precisely. A statement: "a=(b " may not be written as "a = (volatilevar " even if the latter form would otherwise be faster (since there's no branching), since the latter form reads volatilevar even when b is false.
supercat
The `volatile` keyword was not added to the language to take multithreading into account. Even if the meaning is somewhat similar, the intention is accessing hardware components through memory addresses. `volatile` means that the value of the variable can change outside of the program --not just the thread, but the whole program-- or even that the read can have side effects outside of what the program does --i.e. imagine a hardware counter that increments on each read.
David Rodríguez - dribeas
+1  A: 

Unless you are on an embedded system, or you are writing hardware drivers where memory mapping is used as the means of communication, you should never ever ever be using volatile.

Consider:

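#include <cstdio>

int main()
{
    // Stand-in for a platform-specific, memory-mapped int location.
    volatile int SomeHardwareMemory = 0;
    for (int idx = 0; idx < 56; ++idx)
    {
        // volatile forces a fresh load from memory on every iteration.
        std::printf("%d", SomeHardwareMemory);
    }
}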

Has to produce code like:

loadIntoRegister3 56
loadIntoRegister2 "%d"
loopTop:
loadIntoRegister1 <<SOMEHARDWAREMEMORY>>
pushRegister2
pushRegister1
call printf
decrementRegister3
ifRegister3GreaterThan 0 goto loopTop

whereas without volatile it could be:

loadIntoRegister3 56
loadIntoRegister2 "%d"
loadIntoRegister1 <<SOMEHARDWAREMEMORY>>
loopTop:
pushRegister2
pushRegister1
call printf
decrementRegister3
ifRegister3GreaterThan 0 goto loopTop

The assumption about volatile is that the memory location of the variable may be changed. You are forcing the compiler to load the actual value from memory each time the variable is used; and you tell the compiler that reuse of that value in a register is not allowed.

Billy ONeal
+2  A: 

Remember that the "as if rule" means that the compiler can, and should, do whatever it wants, as long as the behaviour as seen from outside the program as a whole is the same. In particular, while a variable conceptually names an area in memory, there is no reason why it actually should be in memory.

It could be in a register.

Or its value could be calculated away, e.g.:

int x = 2;
int y = x + 7;
return y + 1;

This need not have an x and y at all, but could just be replaced with:

return 10;

Another example is that any code that doesn't affect state visible from the outside could be removed entirely. E.g. if you zeroise sensitive data, the compiler can see this as a wasted exercise ("why are you writing to what won't be read?") and remove it. volatile can be used to stop that happening.
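A minimal sketch of the zeroising idiom, assuming a helper named `secure_zero` (an invented name, not a standard function). It writes through a pointer to `volatile`, which compilers in practice treat as stores that must be emitted; whether the standard strictly guarantees this when the underlying object is not itself volatile is debated, so treat it as a common idiom rather than a proof.

```cpp
#include <cstddef>

// Overwrites a buffer with zeros through a volatile-qualified pointer,
// so the stores are not treated as dead writes and dropped.
void secure_zero(void* data, std::size_t size) {
    volatile unsigned char* p = static_cast<volatile unsigned char*>(data);
    for (std::size_t i = 0; i < size; ++i) {
        p[i] = 0;   // each store is a volatile access
    }
}

// Helper for checking the result; reads the buffer normally.
bool all_zero(const unsigned char* data, std::size_t size) {
    for (std::size_t i = 0; i < size; ++i) {
        if (data[i] != 0) {
            return false;
        }
    }
    return true;
}
```

A typical use would be calling `secure_zero(key, sizeof key)` just before a buffer holding a password or key goes out of scope.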

volatile can be thought of as meaning "the state of this variable must be considered part of the outwardly visible state, and not messed with". Optimisations that would use it other than literally following the source code are not allowed.

(A note on C#: a lot I've seen of late on volatile suggests that people are reading about C++ volatile and applying it to C#, and reading about it in C# and applying it to C++. Really though, volatile behaves so differently between the two as to not be useful to consider them related).

Jon Hanna
+1, it is important to think on the memory model and the visible state of the program. The compiler could even discard the `volatile` qualifier if it can determine that the variable will not affect the visible behavior. Consider an auto variable declared volatile in a function that does not call any other function. The compiler can determine that the variable cannot be polled outside of the thread and may decide to apply any optimization it wishes.
David Rodríguez - dribeas
@David : Reads from and writes to volatiles are part of the visible behavior of a C++ program, _by definition_. The optimizer works under the "as-if" rule which allows transformations if they leave the visible behavior unchanged. Therefore, optimizations may not remove reads and writes of volatile objects.
MSalters
@MSalters, David is right, if an automatic variable is volatile but doesn't have its address passed to a non-automatic volatile pointer or potentially become involved in a long-jump or otherwise accessed outside of "normal" functional access, then there is no way for it to be observed from the outside, as there is no way for anything outside to know what to observe. In this case it could be decided that it isn't really volatile, and its volatility ignored.
Jon Hanna
@MSalters: I have failed to locate the quote, but I am quite sure that it was a remark from Herb Sutter in the last few months. The gist was that if a compiler can prove a volatile variable is not visible from any other context (i.e. it lives on the stack and its address is provably never passed to other functions, so it cannot be queried from elsewhere), then the variable is not part of the visible behavior of the program and a conforming compiler could discard the `volatile` qualifier.
David Rodríguez - dribeas
@MSalters:... but without a proper quote, take this with a pinch of salt, as I might have misinterpreted the comment in the beginning.
David Rodríguez - dribeas
A: 

In addition to the other answers, a volatile may also trigger compiler specific behavior. For example, on Visual Studio, the compiler will prevent reordering of memory accesses to volatiles and other global variables.

Praetorian
That is pretty much required: writes to `volatile` variables must be performed in the same order they appear in code, as they are part of the perceivable state. Consider `x = 5; y = 1;` with both `x` and `y` being volatile. The program as defined has three states: first with whatever values `x` and `y` had, then a different state with `x==5`, and then one with `x==5, y==1`. Consider that `y==1` can be a command to send the value of `x` over the wire (or light the LED identified by `x`); if the assignments were reordered, the behavior of the program would change.
David Rodríguez - dribeas
+1  A: 

One way to think about a volatile variable is to imagine that it's a virtual property; writes and even reads may do things the compiler can't know about. The actual generated code for writing/reading a volatile variable is simply a memory write or read(*), but the compiler has to regard the code as opaque; it can't make any assumptions under which it might be superfluous. The issue isn't merely with making sure that the compiled code notices that something has caused a variable to change. On some systems, even memory reads can "do" things.

(*) On some compilers, volatile variables may be added to, subtracted from, incremented, decremented, etc. as distinct operations. It's probably useful for a compiler to compile:

  volatilevar++;

as

  inc [_volatilevar]

since the latter form may be atomic on many microprocessors (though not on modern multi-core PCs). It's important to note, however, that if the statement were:

  volatilevar2 = (volatilevar1++);

the correct code would not be:

  mov ax,[_volatilevar1] ; Reads it once
  inc [_volatilevar1]    ; Reads it again (oops)
  mov [_volatilevar2],ax

nor

  mov ax,[_volatilevar1]
  mov [_volatilevar2],ax ; Writes in wrong sequence
  inc ax
  mov [_volatilevar1],ax

but rather

  mov ax,[_volatilevar1]
  mov bx,ax
  inc ax
  mov [_volatilevar1],ax
  mov [_volatilevar2],bx

Writing the source code differently can help here. If 'volatilevar1' didn't mind being read twice and 'volatilevar2' didn't mind being written before volatilevar1, then splitting the statement into

  volatilevar2 = volatilevar1;
  volatilevar1++;

would allow for faster, and possibly safer, code.

supercat
Sorry, but I can't find a justification for your order claim about `volatilevar2 = (volatilevar1++);`. There is only a single sequence point, at the `;`. Therefore the order in which the writes happen is not guaranteed.
MSalters
@MSalters: You may be right on that point, in which case the third variation would be acceptable. On the other hand, the fastest version, which uses inc [_volatilevar1], is certainly not acceptable despite the fact that there are cases where it would be less trouble-prone than the longer versions.
supercat