Hello,

With these two questions as background (first and second), I got curious about how much optimization a C++ compiler can perform when dealing with pointers. More specifically, I'm interested in how smart a compiler is when it optimizes away code which it can detect will never be run.

(Some may point out that this is a dupe of this question, which it may be, but this particular part of it was not entirely answered, so I decided to start a new question dealing only with this concern.)

(I'm not a C++ expert, so I may be wrong in the following statements, but I'll give it a shot anyway.) A C++ compiler may optimize away portions of code which it recognizes will never be executed or never exited (such as loops). Here is an example:


void test() {
    bool escape = false;

    while ( !escape ); // Will never be exited

    // Do something useful after having escaped
}

The compiler would most likely recognize that the loop can never be exited, since the code never changes the value of escape. This makes the loop useless.
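For illustration, an optimizer that proves the condition can never change might conceptually reduce the function to something like this sketch (actual output varies by compiler):


void test() {
    // Hypothetical optimized form: `escape` provably never changes,
    // so the loop spins forever and everything after it is
    // unreachable dead code that can be dropped.
    while ( true );
}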

Now, if we were to change the variable to a pointer, would the compiler still optimize away the loop? Say the code looks like so:


void test( bool* escape ) {
    while ( *escape ); // Will this be executed?

    // Do something useful after having escaped
}

My suspicion is that the compiler will do away with the loop, or else the keyword volatile would be redundant, yes? But what about when working with threads, where the value is in fact modified, but outside the function, and maybe even outside that C++ file entirely: would the compiler still remove the loop? Does it make a difference whether the variable pointed to by escape is a global variable or a local variable inside another function? Can the compiler make this determination? In this question, some say that the compiler will not optimize the loop if library functions are invoked inside the loop. What mechanisms are then in place when using library functions which prevent this optimization?

+8  A: 

In the first case (`while ( !escape );`) the compiler will treat that as `label: goto label;` and omit everything after it (and probably give you a warning).

In the second case (`while ( *escape );`), the compiler has no way of knowing whether `*escape` will be true or false at run time, so it has to do the comparison and loop or not. Note, however, that it only has to read the value from `*escape` once; i.e., it could treat that as:

 bool b = *escape;
 label: if (b) goto label;

`volatile` will force it to read the value from `*escape` each time through the loop.
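For illustration, a sketch of the `volatile` variant; qualifying the pointee forces a fresh load of `*escape` on every pass:


void test( volatile bool* escape ) {
    // The condition must re-read *escape from memory on each
    // iteration; the load cannot be hoisted out of the loop.
    while ( *escape );

    // Do something useful after having escaped
}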

James Curran
In the second case, *escape could change if another thread were to modify the same memory. So shouldn't the compiler calculate the value each time?
Jimmy
@Jimmy If `*escape` can be modified from another thread, then it is incorrect to access it without some form of synchronization (in part, for this reason). If `*escape` can be modified by external hardware events (e.g. mapped memory), then it should be `volatile`-qualified to ensure that the compiler reads it anew every time.
Tyler McHenry
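A minimal sketch of such synchronization, assuming a C++11 compiler with `std::atomic` (which postdates this thread):


#include <atomic>

std::atomic<bool> escape(false);  // shared flag, safe to use from two threads

void worker() {
    while ( !escape.load() );     // each load is a synchronized read

    // Do something useful after having escaped
}

void signaller() {
    escape.store(true);           // another thread releases the worker
}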
@Jimmy: No, `*escape` cannot be modified by another thread: by not marking it `volatile`, you've promised that another thread will not change it. The keyword is there for a purpose. Note that if it had been written `while(*escape) DoSomething();` then it would be reloaded, even without `volatile`, because `*escape` could be modified within `DoSomething()`.
James Curran
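A sketch of that last point, with a hypothetical `DoSomething()` defined in another translation unit:


void DoSomething();  // defined elsewhere; opaque to this translation unit

void test( bool* escape ) {
    // The compiler cannot prove that DoSomething() leaves *escape
    // alone (it might reach it through an alias), so *escape is
    // re-read on every iteration even without volatile.
    while ( *escape )
        DoSomething();
}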
Is the volatile keyword inherited? I.e., if I have the following: `void test(bool* foo) { while(*foo); }` and `void test2(bool* bar) { for(;*bar;); }`, how is this handled? If I call these from multiple threads passing a `volatile bool*`, does the compiler know to handle it properly? More importantly, what if these two methods are in a library compiled separately from the calling code? Sorry for being pedantic, but I'm curious how far this responsibility is passed to the programmer...
Jimmy
+1  A: 

Remember that the C++ compiler doesn't know, or give two shits, about your threads. `volatile` is all you've got. It's perfectly legal for any compiler to make optimizations that destroy multi-threaded code but run fine single-threaded. When discussing compiler optimizations, ditch threading; it's just not in the picture.

Now, library functions. Of course, `*escape` could be changed at any time by any of those functions, as the compiler has no way of knowing how they work. This is especially true if you pass the function as a callback. If, however, you have a library whose source code is available, the compiler may dig in and discover that `*escape` is never changed within.

Of course, if the loop is empty, it will almost certainly just let your program hang, unless the compiler can determine that the condition isn't true when it starts. Removing an empty infinite loop is not the job of the compiler; it's the job of the programmer's brain cells.

DeadMG
+2  A: 

The generic problem with questions like this is that they often include a very unrealistic code snippet. A "what will the compiler do" question needs real code, since compilers were designed and optimized to compile real code. Most compilers will completely eliminate the function, and the function call, since the code has no side effects, leaving us with a question that doesn't have a useful answer.

But, sure, you are leaning towards finding a use for the `volatile` keyword. You can find many threads on SO talking about why `volatile` isn't appropriate in a multi-threaded app.

Hans Passant
What would your answer be now that I've made the example code "more realistic"?
gablin
Yes, your comment will be optimized away :)
Hans Passant
Touché. How about now?
gablin
Well, the loops won't be optimized away anymore. The compiler has to consider pointer aliasing for the 2nd snippet, so it won't draw drastic conclusions about whether or not the pointed-to bool can change state as a result of the doStuff() call. Unless it gets inlined, so it could know. That's a heavy implementation detail of the code optimizer.
Hans Passant
I rolled back the changes since they destroyed the entire purpose of the question. But thanks for answering that anyway. ^^ I think I need to know more about optimizations, as one answer here only leads to two or three new questions.
gablin
+2  A: 

There's a difference in what the compiler is allowed to do, and what real compilers do.

The Standard covers this in "1.9 Program execution". The standard describes a sort of abstract machine, and the implementation is required to come up with the same "observable behavior". (This is the "as-if" rule, as documented in a footnote.)

From 1.9(6): "The observable behavior of the abstract machine is its sequence of reads and writes to volatile data and calls to library I/O functions." This means that if you can show that a modification to a function changes neither of those, either within that function or after it's called, the modification is legal.

Technically, this means that if you write a function that runs forever testing (say) Goldbach's Conjecture that every even number greater than 2 is the sum of two primes, stopping only if it finds one that isn't, a sufficiently ingenious compiler could substitute either an output statement or an infinite loop, depending on whether the Conjecture is false or true (or unprovable in the Gödel sense). In practice, it will be a while, if ever, before compilers have theorem provers better than the best mathematicians.
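A sketch of such a function (hypothetical names, naive trial division):


#include <iostream>

static bool is_prime( unsigned long long n ) {
    if ( n < 2 ) return false;
    for ( unsigned long long d = 2; d * d <= n; ++d )
        if ( n % d == 0 ) return false;
    return true;
}

// True if `even` can be written as the sum of two primes.
static bool is_goldbach_sum( unsigned long long even ) {
    for ( unsigned long long p = 2; p <= even / 2; ++p )
        if ( is_prime( p ) && is_prime( even - p ) ) return true;
    return false;
}

int main() {
    // Runs forever unless a counterexample to the Conjecture exists.
    // Under the as-if rule, a compiler that could settle the Conjecture
    // would be free to replace this whole search with its known outcome.
    for ( unsigned long long n = 4; ; n += 2 )
        if ( !is_goldbach_sum( n ) ) {
            std::cout << n << " is a counterexample\n";
            return 0;
        }
}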

David Thornley