views: 1008

answers: 17

Hi,

Today a friend of mine and I debated for a couple of hours about "compiler optimization".

I defended the point that a compiler optimization might sometimes introduce bugs, or at least undesired behavior.

My friend totally disagreed, saying that "compilers are built by smart people and do smart things" and thus, can never go wrong.

He didn't convince me at all, but I have to admit I lack real-life examples to strengthen my point.

Who is right here? If I am, do you have any real-life examples where a compiler optimization produced a bug in the resulting software? If I'm mistaken, should I stop programming and learn fishing instead?

Thank you!

+1  A: 

It's theoretically possible, sure. But if you don't trust the tools to do what they are supposed to do, why use them? That said, anyone arguing from the position of

"compilers are built by smart people and do smart things" and thus, can never go wrong.

is making a foolish argument.

So, until you have reason to believe that a compiler is doing so, why posture about it?

Daniel DiPaolo
http://cm.bell-labs.com/who/ken/trust.html
Craig Stuntz
It's more like an intuition. I trust the tool, but I'm sure compiler optimization can **sometimes** do things you wouldn't expect. I just can't give an example. Maybe I'm wrong, and there are no such examples.
ereOn
Well, there are examples (provided above and in comments), but it seems like a silly argument (especially your friend's "supporting argument") until it becomes a real issue. But you are indeed correct, it is absolutely possible.
Daniel DiPaolo
+11  A: 

Compiler optimizations can introduce bugs. That's why you can turn them off.

One example: a compiler can optimize the read/write access to a memory location, doing things like eliminating duplicate reads or duplicate writes, or re-ordering certain operations. If the memory location in question is only used by a single thread and is actually memory, that may be ok. But if the memory location is a hardware device IO register, then re-ordering or eliminating writes may be completely wrong. In this situation you normally have to write code knowing that the compiler might "optimize" it, and thus knowing that the naive approach doesn't work.
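
For illustration, here is a minimal C++ sketch of that scenario (the register address and bit meanings are made up); without volatile, the optimizer may fold the two writes into one and hoist the read out of the loop:

    #include <cstdint>

    // Hypothetical memory-mapped device register at a made-up address.
    // 'volatile' tells the compiler that every access has a side effect
    // and must not be merged, eliminated, or reordered away.
    volatile std::uint32_t* const STATUS_REG =
        reinterpret_cast<std::uint32_t*>(0x40000000);

    void reset_device() {
        *STATUS_REG = 0x1;  // request reset
        *STATUS_REG = 0x0;  // clear the request; without volatile these two
                            // writes could be collapsed into just the last one
        while ((*STATUS_REG & 0x2) == 0) {
            // wait for the "ready" bit; without volatile the read could be
            // hoisted out of the loop, turning this into an infinite spin
        }
    }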

Update: As Adam Robinson pointed out in a comment, the scenario I describe above is more of a programming error than an optimizer error. But the point I was trying to illustrate is that a program which is otherwise correct, combined with an optimization which otherwise works properly, can together produce a buggy program. In some cases the language specification says "You must do things this way because these kinds of optimizations may occur and your program will fail", in which case it's a bug in the code. But sometimes a compiler has a (usually optional) optimization feature that can generate incorrect code because the compiler is trying too hard to optimize the code or can't detect that the optimization is inappropriate. In this case the programmer must know when it is safe to turn on the optimization in question.

Mr. Shiny and New
A more common reason to turn off optimizations is that debugging is usually harder with them on.
Craig Stuntz
Non-volatile reads are a great example of *misbehavior* caused by compiler (or runtime) optimization, though I'm not sure that's the sort of thing one would classify as a "bug", since it's the duty of the developer to account for such things.
Adam Robinson
@Adam Robinson: Agree completely. I've expanded on this point in my answer.
Carl Smotricz
Thank you for your answer. I love the "That's why you can turn them off" => Just unobjectionable! ;)
ereOn
@Adam Robinson: You're right. Updated my answer.
Mr. Shiny and New
The example given is why the "volatile" keyword exists. It's a code error. A bug-free compiler will not introduce bugs even with optimization.
phkahler
Compiler optimization can bring existing bugs to the surface. They don't (usually) *introduce* them - unless there's a compiler bug.
peterchen
@Mr. Shiny has given his reason for believing why optimization can be turned off, but this is far from _indisputable_ and I doubt it's explained as such in most compiler documentation. When debugging people typically want machine code that can be easily related to the original code and so would work with builds where optimization is turned off, while QA builds (and the final release) would have optimization turned on. Anyone who, on discovering an issue like the one described by @Mr. Shiny, then "fixes" it by turning optimization off is fixing the wrong thing in my opinion.
George Hawkins
Disagree. Optimizations may bring bugs in your code to the surface, and the optimizer itself may be buggy, but "optimization can *introduce* bugs" is factually wrong and also puts the blame on the wrong target.
peterchen
+4  A: 

Is it likely? Not in a major product, but it's certainly possible. Compiler optimizations produce generated code; no matter where code comes from (whether you write it or something generates it), it can contain errors.

Adam Robinson
+8  A: 

Just one example: a few days ago, someone discovered that gcc 4.5 with the option -foptimize-sibling-calls (which is implied by -O2) produces an Emacs executable that segfaults on startup.

This has apparently been fixed since.

legoscia
Apparently this was a bug in the compiler. Of course compilers can (do!) have bugs: they're pieces of software. I'm more interested in examples of bugs (undesired/unexpected behavior) in the optimized code that are not caused by compiler bugs.
Martinho Fernandes
+6  A: 

Compiler (and runtime) optimization can certainly introduce undesired behaviour - but it at least should only happen if you're relying on unspecified behaviour (or indeed making incorrect assumptions about well-specified behaviour).

Now beyond that, of course compilers can have bugs in them. Some of those may be around optimisations, and the implications could be very subtle - indeed they're likely to be, as obvious bugs are more likely to be fixed.

Assuming you include JITs as compilers, I've seen bugs in released versions of both the .NET JIT and the Hotspot JVM (I don't have details at the moment, unfortunately) which were reproducible in particularly odd situations. Whether they were due to particular optimisations or not, I don't know.

Jon Skeet
@Downvoter: Care to give a reason? What was incorrect in my answer?
Jon Skeet
Jealousy, probably.
Callum Rogers
I pointed out a well-known issue in a post below where C++'s optimizer will introduce a bug using the double-checked locking pattern. This is correct behavior as far as the current C++ spec is concerned, it is not a compiler bug, it is well-specified behavior that works perfectly fine with optimizations turned off, but breaks when it is turned on.
tloach
@Jon Skeet: I never downvote (well, hardly ever), but at 172k you even notice? :-)
Mike Dunlavey
@Mike: Yes, I notice... because it's not about the points, but about the fact that someone disagrees enough to downvote. That gives me cause for concern - have I got something wrong? Is my answer misleading? I want to make sure that any errors are corrected.
Jon Skeet
@Jon: My theory of downvotes is that's how you know you've said something interesting.
Mike Dunlavey
+5  A: 

I've never heard of or used a compiler whose directives could not alter the behaviour of a program. Generally this is a good thing, but it does require you to read the manual.

AND I had a recent situation where a compiler directive 'removed' a bug. Of course, the bug is really still there but I have a temporary workaround until I fix the program properly.

High Performance Mark
++ Especially in Fortran. We're always running into situations where the Fortran works fine, but only if you use a particular opt level. And running Fortran under an IDE? where the optimizer (even if you didn't ask for it) has felt at liberty to totally scramble the code and shuffle the variables? all in the name of "Optimization"? Gimme a *break*!
Mike Dunlavey
@Mike -- I share your pain, my recent situation is a Fortran program I'm porting from one cluster to another 'identical' one. IDE ? surely you mean Emacs :-)
High Performance Mark
Sorry I'm stuck in the Windows world. Fortran is a hot potato - MS, DEC, Compaq, now Intel. Did I miss any foster parents? It works under .net, but only if your boss keeps paying for upgrades. Plus GCC, where we have lovely GDB. I was once told, like it or not "Fortran is like Rock 'n Roll, it will Never Die!"
Mike Dunlavey
+2  A: 

Yes. A good example is the double-checked locking pattern. In C++ there is no way to safely implement double-checked locking because the compiler can re-order instructions in ways that make sense in a single-threaded system but not in a multi-threaded one. A full discussion can be found at http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf
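
A condensed sketch of the broken pattern, roughly along the lines discussed in that paper (the class and member names are just illustrative; std::mutex is used only to keep the sketch self-contained, since the paper itself predates C++11):

    #include <mutex>

    class Singleton {
    public:
        static Singleton* instance() {
            if (pInstance == 0) {                        // first check, no lock
                std::lock_guard<std::mutex> guard(mtx);
                if (pInstance == 0) {                    // second check, under the lock
                    // The optimizer (or the CPU) may publish the pointer before
                    // the object is fully constructed, so a thread passing the
                    // unsynchronized first check can use a half-built object.
                    pInstance = new Singleton;
                }
            }
            return pInstance;
        }
    private:
        Singleton() {}
        static Singleton* pInstance;
        static std::mutex mtx;
    };

    Singleton* Singleton::pInstance = 0;
    std::mutex Singleton::mtx;

With optimizations off, the store to pInstance typically happens only after the constructor has run, which is why the problem tends to stay hidden in debug builds.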

tloach
As you seem to indicate, that is a shortcoming of C++ not a problem with optimizing compilers. It's a problem that exists even without optimization.
phkahler
Not at all. With no optimization double-checked locking is fine. It's a painful type of bug to find because it will work fine in a debug build. It's only because the optimizer is allowed to make logical changes that are equivalent in a single-threaded system that there is a problem. You could claim that it is an issue with what is allowed in C++ by the optimizer, but it is still an area where compiling with optimization turned on can cause issues.
tloach
+4  A: 

To combine the other posts:

  1. Compilers do occasionally have bugs in their code, like most software. The "smart people" argument is completely irrelevant to this; NASA satellites and other apps built by smart people also have bugs. The code that performs optimization is different from the code that doesn't, so if the bug happens to be in the optimizer, then your optimized build may indeed contain errors while your non-optimized build does not.

  2. As Mr. Shiny and New pointed out, it's possible for code that is naive with regard to concurrency and/or timing issues to run satisfactorily without optimization yet fail with optimization as this may change the timing of execution. You could blame such a problem on the source code, but if it will only manifest when optimized, some people might blame optimization.

Carl Smotricz
+2  A: 

I encountered this a few times with a newer compiler building old code. The old code would work, but relied on undefined behavior in some cases, such as improperly defined / cast operator overloads. It would work in a VS2003 or VS2005 debug build, but crash in release.

Opening up the generated assembly, it was clear that the compiler had simply removed 80% of the functionality of the function in question. Rewriting the code so it did not rely on undefined behavior cleared it up.

More obvious example: VS2008 vs GCC

Declared: Function foo( const type & tp );

Called: foo( foo2() );

where foo2() returns an object of class 'type';

This tends to crash in GCC because the object isn't allocated on the stack in this case, but VS does some optimization to get around it and will probably work.
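
Spelled out as a minimal sketch (foo, foo2 and type are just the placeholder names from the snippet above; whether it misbehaves in practice will depend on what the real code does with the reference):

    struct type { int value; };

    type foo2() { return type(); }        // returns a temporary object

    void foo( const type & tp ) { /* use tp */ }

    int main() {
        foo( foo2() );  // the temporary returned by foo2() is bound to the
                        // const reference parameter for the duration of the call
    }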

peter karasev
+2  A: 

It can happen. It has even affected Linux.

Jean Azzopardi
+12  A: 

Yes absolutely.
See here, here (which still exists - "by design"!?!), here, here, here, here...

BlueRaja - Danny Pflughoeft
+2  A: 

Yes, compiler optimizations can be dangerous. Usually, hard real-time software projects forbid optimizations for this very reason. Anyway, do you know of any software with no bugs?

Aggressive optimizations may cache your variables in registers or even make strange assumptions about them. The problem is not only the stability of your code; they can also fool your debugger. I have several times seen a debugger fail to show the memory contents because an optimization kept a variable's value in the registers of the micro.

The very same thing can happen in your code: the optimizer puts a variable into a register and does not write it back to memory until it has finished. Now imagine how different things can be if your code has pointers to variables on your stack and runs several threads.
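
A minimal sketch of the kind of failure meant here (the flag name is made up): with optimization on, the compiler may read the plain flag once, keep it in a register, and never notice the other thread's write.

    #include <chrono>
    #include <thread>

    bool stop_requested = false;   // plain bool: not volatile, not std::atomic

    void worker() {
        // With optimization, the compiler may load stop_requested once,
        // cache it in a register, and turn this into an infinite loop.
        while (!stop_requested) {
            // do work
        }
    }

    int main() {
        std::thread t(worker);
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
        stop_requested = true;     // may never be observed by the worker
        t.join();                  // the optimized build may hang here
    }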

Francisco Garcia
I have written a few bug-free programs in the past. "Hello world" and such. :P
Martinho Fernandes
I write safety-critical hard real-time software, and I *always* build with max optimization. If there is a compiler bug I want to find it early, not wait until we're out of CPU time AND calendar time and have someone say "-O3" can help, and then ship buggy code because we didn't get enough testing. If you don't trust your tools, don't use them.
phkahler
@Martinho Fernandes: Yeap, there's even a saying: Is it possible to write a completely bug-free program? Yes, but it will be useless.
sharptooth
@phkahler I have seen DO-178B certified projects, some of them at DAL-A level, where the design board forbids any type of compiler optimization. Even threads are usually forbidden. I have done some research on compilers and met many people who agree that the C++ and Java languages have very serious flaws in their design (hence you don't see them in DAL-A software), although people try to live with them or ignore them. I have very few tools I trust, and most of the stress of my work is because I am forced to use buggy libraries and/or tools.
Francisco Garcia
@Francisco - We use C in automotive stuff, not C++ or Java. Mass produced products are not subject to DO-178b whatever that is. We are subject to lawyers which is probably worse. Agreed that threads are bad. Certified compilers are not generally available for our targets, and I have seen GCC used to build code for airbag controllers with much success. I've seen compiler bugs, processor bugs, and hardware design issues. Building code at O3 is the least of my worries.
phkahler
+1  A: 

I certainly agree that it's silly to say that because compilers are written by "smart people" they are therefore infallible. Smart people designed the Hindenburg and the Tacoma Narrows Bridge, too. Even if it's true that compiler writers are among the smartest programmers out there, it's also true that compilers are among the most complex programs out there. Of course they have bugs.

On the other hand, experience tells us that the reliability of commercial compilers is very high. Many, many times someone has told me that the reason his program doesn't work MUST be a bug in the compiler, because he has checked it very carefully and is sure it is 100% correct ... and then we find that in fact the program has the error, not the compiler. I'm trying to think of times that I've personally run across something that I was truly sure was an error in the compiler, and I can only recall one example.

So in general: Trust your compiler. But are they ever wrong? Sure.

Jay
A: 

Everything that you can possibly imagine doing with or to a program will introduce bugs.

Jay
Aww, you're going to down vote me for a joke? Lighten up!
Jay
+3  A: 

Aliasing can cause problems with certain optimizations, which is why compilers have an option to disable those optimizations. From Wikipedia:

To enable such optimizations in a predictable manner, the ISO standard for the C programming language (including its newer C99 edition) specifies that it is illegal (with some exceptions) for pointers of different types to reference the same memory location. This rule, known as "strict aliasing", allows impressive increases in performance[citation needed], but has been known to break some otherwise valid code. Several software projects intentionally violate this portion of the C99 standard. For example, Python 2.x did so to implement reference counting,[1] and required changes to the basic object structs in Python 3 to enable this optimisation. The Linux kernel does this because strict aliasing causes problems with optimization of inlined code.[2] In such cases, when compiled with gcc, the option -fno-strict-aliasing is invoked to prevent unwanted or invalid optimizations that could produce incorrect code.
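
For illustration, a minimal sketch of code that breaks the rule (the function names are made up): under strict aliasing the optimizer may assume the write through the unsigned int pointer cannot affect the float, while the memcpy version stays well-defined.

    #include <cstdio>
    #include <cstring>

    // Violates strict aliasing: a float object is modified through an
    // unsigned int*, so the optimizer may ignore or reorder the write.
    // (Assumes 32-bit float and unsigned int.)
    float punned_negate(float f) {
        unsigned int* bits = reinterpret_cast<unsigned int*>(&f);
        *bits ^= 0x80000000u;          // flip the sign bit via the wrong type
        return f;                      // may be reloaded from a stale copy
    }

    // Well-defined alternative: copy the representation with memcpy.
    float safe_negate(float f) {
        unsigned int bits;
        std::memcpy(&bits, &f, sizeof bits);
        bits ^= 0x80000000u;
        std::memcpy(&f, &bits, sizeof f);
        return f;
    }

    int main() {
        std::printf("%f %f\n", punned_negate(1.0f), safe_negate(1.0f));
    }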

Mark Ransom
+1  A: 

As I recall, an early release of Delphi 1 had a bug where the results of Min and Max were reversed. There was also an obscure bug with some floating-point values, but only when the floating-point value was used within a DLL. Admittedly, it has been more than a decade, so my memory may be a bit fuzzy.

Mike Chess
+1  A: 

When a bug goes away after disabling optimizations, most of the time it's still your fault.

I am responsible for a commercial app, written mostly in C++ - started with VC5, ported to VC6 early, now successfully ported to VC2008. It grew to over 1 Million lines in the last 10 years.

In that time, I could confirm a single code-generation bug that occurred when aggressive optimizations were enabled.

So why am I complaining? Because in the same time, there were dozens of bugs that made me doubt the compiler - but they turned out to be the result of my insufficient understanding of the C++ standard. The standard makes room for optimizations the compiler may or may not make use of.

Over the years on different forums, I've seen many posts blaming the compiler that ultimately turned out to be bugs in the original code. No doubt many of them were obscure bugs that need a detailed understanding of concepts used in the standard, but they were source code bugs nonetheless.

Why I reply so late: don't blame the compiler before you have confirmed it's actually the compiler's fault.

peterchen