What is the difference between the different optimization levels in GCC? Assuming I don't care about having any debug hooks, why wouldn't I just use the highest level of optimization available to me? Does a higher level of optimization necessarily (i.e., provably) generate a faster program?

+6  A: 

Generally, optimization levels higher than -O2 (just -O3 for GCC, but other compilers have higher ones) include optimizations that can increase the size of your code. These include things like loop unrolling, lots of inlining, padding for alignment regardless of size, etc. Other compilers offer vectorization and inter-procedural optimization at levels higher than -O3, as well as certain optimizations that can improve speed a lot at the cost of correctness (e.g., using faster, less accurate math routines). Check the docs before you use these things.
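If you want to see exactly which passes differ between levels, GCC can report its own optimizer settings; a quick sketch (the exact flag list varies by GCC version and target):

    $ gcc -Q -O2 --help=optimizers > /tmp/O2.txt
    $ gcc -Q -O3 --help=optimizers > /tmp/O3.txt
    $ diff /tmp/O2.txt /tmp/O3.txt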

As for performance, it's a tradeoff. In general, compiler designers try to tune these things so that they don't decrease the performance of your code, so -O3 will usually help (at least in my experience) but your mileage may vary. It's not always the case that really aggressive size-altering optimizations will improve performance (e.g. really aggressive inlining can get you cache pollution).

tgamblin
+1  A: 

I found a web page containing some information about the different optimization levels. One thing I remember hearing somewhere is that optimization might actually break your program, and that can be an issue. But I'm not sure how much of an issue that is any longer. Perhaps today's compilers are smart enough to handle those problems.

Jonas
One trick is to write an infinite loop that actually breaks on integer overflow, and try various levels of optimisation. On my machine a 32-bit integer overflow took 10 seconds at -O0, 2 seconds at -O1, and was optimised into an infinite loop at -O2; a sketch of such a loop follows below.
Ted Percival
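A minimal sketch of that trick, assuming a 32-bit int (signed overflow is undefined behaviour in C, so none of these outcomes is guaranteed and results vary by GCC version):

    #include <stdio.h>

    int main(void)
    {
        /* The loop exits only when i overflows past INT_MAX, which is
         * undefined behaviour for a signed int. At -O0 the wraparound
         * happens in practice and the loop terminates after about 2^31
         * increments; at -O2 GCC may assume the overflow cannot occur
         * and compile this into an infinite loop. */
        for (int i = 0; i >= 0; i++)
            ;
        puts("overflowed and exited");
        return 0;
    }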
+4  A: 

Yes, a higher level can sometimes mean a better performing program. However, it can cause problems depending on your code. For example, at -O1 and up the optimizer may keep a shared variable in a register or reorder memory accesses, which can expose race conditions in poorly written multi-threaded programs. The optimizer is allowed to replace what you wrote with anything that behaves the same in a single thread, and in some cases the result might not work. A sketch of code that can break this way follows below.
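A hypothetical sketch of that failure mode (the program and its behaviour are illustrative; whether it actually hangs depends on the compiler version and flags):

    #include <pthread.h>
    #include <stdio.h>

    /* `done` is shared between threads with no synchronization. At -O1
     * and up the compiler may cache it in a register, so the wait loop
     * in main() can spin forever even after the worker sets the flag.
     * Making it volatile (or better, _Atomic) avoids the problem.
     * Compile with: gcc -O2 -pthread race.c */
    static int done = 0;

    static void *worker(void *arg)
    {
        (void)arg;
        done = 1;   /* a data race as written */
        return NULL;
    }

    int main(void)
    {
        pthread_t t;
        pthread_create(&t, NULL, worker, NULL);
        while (!done)   /* may be optimized into an endless spin */
            ;
        pthread_join(t, NULL);
        puts("finished");
        return 0;
    }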

And sometimes the higher optimizations (-O3) add no measurable benefit but a lot of extra size. Your own testing can determine whether the size tradeoff yields a worthwhile performance gain on your system.
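A quick sketch of that kind of measurement (prog.c stands in for your own program; `size` is the binutils tool):

    $ gcc -O2 -o prog_o2 prog.c && size prog_o2
    $ gcc -O3 -o prog_o3 prog.c && size prog_o3
    $ time ./prog_o2
    $ time ./prog_o3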

As a final note, the GNU project compiles all of its programs at -O2 by default, and -O2 is fairly common elsewhere.

Martin W