Hi,

I have two questions:

(1) I learned somewhere that -O3 is not recommended with gcc, because

The -O3 optimization level may increase the speed of the resulting executable, but can also increase its size. Under some circumstances where these optimizations are not favorable, this option might actually make a program slower. In fact, it should not be used system-wide with gcc 4.x. The behavior of gcc has changed significantly since version 3.x. In 3.x, -O3 has been shown to lead to marginally faster execution times over -O2, but this is no longer the case with gcc 4.x. Compiling all your packages with -O3 will result in larger binaries that require more memory, and will significantly increase the odds of compilation failure or unexpected program behavior (including errors). The downsides outweigh the benefits; remember the principle of diminishing returns. Using -O3 is not recommended for gcc 4.x.

Suppose I have a workstation (Kubuntu 9.04) with 128 GB of memory and 24 cores that is shared by many users, some of whom may run intensive programs using around 60 GB of memory. Is -O2 a better choice for me than -O3?

(2) I also learned that when a running program crashes unexpectedly, any debugging information is better than none, so the use of -g is recommended for optimized programs, both for development and deployment. But when a program is compiled with -ggdb3 together with -O2 or -O3, will it run slower? Assume I am still using the same workstation.

Thanks and regards!

+2  A: 

Try it.
You can rarely make accurate judgments about speed and optimisation without any data.

P.S. This will also tell you whether it's worth the effort. How many milliseconds saved in a function used once at startup are worthwhile?

Martin Beckett
+1  A: 

-g and/or -ggdb just add debugging symbols to the executable. They make the executable file bigger, but that part isn't loaded into memory (except when run in a debugger or similar).

As for whether -O2 or -O3 gives better performance, there's no silver bullet. You have to measure/profile it for your particular program.

nos
+1  A: 
  1. The only way to know for sure is to benchmark your application compiled with -O2 and with -O3. -O3 is also just a bundle of individual optimization options, and you can turn each of them on or off separately. Concerning the warning about larger binaries, note that simply comparing the sizes of executables compiled with -O2 and -O3 will not tell you much, because what matters most is the size of the small, critical inner loops. You really have to benchmark.

  2. It will result in a larger executable, but there shouldn't be any measurable slowdown.
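As a rough illustration of what "benchmark" can mean here, below is a minimal timing sketch. The names `sum_of_squares` and `best_time` are hypothetical placeholders, not anything from gcc; you would replace the workload with your own hot loop. It assumes a POSIX system where `clock_gettime` and `CLOCK_MONOTONIC` are available.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical workload standing in for your real hot loop. */
double sum_of_squares(const double *data, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += data[i] * data[i];
    return total;
}

/* Seconds read from a monotonic clock (unaffected by wall-clock changes). */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/*
 * Run the workload `repeats` times and return the fastest run in seconds.
 * The computed value is written to *result_out so that the compiler
 * cannot discard the work as unused.
 */
double best_time(const double *data, size_t n, int repeats, double *result_out)
{
    double best = -1.0;
    assert(repeats > 0);
    for (int r = 0; r < repeats; r++) {
        double start = now_seconds();
        double result = sum_of_squares(data, n);
        double elapsed = now_seconds() - start;
        if (best < 0.0 || elapsed < best)
            best = elapsed;
        *result_out = result;
    }
    return best;
}
```

Call `best_time` from your own `main` on a realistically sized array, build the program once with `-std=c99 -O2` and once with `-std=c99 -O3`, and compare the reported times. On older glibc versions you may also need to link with `-lrt` for `clock_gettime`. Taking the fastest of several runs is one simple way to reduce noise on a shared machine; it is not the only reasonable choice.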

Laurynas Biveinis
A: 

I think this pretty much answers your question:

The downsides outweigh the benefits; remember the principle of diminishing returns. Using -O3 is not recommended for gcc 4.x.

If the guys writing the compiler say not to do it, I wouldn't second-guess them.

Eric Petroelje
The guys writing the compiler are in the unenviable position of having to accommodate everybody at the same time. The overriding advice from the same guys is "try it and see what works best for you".
Laurynas Biveinis
+1  A: 

Firstly, it does appear that the compiler team is essentially admitting that -O3 isn't reliable. It seems like they are saying: try -O3 on your critical loops or critical modules, or your Lattice QCD program, but it's not reliable enough for building the whole system or library.

Secondly, the problem with making the code bigger (inline functions and other things) isn't only that it uses more memory. Even if you have extra RAM, it can slow you down. This is because the faster the CPU chip gets, the more it hurts to have to go out to DRAM. They are saying that some programs will run faster WITH the extra routine calls and unexploded branches (or whatever -O3 replaces with bigger things), because without -O3 they will still fit in the cache, and that's a bigger win than the -O3 transformations.

On the other issue, I wouldn't normally build anything with -g unless I was currently working on it.

DigitalRoss
+1  A: 

Hi Tim, in my experience GCC does not generate the best assembly with -O2 or -O3. The best way is to apply specific optimization flags, which you can find in the list linked below. This will definitely generate better code than -O2 and -O3 alone, because there are flags that are not included in -O2 and -O3 and that can make your code faster. One good example is that code and data prefetch instructions will never be inserted into your code with -O2 and -O3, but using the additional prefetching flags will make memory-intensive code 2 to 3% faster.

You can find the list of GCC optimization flags here:

http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
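To give an idea of what data prefetching looks like, here is a small sketch. The compiler-side option on the page above is `-fprefetch-loop-arrays`; the code below instead inserts the hint by hand with GCC's `__builtin_prefetch`. The function names and the `PREFETCH_DISTANCE` value are hypothetical and chosen only for illustration; a useful distance depends on your CPU and access pattern, and for a simple sequential loop like this one the hardware prefetcher may already do the job.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tuning knob: how many elements ahead to prefetch. */
#define PREFETCH_DISTANCE 16

/* Baseline: plain loop with no prefetch hints. */
long sum_plain(const long *data, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += data[i];
    return total;
}

/* Same loop, with an explicit read-prefetch hint a few elements ahead. */
long sum_prefetch(const long *data, size_t n)
{
    long total = 0;
    assert(data != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        /* Only hint at addresses that are still inside the array. */
        if (i + PREFETCH_DISTANCE < n)
            __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0, 3);
        total += data[i];
    }
    return total;
}
```

Both functions return the same result; only the timing can differ, so measure both variants on your own data before keeping either one.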

Thanks,

Sunny

Sunny