Is there a method to automatically find the best compiler options (on a given machine), which result in the fastest possible executable?

Naturally, I use g++ -O3, but there are additional flags that may make the code run faster, e.g. -ffast-math and others, some of which are hardware-dependent.

Does anyone know some code I can put in my configure.ac file (GNU autotools), so that the flags will be added to the Makefile automatically by the ./configure command?

In addition to automatically determining the best flags, I would be interested in some useful compiler flags that are good to use as a default for most optimized executables.

Update: Most people suggest just trying different flags and selecting the best ones empirically. For that method, I have a follow-up question: Is there a utility that lists all compiler flags supported by the machine I'm running on (e.g. one that tests whether SSE instructions are available)?

A: 

Is there a method to automatically find the best compiler options (on a given machine), which result in the fastest possible executable?

No.

You could compile your program with a large assortment of compiler options, then benchmark each and every version, then select the one that is "fastest," but that's hardly reliable and probably not useful for your program.
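The brute-force approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`flag_combinations`, `build_and_time`, `fastest`) and the output path `./bench_candidate` are made up for this example, and the benchmark is assumed to be an executable that takes no arguments and finishes quickly.

```python
import itertools
import subprocess
import time


def flag_combinations(optional_flags):
    """Yield every subset of optional_flags as a tuple (2**n subsets in total)."""
    for r in range(len(optional_flags) + 1):
        yield from itertools.combinations(optional_flags, r)


def build_and_time(source, flags, compiler="g++", runs=3):
    """Compile source with -O3 plus flags; return the best wall time in seconds."""
    exe = "./bench_candidate"  # hypothetical output path for the test binary
    subprocess.run([compiler, "-O3", *flags, source, "-o", exe], check=True)
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run([exe], check=True)
        best = min(best, time.perf_counter() - start)
    return best


def fastest(timings):
    """Return the flag tuple with the smallest measured time."""
    return min(timings, key=timings.get)
```

A caller would build a dictionary such as `{flags: build_and_time("main.cpp", flags) for flags in flag_combinations(["-ffast-math", "-funroll-loops"])}` (with `main.cpp` standing in for the real program) and pass it to `fastest`. The number of builds doubles with every optional flag, and timing noise can easily exceed the difference between two flag sets, which is part of why the result is hardly reliable.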

greyfade
Which, BTW, is precisely what Acovea (mentioned by @ergosys) does: compile and benchmark the program hundreds, even thousands of times (which is why the program has to be simple and the benchmarks short) with different combinations of GCC optimization flags and "evolve" a good set of flags using a genetic algorithm.
Jörg W Mittag
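To make the "evolve a good set of flags" idea concrete, here is a toy genetic search over flag subsets. It is not Acovea's actual algorithm, only a minimal sketch: `evolve_flags` and the `cost` callback are hypothetical names, `cost(flags)` is assumed to compile and benchmark the program and return a time (lower is better), and the sketch assumes at least one candidate flag and a population of at least four.

```python
import random


def evolve_flags(candidates, cost, generations=20, population_size=8, seed=0):
    """Search subsets of candidates for a low cost(flags) with a tiny genetic algorithm."""
    rng = random.Random(seed)
    n = len(candidates)
    cache = {}  # each flag set is measured only once

    def decode(mask):
        return tuple(flag for flag, on in zip(candidates, mask) if on)

    def score(mask):
        if mask not in cache:
            cache[mask] = cost(decode(mask))
        return cache[mask]

    # Start with the empty flag set plus random individuals.
    population = [(False,) * n]
    population += [tuple(rng.random() < 0.5 for _ in range(n))
                   for _ in range(population_size - 1)]

    for _ in range(generations):
        # Elitism: the better half survives unchanged.
        survivors = sorted(population, key=score)[: population_size // 2]
        children = []
        while len(survivors) + len(children) < population_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(n + 1)
            child = list(a[:cut] + b[cut:])  # one-point crossover
            i = rng.randrange(n)
            child[i] = not child[i]          # flip one flag as mutation
            children.append(tuple(child))
        population = survivors + children

    return decode(min(population, key=score))
```

Because the best individuals are always kept, the result is never worse than the empty flag set that seeds the population. Each `cost` call would be a full compile-and-benchmark cycle in practice, which is why the benchmarks have to be short.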
+2  A: 

Some compilers provide a "-fast" option that automatically selects the most aggressive optimizations for the compilation host. See http://en.wikipedia.org/wiki/Intel_C%2B%2B_Compiler

Unfortunately, g++ does not provide similar flags.

As a follow-up to your next question: for g++ you can use the -mtune option together with -O3, which will give you reasonably fast defaults. The challenge then is to find the processor type of your compilation host. You may want to look in the Autoconf Macro Archive to see whether somebody has already written the necessary tests. Otherwise, assuming Linux, you have to parse /proc/cpuinfo to get the processor type.
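Parsing /proc/cpuinfo might look roughly like the following. This is a sketch that assumes the x86 Linux layout with `model name` and `flags` fields (other architectures use different field names); `parse_cpuinfo` and `simd_flags` are illustrative helpers, and the mapping covers only a handful of instruction-set extensions.

```python
def parse_cpuinfo(text):
    """Return (model name, set of feature flags) for the first CPU listed."""
    model, flags = None, set()
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between CPUs
        key, value = key.strip(), value.strip()
        if key == "model name" and model is None:
            model = value
        elif key == "flags" and not flags:
            flags = set(value.split())
    return model, flags


def simd_flags(cpu_flags):
    """Map a few /proc/cpuinfo feature names to the matching g++ -m options."""
    mapping = {"sse2": "-msse2", "sse4_2": "-msse4.2", "avx": "-mavx", "avx2": "-mavx2"}
    return [opt for name, opt in mapping.items() if name in cpu_flags]
```

On a Linux machine the input would come from `open("/proc/cpuinfo").read()`. The model name string still has to be translated by hand into a value that -mtune accepts.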

aaa
+4  A: 

I don't think you can do this at configure-time, but there is at least one program which attempts to optimize gcc option flags given a particular executable and machine. See http://www.coyotegulch.com/products/acovea/ for example.

You might be able to use this with some knowledge of your target machine(s) to find a good set of options for your code.

ergosys
Ditto for ATLAS (Automatically Tuned Linear Algebra Software), an implementation of BLAS/LAPACK. See http://math-atlas.sourceforge.net/
celion
+4  A: 

Um - yes. This is possible. Look into profile-guided optimization.
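With GCC, a basic profile-guided build is a three-step cycle using -fprofile-generate and -fprofile-use. The sketch below only assembles the commands; `pgo_commands` is a hypothetical helper, and the source file, executable name and training arguments are placeholders. Note that this does not pick flags for you: it gives the optimizer measured data about which paths are hot.

```python
def pgo_commands(source, exe, training_args=(), compiler="g++"):
    """Return the three argv lists of a basic GCC profile-guided build."""
    return [
        # 1. Instrumented build: the binary records execution counts when run.
        [compiler, "-O3", "-fprofile-generate", source, "-o", exe],
        # 2. Training run on representative input; writes the profile data files.
        ["./" + exe, *training_args],
        # 3. Final build: the optimizer reads the recorded profile.
        [compiler, "-O3", "-fprofile-use", source, "-o", exe],
    ]
```

Each list can be passed to `subprocess.run(cmd, check=True)` in order. The quality of the result depends on how representative the training run is of real workloads.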

James D
+1  A: 

After some googling, I found this script: gcccpuopt.

On one of my machines (32bit), it outputs:

-march=pentium4 -mfpmath=sse

On another machine (64bit) it outputs:

$ ./gcccpuopt 
Warning: The optimum *32 bit* architecture is reported
-m32 -march=core2 -mfpmath=sse

So, it's not perfect, but might be helpful.

+2  A: 

See also the -march=native and -mtune=native gcc options (on x86, -mcpu is a deprecated synonym for -mtune).
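To see what "native" resolves to on a given machine, you can ask the compiler with `gcc -march=native -Q --help=target` and read the output. The parser below is a rough sketch: `parse_target_help` is a made-up helper, and it assumes each relevant output line consists of an option followed by a single value, a layout that may differ between GCC versions.

```python
def parse_target_help(output):
    """Parse `gcc -Q --help=target` output into {option: value}."""
    settings = {}
    for line in output.splitlines():
        parts = line.split()
        # Keep lines of the form "<-moption> <value>", skip headers and blanks.
        if len(parts) == 2 and parts[0].startswith("-m"):
            settings[parts[0]] = parts[1]
    return settings
```

The text to parse could be obtained with `subprocess.run(["gcc", "-march=native", "-Q", "--help=target"], capture_output=True, text=True).stdout`. The entry for `-march=` then names the detected architecture, and entries such as `-msse2` show whether an extension is enabled.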

wRAR
Cool, I'll try that. That is new in GCC 4.2, so I'll have to update ...
A: 

This is a solution that works for me, but it does take a little while to set up. "Python Scripting for Computational Science" by Hans Petter Langtangen (an excellent book in my opinion) gives an example of using a short Python script to run numerical experiments that determine the best compiler options for your C/Fortran/... program. This is described in Chapter 1.1.11 on "Nested Heterogeneous Data Structures".

Source code for the examples from the book is freely available at http://folk.uio.no/hpl/scripting/index.html (I'm not sure of the license, so I will not reproduce any code here). In particular, TCSE3-3rd-examples.tar.gz contains a similar numerical test in the file src/app/wavesim2D/F77/compile.py, which you could use as a base for writing a script appropriate for your particular system and language (C++ in your case).

Nathan
A: 

Optimizing your app is mainly your job, not the compiler's.

Here's an example of what I'm talking about.

Once you've done that, IF your app is compute-bound, with hotspots in your code (not in library code) THEN the compiler optimizations for speed will make some difference, so you can try different flag combinations.

Mike Dunlavey