I'm developing a performance-critical application for an Intel Atom processor.

What are the best GCC optimization flags for this CPU?

+17  A: 

There is a cool framework called Acovea (Analysis of Compiler Options via Evolutionary Algorithm), by Scott Robert Ladd, one of the GCC hackers. It's a genetic/evolutionary algorithm framework that tries to optimize GCC optimization flags for a specific piece of code via natural selection.

It works something like this: you write a little piece of benchmark code (it really has to be little, because it will be recompiled and executed several thousand times) that represents the performance characteristics of the larger program you want to optimize. Then Acovea randomly constructs some dozens of different GCC command lines and compiles and runs your benchmark with each of them. The best of these command lines are then allowed to "mate" and "breed" new "children" which (hopefully) inherit the best "genes" from their "parents". This process is repeated for a couple dozen "generations", until a stable set of command-line flags emerges.
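
To make the loop above concrete, here is a minimal Python sketch of the same idea. It is not Acovea's code or its configuration format: the flag list is an illustrative subset of real GCC options, and `toy_runtime` is a made-up stand-in for "compile the benchmark with these flags and time it", so the numbers in `TOY_EFFECT` are invented, not measurements.

```python
import random

# Illustrative subset of GCC flags to switch on or off (an assumption,
# not the list Acovea actually searches).
FLAGS = [
    "-fomit-frame-pointer",
    "-funroll-loops",
    "-ftree-vectorize",
    "-ffast-math",
    "-finline-functions",
    "-mfpmath=sse",
]

# Invented per-flag effect on runtime in seconds (negative = faster).
# A real run would compile and time the benchmark instead.
TOY_EFFECT = {
    "-fomit-frame-pointer": -0.4,
    "-funroll-loops": 0.3,
    "-ftree-vectorize": -0.8,
    "-ffast-math": -0.5,
    "-finline-functions": 0.1,
    "-mfpmath=sse": -1.0,
}


def random_genome(rng):
    """One 'individual': an on/off choice for every flag."""
    return [rng.random() < 0.5 for _ in FLAGS]


def to_command_line(genome, base=("gcc", "-O1")):
    """Turn a genome into the compiler command line it stands for."""
    return list(base) + [flag for flag, on in zip(FLAGS, genome) if on]


def toy_runtime(genome):
    """Stand-in fitness: lower is better, like a measured runtime."""
    return 10.0 + sum(TOY_EFFECT[f] for f, on in zip(FLAGS, genome) if on)


def crossover(a, b, rng):
    """Child takes each 'gene' from one parent or the other."""
    return [rng.choice(pair) for pair in zip(a, b)]


def mutate(genome, rng, rate=0.05):
    """Occasionally flip a flag so new combinations keep appearing."""
    return [(not g) if rng.random() < rate else g for g in genome]


def evolve(fitness, population=20, generations=25, seed=0):
    rng = random.Random(seed)
    pop = [random_genome(rng) for _ in range(population)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness)      # fastest first
        parents = ranked[: population // 4]    # survivors are kept as-is
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(population - len(parents))
        ]
        pop = parents + children
    return min(pop, key=fitness)


if __name__ == "__main__":
    best = evolve(toy_runtime)
    print(" ".join(to_command_line(best)), toy_runtime(best))
```

To use it for real you would replace `toy_runtime` with a function that runs the compiler on your benchmark and returns the measured time, which is exactly why the benchmark has to be small.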

Jörg W Mittag
+1  A: 

I don't know if GCC has any Atom-specific optimization flags yet, but the Atom core is supposed to be very similar to the original Pentium, with the very significant addition of the MMX/SSE/SSE2/SSE3/SSSE3 instruction sets. Of course, these only make a significant difference if your code is floating-point or DSP-heavy.

Perhaps you could try:

gcc -O2 -march=pentium -mmmx -msse -msse2 -msse3 -mssse3 -mfpmath=sse

Dan
While the Atom is comparable to the Pentium in that it is an in-order architecture, the pipeline structure is very different, and scheduling the instructions for the Pentium would probably be quite bad for performance.
Agreed, you do *not* want to be using -march=pentium for anything other than a real Pentium.
kquinn
+2  A: 

Just like for Pentium 4: -march=prescott -O2 -pipe -fomit-frame-pointer

+3  A: 

Well, the Gentoo wiki recommends the prescott settings for the Atom N270:

http://en.gentoo-wiki.com/wiki/Safe_Cflags/Intel#Atom_N270

CHOST="i686-pc-linux-gnu"

CFLAGS="-march=prescott -O2 -pipe -fomit-frame-pointer"

CXXFLAGS="${CFLAGS}"

Not any longer: http://en.gentoo-wiki.com/wiki/Safe_Cflags/Intel#Atom_N270.2FN280 (updated link) now recommends CFLAGS="-O2 -march=core2 -mtune=generic -mssse3 -mfpmath=sse -fomit-frame-pointer -pipe"
Marcel Korpel
...and CFLAGS="-O2 -march=atom -mssse3 -mfpmath=sse -fomit-frame-pointer -pipe" for GCC 4.5
Marcel Korpel
+8  A: 

I've a script that auto selects the appropriate flags for your CPU and compiler combination. I've just updated it to support Intel Atom:

http://www.pixelbeat.org/scripts/gcccpuopt

Update: I previously specified -march=prescott for Atom, but looking into it further shows that Atom is Merom ISA compliant, so -march=core2 is more appropriate. Note, however, that Atoms are in-order cores, the last of those before Atom being the original Pentium. It's therefore probably better to use -mtune=pentium as well. Unfortunately I don't have an Atom to test on. I would really appreciate it if anyone could benchmark the difference between:

-march=core2 -mfpmath=sse -O3
-march=core2 -mtune=pentium -mfpmath=sse -O3
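
For anyone who wants to run that comparison, a rough Python harness could look like the sketch below. The source file `bench.c`, the output names, and the helper functions are placeholders I made up; only the two flag sets come from the lines above. As the comments further down report, the `-mtune=pentium` variant may refuse to compile on a 64-bit toolchain unless `-m32` is added.

```python
import subprocess
import time

# The two flag sets to compare, copied from the answer.
VARIANTS = {
    "core2": ["-march=core2", "-mfpmath=sse", "-O3"],
    "core2-tune-pentium": ["-march=core2", "-mtune=pentium", "-mfpmath=sse", "-O3"],
}


def compile_command(flags, source="bench.c", output="bench", compiler="gcc"):
    """Build the argv list for one compile (file names are placeholders)."""
    return [compiler, *flags, "-o", output, source]


def best_time(argv, repeats=5):
    """Run a command several times and keep the fastest wall-clock time."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(argv, check=True)
        best = min(best, time.perf_counter() - start)
    return best


def compare(source="bench.c"):
    """Compile the benchmark once per variant and time each binary."""
    results = {}
    for name, flags in VARIANTS.items():
        exe = "./bench-" + name
        subprocess.run(compile_command(flags, source, exe), check=True)
        results[name] = best_time([exe])
    return results


if __name__ == "__main__":
    for name, seconds in compare().items():
        print(f"{name}: {seconds:.3f} s")
```

Taking the minimum of several runs is a simple way to reduce noise from other processes; it does not replace a proper benchmark of your own workload.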

Update: Here are a couple of nice articles on low level optimization for Atom:

pixelbeat
I don't think setting both `-march=core2` and `-mtune=pentium` works at all: I get `arg.c:1: error: CPU you selected does not support x86-64 instruction set`
orlandu63
Interesting. Does your Atom support 64-bit? If you try the above script, it will probably tell you to also add -m32.
pixelbeat
@pixelbeat Yes, my Atom supports 64-bit.
orlandu63
Well, the more interesting answer would be whether -m32 suppresses the error message, and whether gcccpuopt outputs -m32 for your CPU.
pixelbeat
Executing `gcc44 -march=core2 -mtune=pentium -m32 -o lol lol.c` on a minimal C file exits with an error about not being able to find -lgcc, and `gcccpuopt` tells me `-m32 -march=core2 -mtune=pentium -mfpmath=sse` is the optimal configuration.
orlandu63
A: 

Here's some cross-pollination of blogs... What I was really hoping for was a Firefox-compiled-for-Atom benchmark...

http://ivoras.sharanet.org/blog/tree/2009-02-11.optimizing-for-atom.html

"As it turns out, gcc appears to do a very decent job with -mtune=native, and mtune=generic is more than acceptable. The biggest gains (in this math-heavy benchmark) come from using SSE for math, but even they are destroyed by tuning for pentium4.

"The difference between the fastest and the slowest optimization is 21%. The impact of using march instead of mtune is negligible (not enough difference to tell if it helps or not).

"(I've included k6 just for reference - I know Atom doesn't have 3dnow)

"Late update: Tuning for k8 (with SSE and O3) yields a slightly higher best score of 182."

+3  A: 

From Intel, Getting Started with MID

When using GCC to compile, there are a few recommended flags to use:

  • -O2 or -O1: the -O2 flag optimizes for speed, while the -O1 flag optimizes for size
  • -msse3
  • -march=core2
  • -mfpmath=sse
Marc
A: 

i686 is closest. Don't go for core2.

GCC 4.1: -O3 -march=i686
GCC 4.3: -O3 -march=native

GCC 4.1: -O4 -ffast-math
GCC 4.3: -O4 -ffast-math

http://macles.blogspot.com/2008/09/intel-cc-compiler-gcc-and-intel-atom.html

brij
+7  A: 

GCC 4.5 will contain the -march=atom and -mtune=atom options.

Source: http://gcc.gnu.org/gcc-4.5/changes.html

When using GCC 4.5 you'll also want to use -fexcess-precision=fast (see http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42376 for details)
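
If a build script has to work with both older and newer compilers, the choice can be keyed on the GCC version. The Python sketch below is only an illustration: `parse_version` and `atom_cflags` are hypothetical helper names, the two base flag sets are the ones quoted from the Gentoo wiki in the comments above, and appending -fexcess-precision=fast for 4.5 and later is my reading of the advice in this answer, not something I have benchmarked.

```python
def parse_version(text):
    """Turn a version string such as '4.5.1' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def atom_cflags(gcc_version):
    """Pick Atom flags for a GCC version given as a tuple like (4, 5, 1).

    Assumption: -march=atom is used from GCC 4.5 onward, and the
    core2/generic combination is the fallback for older releases.
    """
    if tuple(gcc_version[:2]) >= (4, 5):
        return [
            "-O2", "-march=atom", "-mssse3", "-mfpmath=sse",
            "-fomit-frame-pointer", "-pipe",
            "-fexcess-precision=fast",
        ]
    return [
        "-O2", "-march=core2", "-mtune=generic", "-mssse3",
        "-mfpmath=sse", "-fomit-frame-pointer", "-pipe",
    ]


if __name__ == "__main__":
    # In a real script the string could come from `gcc -dumpversion`.
    for version in ("4.4.3", "4.5.1"):
        print(version, " ".join(atom_cflags(parse_version(version))))
```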
Marcel Korpel
A: 

What about the Intel C compiler (icc)? At least on the benchmarks that come with it, its lead over GCC is quite noticeable...

peter karasev