I am converting a number of low-level operations from native MATLAB code into C/MEX code, with great speedups. (These low-level operations can be vectorized in .m code, but I think I take memory hits because of the large data. Whatever.) I have noticed that compiling the MEX code with different CFLAGS can yield mild improvements. For example, CFLAGS = -O3 -ffast-math does indeed give some speedup, at the cost of mild numerical inaccuracy.
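To make the "mild numerical inaccuracy" concrete, here is a minimal sketch (not taken from the poster's code) of one well-known case where -ffast-math can change results: Kahan compensated summation. The function names are hypothetical. The correction term is algebraically zero, so a compiler that is allowed to reassociate floating-point arithmetic may simplify it away and leave you with a plain sum.

```c
#include <stddef.h>

/* Plain left-to-right summation: rounding error accumulates with n. */
double naive_sum(const double *x, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; ++i)
        s += x[i];
    return s;
}

/* Kahan (compensated) summation: c carries the low-order bits
 * that were lost when the previous term was added to s. */
double kahan_sum(const double *x, size_t n)
{
    double s = 0.0;
    double c = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double y = x[i] - c;   /* apply the correction from the last step */
        double t = s + y;      /* low-order bits of y may be lost here */
        c = (t - s) - y;       /* recover what was lost; algebraically zero,
                                  so reassociating optimizations may drop it */
        s = t;
    }
    return s;
}
```

Summing a large array of identical non-representable values (say, many copies of 0.1) with both functions, once with and once without -ffast-math, is a quick way to see whether the flag matters for your own kernels.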

My question: what are the "best" CFLAGS to use without incurring too many other side effects? It seems that, at the very least, CFLAGS = -O3 -fno-math-errno -fno-unsafe-math-optimizations -fno-trapping-math -fno-signaling-nans is OK. I'm not sure about -funroll-loops.

Also, how would you optimize the set of CFLAGS semi-automatically, without going nuts?
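One way to approach the semi-automatic part, sketched under assumptions: keep a tiny standalone timing harness, build it once per candidate flag set, and compare both the best time and the numerical result against a baseline build. The dot-product kernel below is a hypothetical stand-in for one of the real low-level operations, and the names kernel_dot and best_time are made up for illustration. The loop over flag sets would live in a makefile or script, which is not shown.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for one of the low-level operations. */
double kernel_dot(const double *a, const double *b, size_t n)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; ++i)
        acc += a[i] * b[i];
    return acc;
}

/* Run the kernel reps times (reps >= 1) and return the best CPU time
 * in seconds. The last kernel value is stored in *result so it can be
 * compared across builds and so the call is not optimized away. */
double best_time(const double *a, const double *b, size_t n,
                 int reps, double *result)
{
    double best = -1.0;
    for (int r = 0; r < reps; ++r) {
        clock_t t0 = clock();
        double v = kernel_dot(a, b, n);
        clock_t t1 = clock();
        double secs = (double)(t1 - t0) / CLOCKS_PER_SEC;
        if (best < 0.0 || secs < best)
            best = secs;
        *result = v;
    }
    return best;
}
```

Taking the best of several repetitions rather than the mean reduces noise from other processes. Comparing the stored result between builds shows whether a flag set has changed the numerics as well as the speed.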

A: 

If you know the target CPU, or are at least willing to guarantee a "minimum" CPU, you should definitely look into -mcpu and -march. (On x86, GCC treats -mcpu as a deprecated synonym for -mtune.)

The performance gain can be significant.
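As a hedged aid for checking what a given flag set actually turned on, the sketch below reports a few predefined macros that GCC and Clang set at compile time (for example __AVX2__ when -march enables AVX2, or __FAST_MATH__ under -ffast-math). The function names are hypothetical, and the macro list is only a small x86-oriented sample, not a complete one.

```c
#include <stddef.h>
#include <string.h>

/* Append name to buf (space-separated) if it still fits in len bytes. */
static void append_name(char *buf, size_t len, const char *name)
{
    size_t used = strlen(buf);
    if (used + strlen(name) + 2 <= len) {
        if (used > 0)
            strcat(buf, " ");
        strcat(buf, name);
    }
}

/* Fill buf with the compile-time features this translation unit was
 * built with, as seen through compiler-predefined macros. Returns buf. */
const char *build_features(char *buf, size_t len)
{
    if (len == 0)
        return buf;
    buf[0] = '\0';
#ifdef __OPTIMIZE__
    append_name(buf, len, "optimize");
#endif
#ifdef __FAST_MATH__
    append_name(buf, len, "fast-math");
#endif
#ifdef __NO_MATH_ERRNO__
    append_name(buf, len, "no-math-errno");
#endif
#ifdef __SSE2__
    append_name(buf, len, "sse2");
#endif
#ifdef __AVX__
    append_name(buf, len, "avx");
#endif
#ifdef __AVX2__
    append_name(buf, len, "avx2");
#endif
#ifdef __FMA__
    append_name(buf, len, "fma");
#endif
    return buf;
}
```

Printing this string from a test build makes it easy to confirm that the flags you think you passed actually reached the compiler.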

Shmoopty
I like this in principle, but have not been able to test it properly yet. Will do.
shabbychef
+1  A: 

Whatever ATLAS uses on your machine (http://math-atlas.sourceforge.net/) is probably a good starting point. I don't know whether ATLAS tunes compiler flags automatically, but the developers have probably spent a fair amount of time doing so by hand.

thouis