Background:

Many years ago, I inherited a codebase that was using the Visual Studio (VC++) flag '/fp:fast' to produce faster code in a particular calculation-heavy library. Unfortunately, '/fp:fast' produced results that were slightly different from those of the same library under a different compiler (Borland C++). As we needed to produce exactly the same results, I switched to '/fp:precise', which worked fine, and everything has been peachy ever since. However, now I'm compiling the same library with g++ on Ubuntu Linux 10.04 and I'm seeing similar behavior, and I wonder if it might have a similar root cause. The numerical results from my g++ build are slightly different from the numerical results from my VC++ build. This brings me to my question:

Question:

Does g++ have equivalent or similar parameters to the '/fp:fast' and '/fp:precise' options in VC++? (And what are they? I want to activate the '/fp:precise' equivalent.)

More Verbose Information:

I compile using 'make', which calls g++. So far as I can tell (the makefiles are a little cryptic and weren't written by me), the only parameters added to the g++ call are the "normal" ones (include folders and the files to compile) and -fPIC (I'm not sure what this switch does; I don't see it on the man page).

The only relevant parameters in 'man g++' seem to be for turning optimization options ON (e.g. -funsafe-math-optimizations). However, I don't think I'm turning anything ON; I just want to turn the relevant optimization OFF.

I've tried Release and Debug builds: VC++ gives the same results for Release and Debug, and g++ gives the same results for Release and Debug, but I can't get the g++ version to give the same results as the VC++ version.

+2  A: 

I don't think there's an exact equivalent. You might try -mfpmath=sse instead of the default -mfpmath=387 to see if that helps.

JSBangs
I'll try this solution!
Boinst
That didn't help :( but thanks for the idea!
Boinst
+8  A: 

From the GCC manual:

-ffloat-store Do not store floating point variables in registers, and inhibit other options that might change whether a floating point value is taken from a register or memory.

This option prevents undesirable excess precision on machines such as the 68000 where the floating registers (of the 68881) keep more precision than a double is supposed to have. Similarly for the x86 architecture. For most programs, the excess precision does only good, but a few programs rely on the precise definition of IEEE floating point. Use -ffloat-store for such programs, after modifying them to store all pertinent intermediate computations into variables.

To expand a bit, most of these discrepancies come from the use of the x86 80-bit floating point registers for calculations (vs. the 64 bits used to store double values). If intermediate results are kept in the registers without being written back to memory, you effectively get extra precision in your calculations (the x87 extended format carries a 64-bit significand, vs. 53 bits for a double), making them more precise but possibly divergent from results generated with a write/read of intermediate values to memory (or from calculations on architectures that only have 64-bit FP registers).

These flags (in both GCC and MSVC) generally force the rounding of each intermediate result to 64-bit double precision, thereby making calculations insensitive to the vagaries of code generation, optimization, and platform differences. This consistency generally comes at a slight runtime cost, in addition to the cost in terms of accuracy/precision.

Drew Hall
I'll try this right away
Boinst
This didn't turn out to be the problem, but thank you for your well thought out response. Your link was very useful.
Boinst
Sorry to hear it. Maybe it has something to do with rounding modes? You might take a look at the generated assembly for a minimal program for which you get divergent results and see if the FPU is set up differently in MSVC vs. GCC. Then you can try to map those FPU settings to different compiler flags until you find the magic flag.
Drew Hall
+5  A: 

Excess register precision is an issue only on FPU registers, which compilers (with the right enabling switches) tend to avoid anyway. When floating point computations are carried out in SSE registers, the register precision equals the memory one.

In my experience most of the /fp:fast impact (and potential discrepancy) comes from the compiler taking the liberty to perform algebraic transforms. This can be as simple as changing summands order:

(a + b) + c --> a + (b + c)

can be distributing multiplications like a*(b+c) at will, and can extend to some rather complex transforms - all intended to reuse previous calculations. In infinite precision such transforms are benign, of course - but in finite precision they actually change the result. As a toy example, try the summand-order example in single precision with a = b = 2^(-24), c = 1: one evaluation order yields exactly 1, the other 1 + 2^(-23). MS's Eric Fleegal describes it in much more detail.

In this respect, the gcc switch nearest to /fp:precise is -fno-unsafe-math-optimizations. It is the default (unsafe transforms must be enabled explicitly, e.g. via -ffast-math) - perhaps you can try setting it explicitly and see if it makes a difference. Similarly, you can try explicitly negating each of the -ffast-math optimizations: -fno-finite-math-only, -fmath-errno, -ftrapping-math, -frounding-math and -fsignaling-nans (the last two are not the defaults!)

Ofek Shilon
Thanks for the idea! 'no-unsafe-math-optimizations' is set by default, but I did try setting it explicitly as you suggested, and it didn't make a difference.
Boinst
I've "accepted" this answer for now, I think it answers my question even though unfortunately for me it did not solve my problem. :(
Boinst
Sorry to hear that, hope someone else might have a better idea.
Ofek Shilon
+1  A: 

This is definitely not related to optimization flags, assuming by "Debug" you mean "with optimizations off." If g++ gives the same results in debug as in release, that means it's not an optimization-related issue. Debug builds should always store each intermediate result in memory, thereby guaranteeing the same results as /fp:precise does for MSVC.

This likely means there is (a) a compiler bug in one of the compilers, or more likely (b) a math library bug. I would drill into individual functions in your calculation and narrow down where the discrepancy lies. You'll likely find a workaround at that point, and if you do find a bug, I'm sure the relevant team would love to hear about it.

Drew Hoskins
Thanks Drew, I think you are right, and I will do as you suggest
Boinst
A: 

-mpc32 or -mpc64? (These set the x87 FPU's precision-control field at program startup, rounding significands to 24 or 53 bits respectively.)

But you may need to recompile the C and math libraries with the switch to see the difference... This may apply to the options others suggested as well.

Tomek