While porting Java code to C++ I've stumbled over some weird behaviour: I can't get double addition to work, even though the compiler option /fp:strict (which means "correct" floating-point math) is set in Visual Studio 2008.

double a = 0.4;
/* a: 0.40000000000000002, correct */

double b = 0.0 + 0.4;
/* b: 0.40000000596046448, incorrect
(0 + 0.4 is the same). It's not even close to correct. */

double c = 0;  
float f = 0.4f;  
c += f;
/* c: 0.40000000596046448 too */

In a different test project I set up it works fine (/fp:strict behaves according to IEEE754).

Using Visual Studio 2008 (Standard) with no optimization and /fp:strict.

Any ideas? Is it really truncating to floats? This project really needs the same behaviour on both the Java and the C++ side. I got all values by reading them from the debug window in VC++.

Solution: _fpreset(); // Barry Kelly's idea solved it. A library had lowered the FP precision.

+7  A: 

The only thing I can think of is perhaps you are linking against a library or DLL which has modified the CPU precision via the control word.

Have you tried calling _fpreset() from float.h before the problematic computation?
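A minimal sketch of that suggestion, in case it helps. The helper names reset_fp_state and add_at_runtime are made up for illustration; _fpreset() itself is the MSVC runtime function declared in float.h. The _MSC_VER guard is my own assumption so that the snippet also compiles on other toolchains, where the reset simply does nothing.

```cpp
#include <cassert>
#include <float.h>   // declares _fpreset() on MSVC

// Hypothetical helper: restore the default floating-point state on MSVC.
// On other compilers this is a no-op (assumption, only for portability).
void reset_fp_state() {
#ifdef _MSC_VER
    _fpreset();
#endif
}

// Hypothetical helper: add two doubles at run time after resetting the
// floating-point state. The volatile copies keep the compiler from
// folding the addition into a constant at compile time.
double add_at_runtime(double x, double y) {
    reset_fp_state();
    volatile double vx = x;
    volatile double vy = y;
    return vx + vy;
}
```

With the default precision in effect, a check such as assert(add_at_runtime(0.0, 0.4) == 0.4); should hold. If it only holds after the reset, something in the process has changed the control word.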

Barry Kelly
It must be something along those lines, but is 0.0 + 0.4 even a computation? Can't it be evaluated at compile time? Checking the disassembly might establish whether the runtime float mode has anything to do with it, or whether something has gone wrong at compile time.
Steve Jessop
Sure it can be, but if it were that simple, it would be easy to reproduce, no?
Barry Kelly
I dunno, maybe something else in the project is specifying /fp:stupid or equivalent. My personal favourite would be that a source file isn't newline-terminated and therefore the program has undefined behaviour, although I don't hold out much hope of ever seeing that cause a bug in the wild...
Steve Jessop
Spot on! It was a library setting the FP precision to 32-bit (a DirectX DLL in my case). Thanks! :)
+3  A: 

Yes, it's certainly truncating to floats. I get the same value printing float f = 0.4 as you do in the "inaccurate" case. Try:

double b = 0.0 + (double) 0.4;

The question then is why it's truncating to floats. There's no excuse in the standard for treating 0.0 + 0.4 as a single-precision expression, since floating point literals are double-precision unless they have a suffix to say otherwise.

So something must be interfering with your settings, but I have no idea what.
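To reproduce the comparison, here is a small sketch that prints values with 17 digits after the decimal point, which is the precision the quoted figures use. The helper name fmt17 is made up for illustration, and std::snprintf assumes a C++11 standard library (on VS2008 itself, sprintf would be the closest equivalent).

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: format a double with 17 digits after the decimal point.
std::string fmt17(double v) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%.17f", v);
    return std::string(buf);
}

// The double literal 0.4, as the nearest double-precision value.
std::string as_double() {
    return fmt17(0.4);
}

// The float literal 0.4f widened to double: the extra digits show the
// rounding error that single precision introduces.
std::string as_float() {
    return fmt17(static_cast<double>(0.4f));
}
```

On an IEEE 754 platform as_double() gives 0.40000000000000002 and as_float() gives 0.40000000596046448, the same two values reported in the question.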

Steve Jessop