I'm porting my application from 32 bit to 64 bit. Currently, the code compiles under both architectures, but the results are different. For various reasons, I'm using floats instead of doubles. I assume that there is some implicit upconverting from float to double happening on one machine and not the other. Is there a way to control for this, or specific gotchas I should be looking for?

edited to add:

32 bit platform

 gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33)
 Dual-Core AMD Opteron(tm) Processor 2218 HE

64 bit platform

 gcc (Ubuntu 4.3.3-5ubuntu4) 4.3.3
 Intel(R) Xeon(R) CPU

Applying -mfpmath=387 helps somewhat: after one iteration of the algorithm the values are the same, but beyond that they fall out of sync again.

I should also add that my concern isn't that the results aren't identical; it's that porting to a 64-bit platform has uncovered a 32-bit dependency of which I was not aware.

+3  A: 

There is no inherent need for floats and doubles to behave differently between 32-bit and 64-bit code, but frequently they do. The answer to your question is going to be platform- and compiler-specific, so you need to say what platform you are porting from and what platform you are porting to.

On Intel x86 platforms, 32-bit code often uses the x87 co-processor instruction set and floating-point register stack for maximum compatibility, whereas on amd64/x86_64 platforms the SSE* instructions and xmm* registers are often used instead. These have different precision characteristics.

Post edit:

Given your platform, you might want to consider trying -mfpmath=387 (the default for i386 gcc) on your x86_64 build to see if this explains the differing results. You may also want to look at the settings for all the -fmath-* compiler switches to ensure that they match what you want in both builds.
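
As a rough illustration (not from the original answer), a toy program along these lines, built once with each setting, can show whether the extra x87 intermediate precision is what differs between your builds; the file name and constants are purely for demonstration:

    /* fpdiff.c -- hypothetical test case; compile it both ways and compare:
     *   gcc -O2 -m32 -mfpmath=387 fpdiff.c -o fp387   # x87, the i386 default
     *   gcc -O2 -m64 -mfpmath=sse fpdiff.c -o fpsse   # SSE, the x86_64 default
     */
    #include <stdio.h>

    int main(void)
    {
        volatile float a = 100000000.0f;  /* volatile keeps gcc from folding this at compile time */
        volatile float b = 0.5f;
        float c = (a + b) - a;            /* x87 holds the intermediate in an 80-bit register, so the
                                             0.5 survives; plain 32-bit SSE arithmetic loses it */
        printf("c = %g\n", c);            /* typically 0.5 with -mfpmath=387, 0 with -mfpmath=sse */
        return 0;
    }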

Charles Bailey
How do I find which -fmath- switches are on?
drewster
The defaults are all specified in the info pages. Try 'info gcc' then search for '-fmath'.
Charles Bailey
+6  A: 

Assuming x86-64, your compiler is probably using SSE instructions for most of its floating-point arithmetic on the 64-bit platform, whereas on the 32-bit platform it probably used the x87 FPU for many of its operations, for compatibility reasons.

SSE opcodes offer more registers and more consistent behavior (values always remain 32 or 64 bits in size), while the x87 FPU uses 80-bit intermediate values when possible. So you were most likely benefiting from this extra intermediate precision before. (Note that the extra precision can cause inconsistent results, e.g. x == y but cos(x) != cos(y), depending on where in the computation the rounding occurs!)

Since you are compiling with gcc, you might try -mfpmath=387 for your 64-bit build and see whether the results match your 32-bit results, to help narrow this down.
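
As a rough sketch of that kind of inconsistency (assuming gcc with its default excess-precision behavior; whether the two copies actually compare unequal depends on the compiler version, flags, and optimization level):

    #include <stdio.h>

    int main(void)
    {
        volatile float x = 1.0f, y = 3.0f;
        float q = x / y;              /* may stay in an 80-bit x87 register */
        volatile float qmem = x / y;  /* forced out to memory, rounded to 32 bits */

        /* With -mfpmath=387 the register copy can carry extra precision, so this
         * may print "different"; with -mfpmath=sse both values are plain 32-bit
         * floats and it prints "equal". */
        printf("%s\n", (q == qmem) ? "equal" : "different");
        return 0;
    }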

Edward Kmett
A: 

The GNU compiler has a lot of options related to floating-point numbers that can cause calculations to break under some circumstances. Search the GCC manual's optimization options page for the term "float" and you'll find them.

Brian
+3  A: 

Like others have said, you haven't provided enough information to tell exactly what's going on. But in a general sense, it seems you've been counting on some kind of floating point behavior that you shouldn't be counting on.

99 times out of 100 the problem is that you're comparing two floats for equality somewhere.

If the problem is simply that you're getting slightly different answers, you need to realize that neither one is "correct" -- some sort of rounding is going to be taking place no matter what architecture you're on. It's a matter of understanding the significant digits in your calculations, and being aware that any values you're coming up with are approximations to a certain degree.
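
For example, a tolerance-based comparison along these lines is the usual fix; this is only a sketch, and the helper name and tolerance values are illustrative rather than anything from the answer:

    #include <math.h>
    #include <stdio.h>

    /* Compare two floats within a relative and an absolute tolerance. */
    static int nearly_equal(float a, float b, float rel_tol, float abs_tol)
    {
        float diff = fabsf(a - b);
        if (diff <= abs_tol)          /* absolute test handles values near zero */
            return 1;
        return diff <= rel_tol * fmaxf(fabsf(a), fabsf(b));
    }

    int main(void)
    {
        float sum = 0.0f;
        int i;
        for (i = 0; i < 10; ++i)
            sum += 0.1f;              /* accumulates rounding error */

        printf("sum == 1.0f  -> %d\n", sum == 1.0f);                           /* usually 0 */
        printf("nearly_equal -> %d\n", nearly_equal(sum, 1.0f, 1e-5f, 1e-6f)); /* 1 */
        return 0;
    }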

Clyde
+1  A: 

The x87 FPU's 80-bit internal registers cause its floating-point results to differ slightly from FPUs that use 64 bits internally (as on x86_64). You will get different results between these units unless you force intermediate values out to memory or use other "strictfp"-style tricks, which carry a large performance hit.
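
One such trick, sketched below assuming gcc on x87, is to force intermediates through memory (gcc's -ffloat-store flag does something similar for every assignment):

    #include <stdio.h>

    /* Round a value down to genuine 32-bit float precision by forcing it
     * through memory; the store/reload discards any extra x87 register bits. */
    static float force_float(float x)
    {
        volatile float tmp = x;
        return tmp;
    }

    int main(void)
    {
        volatile float a = 1.0f, b = 3.0f;
        float q = force_float(a / b);   /* quotient reduced to plain 32-bit precision */
        printf("%.9g\n", q);
        return 0;
    }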

See also: http://stackoverflow.com/questions/644678/floating-point-rounding-when-truncating

And: http://docs.sun.com/source/806-3568/ncg_goldberg.html

Adam Goode
A: 

It's really hard to control a lot of this stuff.

For a start, the C standard allows operations on floats to be carried out in "double space" (wider precision) and converted back to float afterwards.

Intel processors have 80 bits of precision in the registers they use for many of these operations, and that gets dropped to 64 bits (or 32 for a float) when the value is stored to main memory. That means the value of a variable may appear to change for no apparent reason.
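
If it helps to confirm which evaluation mode a given build is using, C99's <float.h> exposes it; a quick check (just a sketch) looks like this:

    #include <float.h>
    #include <stdio.h>

    int main(void)
    {
        /* FLT_EVAL_METHOD describes how the compiler evaluates expressions:
         *   0 -> in each type's own precision (typical for SSE builds)
         *   1 -> float and double both evaluated as double
         *   2 -> everything evaluated as long double (the x87 80-bit registers)
         */
        printf("FLT_EVAL_METHOD     = %d\n", (int)FLT_EVAL_METHOD);
        printf("sizeof(long double) = %u\n", (unsigned)sizeof(long double));
        return 0;
    }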

You can use something like GMP (the GNU MP library) if you really care, and I'm sure there are other libraries that guarantee consistent results. Most of the time the amount of error/jitter generated is below the real-world resolution that you need.

Chris Arguin
A: 

On x64, the SSE2 instruction set is used, while in 32-bit apps, the x87 FPU is often the default.

The latter internally stores all floating-point values in an 80-bit format; the former uses plain 32-bit (and 64-bit) IEEE formats.

Apart from that, an important point to make is that you shouldn't rely on your floating-point math being identical across architectures.

Even if you use 32-bit builds on both machines, there's still no guarantee that Intel and AMD will yield identical results. Of course, when one of them runs a 64-bit build, you only add more uncertainty.

Relying on the precise results of a floating-point operation would almost always be a bug.

Enabling SSE2 on the 32-bit version as well would be a good start, but again, don't make assumptions about floating-point code. There is always a loss of precision, and it's a bad idea to assume that this loss is predictable, or that it can be reproduced between CPUs or between different builds.

jalf
A: 

The really hard part to accept is that both sets of results are correct. It is not fair to characterize the changes as anything but "different." Perhaps there is an emotional attachment to the older results, but there is no mathematical reason to prefer the 32-bit results over the 64-bit results.

Have you considered a change to use fixed point math for this application? Not only is fixed-point math stable across changes of chip, compiler, and libraries; in many cases it is faster than floating-point math too.
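
A minimal sketch of the fixed-point idea (Q16.16, i.e. integers scaled by 2^16); the type and helper names here are just illustrative:

    #include <stdint.h>
    #include <stdio.h>

    typedef int32_t fixed16;                       /* Q16.16: 16 integer bits, 16 fraction bits */

    #define FIX_ONE (1 << 16)

    static fixed16 fix_from_double(double d) { return (fixed16)(d * FIX_ONE); }
    static double  fix_to_double(fixed16 f)  { return (double)f / FIX_ONE; }

    static fixed16 fix_mul(fixed16 a, fixed16 b)
    {
        return (fixed16)(((int64_t)a * b) >> 16);  /* widen to 64 bits to avoid overflow */
    }

    int main(void)
    {
        fixed16 x = fix_from_double(1.5);
        fixed16 y = fix_from_double(2.25);
        printf("%f\n", fix_to_double(fix_mul(x, y)));  /* 3.375, bit-identical on every platform */
        return 0;
    }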

As a quick test, move the binary from the 32-bit system to the 64-bit system and run it. Then rebuild the app on the 64-bit system as a 32-bit binary, and run that. That should help identify which change(s) are actually producing the divergent behavior.

semiuseless
A: 

As already mentioned, being different should not be a problem, as long as they are both correct. Ideally, you should have unit tests for this kind of thing (pure computation usually falls into the relatively easy-to-test camp).

It is basically impossible to guarantee the same results across CPUs and toolchains (one compiler flag can already change a lot), and it is very hard even to be consistent. Designing robust floating-point code is a hard task, but fortunately, in many cases, precision is not an issue.
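
As a sketch of what such a test might look like (compute_energy here is a hypothetical stand-in for your own pure computation):

    #include <assert.h>
    #include <math.h>

    /* Hypothetical pure computation under test. */
    static float compute_energy(const float *v, int n)
    {
        float sum = 0.0f;
        int i;
        for (i = 0; i < n; ++i)
            sum += v[i] * v[i];
        return sum;
    }

    int main(void)
    {
        float v[] = { 1.0f, 2.0f, 3.0f };
        float expected = 14.0f;

        /* Compare against the known value with a relative tolerance rather than ==,
         * so the test passes on both the 32-bit and 64-bit builds. */
        assert(fabsf(compute_energy(v, 3) - expected) <= 1e-5f * expected);
        return 0;
    }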

David Cournapeau
A: 

Have you considered a change to use fixed point math for this application?

How can I do this???

erick2red