I have an application that was developed for Linux x86 (32-bit). It does a lot of floating-point operations, and a lot of tests depend on the results. Now we are porting it to x86_64, but the test results come out different on that architecture. We don't want to maintain a separate set of expected results for each architecture.
According to this article, the problem is that gcc on x86_64 defaults to -mfpmath=sse, while on x86 it defaults to -mfpmath=387. The 387 FPU uses 80-bit internal precision for all operations and only converts the result to the target floating-point type (float, double or long double), whereas SSE uses the type of the operands to determine the internal precision.
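Here is a minimal sketch (not part of my actual code; file name and operand values are just illustrative) of the kind of expression where the two modes can disagree in the last bits. With -mfpmath=387 the product is held in an 80-bit x87 register before the addition; with -mfpmath=sse every intermediate is rounded to a 64-bit double. Whether the printed values actually differ depends on the operands, the compiler version and the optimization level:

```c
/* double_rounding.c -- illustrative sketch, not from the real application.
 * Compile the same file two ways and compare the output:
 *
 *   gcc -m32 -mfpmath=387 double_rounding.c -o r387 && ./r387
 *   gcc -m64 -mfpmath=sse double_rounding.c -o rsse && ./rsse
 */
#include <stdio.h>
#include <float.h>

int main(void)
{
    /* 2 means "evaluate in long double" (x87), 0 means "evaluate in the
     * operand type" (SSE) */
    printf("FLT_EVAL_METHOD = %d\n", FLT_EVAL_METHOD);

    volatile double a = 1.0000000000000002;  /* 1 + 2^-52, arbitrary choice */
    volatile double b = 0.1;
    volatile double c = -0.1;

    /* on x87 the intermediate a*b may carry extra precision into the add */
    double r = a * b + c;
    printf("r = %.20g\n", r);
    return 0;
}
```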
I can force -mfpmath=387 when compiling my own code and all my operations work correctly, but whenever I call a library function (sin, cos, atan2, etc.) the results are wrong again. I assume it's because libm was compiled without the fpmath override. A small probe program like the one below shows the kind of comparison I mean.
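This is a minimal sketch (file name and test values are mine, not from the application) of how I compare the libm results between the two builds; the point is that -mfpmath=387 only affects my calling code, not the sin/cos/atan2 implementations inside the system libm:

```c
/* libm_probe.c -- illustrative sketch.  Build on both architectures
 * (or with -m32 / -m64 on one machine) and diff the output:
 *
 *   gcc -m32 -mfpmath=387 libm_probe.c -o p32 -lm && ./p32 > out32
 *   gcc -m64              libm_probe.c -o p64 -lm && ./p64 > out64
 *   diff out32 out64
 */
#include <stdio.h>
#include <math.h>

int main(void)
{
    double x = 0.5;  /* arbitrary test input */

    /* %a prints the exact bit pattern, so even 1-ulp differences show up */
    printf("sin   %a\n", sin(x));
    printf("cos   %a\n", cos(x));
    printf("atan2 %a\n", atan2(x, 3.0));
    return 0;
}
```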
I tried to build libm (glibc) myself with the 387 math, but it caused crashes all over the place (I don't know whether I did something wrong).
Is there any way to force all code in a process to use 387 math on x86_64? Or is there some library that returns the same values as libm on both architectures? Any suggestions?
Regarding the question "Do you need the 80-bit precision?": this is not about any individual operation. For a single operation the difference is really small and makes no difference. When compounding a lot of operations, though, the error propagates, and the difference in the final result is not so small any more and does make a difference. So I guess I do need the 80-bit precision.
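As a rough illustration of what I mean by compounding (the file name, the number of iterations and the summed series are just an example I made up, not my real workload): accumulating the same terms in 64-bit double and in 80-bit long double (the x87 extended format) shows a gap that grows with the number of operations.

```c
/* accumulate.c -- illustrative sketch of error propagation, not the
 * real application code.  The exact size of the gap depends on the data. */
#include <stdio.h>

int main(void)
{
    double      sum64 = 0.0;
    long double sum80 = 0.0L;

    for (int i = 1; i <= 10000000; i++) {
        double term = 1.0 / i;   /* not exactly representable in binary */
        sum64 += term;           /* each add rounds to 64-bit double    */
        sum80 += term;           /* each add rounds to 80-bit extended  */
    }

    printf("double      : %.20g\n",  sum64);
    printf("long double : %.20Lg\n", sum80);
    printf("difference  : %.20Lg\n", sum80 - (long double)sum64);
    return 0;
}
```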