I have a software project in which I sometimes get strange results from small, simple floating point operations. I assume there is something I have missed, and would like some tips about how to debug the following problems:

(the compiler used is MS VC 6.0, that is version 12 of the Microsoft C compiler)

First anomaly:

extern double Time, TimeStamp, TimeStep;  // History terms, updated elsewhere
void timer_evaluation_function( ) {
    if ( ( Time - TimeStamp ) >= TimeStep ) {  
        TimeStamp += TimeStep;  
        timer_controlled_code( );  
    }
{....}

For some reason, the timer evaluation failed and the timed code never executed. In the debugger, it was easy to see that the trigger condition was in fact true, but the FPU refused to produce a positive result. The code segment that followed had no problems, although it performed the same operations. I sidestepped the problem by inserting a bogus evaluation that could be allowed to fail.

I'm guessing the FPU state is somehow tainted by earlier operations. Are there compiler flags that would help?

Second anomaly:

double K, Kp = 1.0, Ti = 0.02;
void timed_code( ){
    K = ( Kp * ( float ) 2000 ) / ( ( float ) 2000 - 2.0F * Ti * 1e6 );
{....}

The result is #IND, even though the debugger evaluates the equation to approx 0.05. The #IND value appears on the FPU stack when the 2.0F value is loaded into the FPU using the fld instruction. The previous instruction loads the integer value 2000 as a double float using the fild instruction. Once the FPU stack contains the #IND value all is lost, but once again the debugger has no problem evaluating the formula. Later on, these operations return the expected results.
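For reference, with the values shown the denominator is 2000 - 2 * 0.02 * 1e6 = -38000, so the expression should come out at about -0.0526 (0.05 in magnitude). A minimal sketch of a check made by the program itself rather than by the debugger, assuming a compiler with C99 `isnan` (VC6 has `_isnan` in `<float.h>` instead); `compute_k` and `compute_k_checked` are hypothetical names, not part of the original project:

```c
#include <math.h>   /* isnan, C99 */
#include <stdio.h>

/* Same expression as in the question, with Kp and Ti passed in. */
double compute_k(double kp, double ti)
{
    return (kp * (float)2000) / ((float)2000 - 2.0F * ti * 1e6);
}

/* Hypothetical wrapper: evaluates K and reports a NaN result on stderr.
 * Returns 1 if the result is usable, 0 if it is NaN. */
int compute_k_checked(double kp, double ti, double *k_out)
{
    *k_out = compute_k(kp, ti);
    if (isnan(*k_out)) {
        fprintf(stderr, "K is NaN for Kp=%g Ti=%g\n", kp, ti);
        return 0;
    }
    return 1;
}
```

If the wrapper reports NaN while the operands in memory look sane, that would point at the FPU state rather than at the data.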

Also, once again the FPU problems occur directly after the function call. Should I insert floating point operations that clear the FPU state after each new function? Is there a compiler flag that could affect the FPU in some way?

I'm grateful for any and all tips and tricks at this point.

EDIT: I've managed to avoid the problem by executing the assembly instruction EMMS first thing in the top function. That way the FPU is cleared of any MMX-related garbage that may or may not have been created in the environment my code is called from. It seems that the state of the FPU is not something to take for granted.
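A rough sketch of that workaround, assuming a compiler that ships `<mmintrin.h>` with the `_mm_empty()` intrinsic (which emits EMMS); on VC6 itself an inline `__asm emms` would play the same role. The macro `CLEAR_MMX_STATE` and the function `top_level_ratio` are hypothetical names used only for illustration:

```c
#if defined(__MMX__) || defined(_M_IX86)
#include <mmintrin.h>                /* _mm_empty() emits EMMS */
#define CLEAR_MMX_STATE() _mm_empty()
#else
#define CLEAR_MMX_STATE() ((void)0)  /* no MMX on this target: nothing to clear */
#endif

/* Hypothetical top-level entry point called from an environment that
 * may have left the x87 registers in MMX mode. */
double top_level_ratio(double numerator, double denominator)
{
    CLEAR_MMX_STATE();  /* mark all x87 registers empty before any FP work */
    return numerator / denominator;
}
```

EMMS only resets the x87 tag word, so it is cheap enough to run once at each entry point that can be reached from foreign code.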

//Frank

A: 

While I am not providing you with an exact solution, I suggest you start by reading this article that describes the different optimizations that one can use.

David Segonds
+2  A: 

No idea what the problem could be, but on x86, the FINIT instruction reinitializes the FPU. To test your theory, you can insert this somewhere in your code:

__asm {
    finit
}
avalys
A: 

re: timestamps--

What are you getting your source of timestamps from? Something sounds suspicious. Try logging them to a file.
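One possible shape for such logging, as a sketch only: it assumes C99 `isnan` (VC6 has `_isnan` in `<float.h>` instead), and `timer_due_logged` is a hypothetical helper, not a function from the question:

```c
#include <math.h>   /* isnan, C99 */
#include <stdio.h>

/* Hypothetical helper: writes one line per evaluation to `log` and
 * returns 1 when the timer should fire, 0 otherwise. */
int timer_due_logged(FILE *log, double now, double stamp, double step)
{
    int has_nan = isnan(now) || isnan(stamp) || isnan(step);
    int due = !has_nan && (now - stamp) >= step;

    /* %.17g prints enough digits to reproduce each double exactly. */
    fprintf(log, "Time=%.17g TimeStamp=%.17g TimeStep=%.17g nan=%d due=%d\n",
            now, stamp, step, has_nan, due);
    return due;
}
```

Passing a `FILE *` opened with `fopen` keeps a record that can be compared afterwards against what the debugger shows.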

Jason S
That's the best advice here so far. Your odds of having uncovered a compiler bug are smaller than the odds of feeding bad input into your function. I wouldn't necessarily trust the debugger, especially if you're debugging optimized code. My first instinct would be to check all my inputs.
Ori Pessach
Obviously the most obvious inputs have been checked. As I stated, the debugger has no problems evaluating the equations. You are however correct in suspecting the inputs; in this case, the FPU state. The compiler was never a suspect.
Frank Johansson
+1  A: 

It's not really an answer to your question, but you might want to look at two of Raymond Chen's articles regarding strange FPU behaviour. Having read your question and re-read the articles, I don't immediately see a link. Still, if the code you've pasted isn't complete, the articles may give you an idea about some surrounding behaviour that caused the issue, specifically if you're loading a DLL anywhere nearby.

Uninitialized floating point variables can be deadly

How did the invalid floating point operand exception get raised when I disabled it?

Jon Bright
A: 

If the bad value is loaded by a fld that should load 2.0, I'd check the memory where this value is loaded from - it might just be a compiler/linker problem.

jpalecek
+1  A: 

If you're using the Windows QueryPerformanceCounter and QueryPerformanceFrequency functions on a system that supports MMX, try inserting the femms instruction after querying the frequency/counter and before the computation. (femms is the AMD 3DNow! variant; emms is the general MMX instruction.)

__asm femms

I've encountered trouble with these functions before, where they were doing 64-bit computation using MMX and not clearing the floating point flags/state.

This situation could also happen if there is any 64-bit arithmetic between the floating point operations.

Daemin