views: 221 · answers: 5

I am having some trouble with IEEE floating point rules preventing compiler optimizations that seem obvious. For example,

char foo(float x) {
    if (x == x)
        return 1;
    else
        return 0;
}

cannot be optimized to just return 1 because NaN == NaN is false. Okay, fine, I guess.

However, I want to write such that the optimizer can actually fix stuff up for me. Are there mathematical identities that hold for all floats? For example, I would be willing to write !(x - x) if it meant the compiler could assume that it held all the time (though that also isn't the case).
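To make the problem concrete, here is a minimal sketch of why `!(x - x)` fails as a universal identity: `x - x` is 0 for every finite `x`, but not for infinities or NaN (the helper name is invented for illustration):

```c
#include <math.h>

/* x - x is 0.0f for all finite x, but Inf - Inf and NaN - NaN are
   both NaN, and NaN == 0.0f is false -- so the compiler may not
   assume !(x - x) holds for all inputs. */
static int self_sub_is_zero(float x) {
    return (x - x) == 0.0f;
}
```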

I see some reference to such identities on the web, for example here, but I haven't found any organized information, including in a light scan of the IEEE 754 standard.

It'd also be fine if I could get the optimizer to assume isnormal(x) without generating additional code (in gcc or clang).

Clearly I'm not actually going to write (x == x) in my source code, but I have a function that's designed for inlining. The function may be declared as foo(float x, float y), but often x is 0, or y is 0, or x and y are both z, etc. The floats represent onscreen geometric coordinates. These are all cases where if I were coding by hand without use of the function I'd never distinguish between 0 and (x - x), I'd just hand-optimize stupid stuff away. So, I really don't care about the IEEE rules in what the compiler does after inlining my function, and I'd just as soon have the compiler ignore them. Rounding differences are also not very important since we're basically doing onscreen drawing.
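A hypothetical sketch of the situation (the function name and body are invented; the real helper isn't shown): with `x == 0` the multiply "should" fold away, but under strict IEEE semantics the compiler may not assume so, because `0.0f * scale` is NaN when `scale` is infinite.

```c
/* Hypothetical stand-in for the generic inline helper described.
   After inlining transform(0, y, s), one would like the compiler
   to reduce the body to just y, but IEEE rules block the fold
   since 0.0f * INFINITY is NaN. */
static inline float transform(float x, float y, float scale) {
    return x * scale + y;
}
```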

I don't think -ffast-math is an option for me, because the function appears in a header file, and it is not appropriate that the .c files that use the function compile with -ffast-math.

+2  A: 

Another reference that might be of use to you is a really nice article on floating-point optimization in Game Programming Gems volume 2, by Yossarian King. You can read the article here. It discusses the IEEE format in considerable detail, taking implementations and architecture into account, and provides many optimization tricks.

Kornel Kisielewicz
Thanks, I'll take a look at that. I'm still interested in this identities question, though.
Ken
A: 

What happened when you tried it the obvious way and profiled it? or examined the generated asm?

If the function is inlined with values known at the call site, the optimizer has this information available. For example: foo(0, y).

You may be surprised at the work you don't have to do, but at the very least profiling or looking at what the compiler actually does with the code will give you more information and help you figure out where to proceed next.

That said, if you know certain things that the optimizer can't figure out itself, you can write multiple versions of the function, and specify the one you want to call. This is something of a hassle, but at least with inline functions they will all be specified together in one header. It's also quite a bit easier than the next step, which is using inline asm to do exactly what you want.
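The multiple-versions suggestion could be sketched like this (the functions and their bodies are hypothetical, standing in for whatever the real helper does):

```c
/* Generic version, plus a hand-specialized variant for a case the
   optimizer can't prove on its own. Both live together in one header. */
static inline float blend(float x, float y, float t) {
    return x * (1.0f - t) + y * t;
}

/* The x == 0 case, simplified by hand; the caller picks this one
   when it knows x is zero. */
static inline float blend_from_zero(float y, float t) {
    return y * t;
}
```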

Roger Pate
I'm looking at the generated code. In a simplified example I'm looking at, with a single call of the function, the version I'd write by hand is 16 instructions, compared to 56 for the inlined function. I would like to use the function in hot code, and it will typically be invoked multiple times. The point of writing the generic function and having it inlined is that it makes the code easier to read and maintain.
Ken
+2  A: 

Hi

I think you are always going to struggle to make computer floating-point arithmetic behave like mathematical real-number arithmetic, and I suggest you don't try. I'd argue you are making a type error in testing two fp numbers for exact equality. Since fp numbers are, in the overwhelming majority, approximations, you should accept this and use approximate equality as your test.

Computer integers exist for equality testing of numerical values.

Well, that's what I think; you go ahead and fight the machine (well, all the machines, actually) if you wish.

Now, to answer some parts of your question:

-- for every mathematical identity you are familiar with from real-number arithmetic, there are counter examples in the domain of floating-point numbers, whether IEEE or otherwise;

-- 'clever' programming almost always makes it more difficult for a compiler to optimise code than straightforward programming;

-- it seems that you are doing some graphics programming: in the end, the coordinates of points in your conceptual space are going to be mapped to pixels on a screen; pixels always have integer coordinates; your translation from conceptual space to screen space defines your approximate-equality function.

Regards

Mark

High Performance Mark
I don't think you're following this. I'm talking about cases where everything is really and truly equal once you take out cases like NaN and Inf. This is a form of generic programming. I would like to replace ad hoc transform code with a call to an overly general but readable function, and let the compiler optimize it back to the same code as if I had written a specific version. This is more akin to how the compiler is permitted to optimize assuming signed integer arithmetic doesn't overflow. Basically, you're giving me the beginner floating point speech, and it doesn't apply.
Ken
From my point of view, the issue is that IEEE floating point is amazingly restrictive to the compiler. Really not the tradeoff C usually makes. It's preserving "detail" in the source code that really and truly is not there, because if I hand-instantiated my generic template I would simply remove things like (x - x)*complicatedExpression. I can buy that people doing numerics need it, and that it's a good default, but if there's no way for even a savvy floating point user to communicate that such transformations are legal, then that's a shame. Fortran allowed this.
Ken
+1  A: 

If you can assume that floating-point numbers used in this module will not be Inf/NaN, you can compile it with -ffinite-math-only (in GCC). This may "improve" the codegen for examples like the one you posted.

Stephen Canon
Hey Stephen! No, I probably cannot do that either. My best lead right now is using clang and then putting in things like 'if (isnan(f)) __builtin_unreachable();'. That doesn't work either, but it seems like it __should__, so I can file bugs for it.
Ken
It seems like what you really want here is an assert-as-optimizer-hint, or a `#pragma finite_math`, which might be possible.
Stephen Canon
I think that's what __builtin_unreachable is for. It informs the optimizer that some code path cannot happen.
Ken
Yeah, that seems like a reasonable (if not especially beautiful) approach. Note that `isnan` is a macro (defined in `math.h`), so I'm not surprised that the optimizer isn't able to reason about it with `__builtin_unreachable`. I would try `if (x != x) __builtin_unreachable();`
Stephen Canon
No luck on that version either. rdar://problem/7502262
Ken
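For reference, the hint idiom under discussion can be sketched like this (a GCC/Clang extension; whether the optimizer actually exploits it downstream is exactly what the bug report above is about):

```c
/* __builtin_unreachable() tells the optimizer this branch can never
   be taken, so after the call it may assume x is not NaN. */
static inline float assume_not_nan(float x) {
    if (x != x)
        __builtin_unreachable();
    return x;
}
```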
+1  A: 

You could compare for bitwise equality. Although you might get bitten for some values that are equivalent but bitwise different, it will catch all those cases where you have a true equality as you mentioned. And I am not sure the compiler will recognize what you do and remove it when inlining (which I believe is what you are after), but that can easily be checked.
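A bitwise comparison along these lines could be sketched with `memcpy` to avoid aliasing issues (note the caveat mentioned above: `+0.0f` and `-0.0f` are equivalent values with different bit patterns, so they compare unequal here):

```c
#include <string.h>
#include <stdint.h>

/* Compare two floats by bit pattern rather than by value.
   Identical bits always match (even two identical NaNs), but
   +0.0f and -0.0f differ in the sign bit and will not match. */
static int bits_equal(float a, float b) {
    uint32_t ua, ub;
    memcpy(&ua, &a, sizeof ua);
    memcpy(&ub, &b, sizeof ub);
    return ua == ub;
}
```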

Juice
Good point, though that would leave the comparisons and branches in for the cases where the compiler could not prove equality. That's a little different from just simplifying the math. I'm reasonably happy with what I'm doing now: asserting isfinite(f) at the top of the function to be inlined, then filing compiler bugs saying the compiler should now be able to optimize away the math.
Ken