I am using Visual Studio 6 with some old code written in C. I found an issue where the code looks like this:

int x = 3;
float y = 3.0;

if (x == y) {
   /* do some crazy stuff */
}

Is this a valid comparison? Is it possible that at run time the float is represented as 3.0000001 and the comparison would fail?

I am a rookie so take it easy on me :)

+10  A: 

This is generally (i.e., always) a bad idea. As you suspected, a comparison of 3 to 3.0000001 will indeed fail.

What most people do, if an int-float comparison is really necessary, is pick some threshold of tolerance and go with that, like so:

int x = 3;
float y = 3.0;

// some code here

float difference = (float) x - y;
float tolerableDifference = 0.001;

if ((-tolerableDifference <= difference) && (difference <= tolerableDifference)) {
    // more code
}
Dan Tao
I'd use fabs for shorter but probably slower code: if (fabs(x - y) < tolerableDifference) { /* do stuff */ }
bh213
or: if (fabs(difference) < tolerableDifference) {
Nosredna
Yeah, fabs is fine. But jshollax says he's a rookie; fabs is really more for advanced C++ users.
Dan Tao
That was a joke, by the way.
Dan Tao
Yes. Save fabs for next year, maybe. :-)
Nosredna
Actually, if the difference is "tolerable," the comparison should be <=.
Nosredna
Good point -- edited.
Dan Tao
+1  A: 

The crux of the problem is that floating point numbers that have a finite representation in base 10 (decimal) don't always have a finite representation in base 2 (binary).

IRBMe
But 3.0 does, right?
Nosredna
A: 

Edit:

The right way is to use the epsilon method:

#include <math.h>
int x = 3;
float y = 3.0;
if (fabs((float) x - y) < 0.0001) { // Adjust the epsilon
  // Do stuff
}
Sinan Taifour
This will still fail if y == 3.0000001 because then you will be comparing 3.0 to 3.0000001.
Dan Tao
that won't help. Floating point error will make that check problematic. == is generally a bad idea for floats.
Herms
+1 for your revision.
Dan Tao
Casting the int to a float explicitly will do absolutely nothing. The int will be promoted to a float for purposes of comparison anyway.
David Thornley
A: 

If the code looks literally like what you posted (with no intervening computations), then it comes to a question of whether 3.0 and (float)3 (since the integer is automatically converted to a float) are the same. I think they are guaranteed to be the same in this case, because 3 is exactly representable as a float.

Aside: even if the integer is not exactly representable as a float (i.e., if it is really big), I would imagine that in most implementations x.0 and (float)x would be the same, because how would the compiler generate x.0 in the first place, if not by doing something just like (float)x? However, I guess this is not guaranteed by the standard.

newacct
The size of the mantissa of a float may be different from the number of bits used to represent the integer. So, yeah, large numbers could fail. I agree that he probably won't have a problem with small integers. It's just dangerous to get in the habit of comparing floats and ints. Especially once some arithmetic has happened. Is 3.1f - 0.1f represented the same as 3.0f?
Nosredna
A: 

That's scary. (I wonder what else you'll find.)

x will be promoted to float, but that's not going to help you. Because of how floats are represented, using == to compare them is unreliable.

I might suggest something like this (checking for absolute error/difference) instead:

#define EPSILON 0.0001 
if (fabs((float)x - y) < EPSILON) { /* Do stuff. */ }

which is a common approach and may be sufficient for your purposes, if your values of x and y are "nice". If you really want to go in depth into the topic of comparing floats, this article probably has more information than you want. It does say about the epsilon method:

If the range of the expectedResult is known then checking for absolute error is simple and effective. Just make sure that your absolute error value is larger than the minimum representable difference for the range and type of float you’re dealing with.

JeffH
+1  A: 

No, there is no problem in your use case, because small integers map exactly to floats (there is no truncation problem, as there is for example with 0.3; 3 is exactly 1.1 × 2^1 in binary scientific notation, i.e., binary 11).

In the worst case scenario I can think of, there can be integer numbers that cannot be represented in float because there are "gaps" larger than 1 between consecutive float values, but even then, when the integer is converted to float for the comparison, it will be rounded to a nearby representable float, in the same way the float literal was.

So as long as your floats come from non-decimal-fraction literals, the comparison with the equivalent integer will hold, because the integer will be converted to the very same float before the comparison is done.

fortran
From the Standard, 4.9, if an integer can't be represented exactly, the result will be the next lower or next higher representable value, and that will be implementation-defined. The rule for floating-point literals is the same, although differently phrased (2.13.3). I haven't found anything that says they have to be the same rule, so while I'd think the float literals will come out the same as the float conversion, it doesn't look like the standard requires it.
David Thornley
very interesting point!
fortran
+2  A: 

I am going to buck the trend here a bit. As to the first question about whether the comparison is valid, the answer is yes. It is perfectly valid. If you want to know if a floating point value is exactly equal to 3, then the comparison to an integer is fine. The integer is implicitly converted to a floating point value for the comparison. In fact, the following code (at least with the compiler I used) produced identical assembly instructions.

if ( 3 == f )
    printf( "equal\n" );

and

if ( 3.0 == f )
    printf( "equal\n" );

So it depends on the logic and what the intended goal is. There is nothing inherently wrong with the syntax.

Mark Wilkins
A: 

Well, I guess you won't be too surprised to hear that comparing floats for equality is a rookie mistake, then.

The problem is that many increments smaller than 1 can't actually be represented exactly in IEEE floating point. So if you arrive at the float by trying to "index" it up to the value of 3.0 (say, in increments of 0.1), it is quite possible your equality comparison will never be true.

It is also a bad idea just from a type-strength standpoint. You should either convert the float into an int, check for the float being "close enough" (e.g., < 3.1 and > 2.9 or some such), or better yet, if you are trying to make that float do double duty as something like a counter, avoid the whole idea.

T.E.D.
Woah! Most integer values? I assume you mean those above 9 quadrillion. Reasonably-sized integers (like 3) should all slide quite nicely and accurately into an IEEE float. The problem starts _to the right_ of the decimal point. 0.1 doesn't have a finite representation in binary.
Nosredna
Thanks for your response. This is >15-year-old code and I am trying to find the root cause of why a function gets called within an if statement like the one I posted. I noticed this scenario where the code compares an int type and a float type and thought something was not right... We had an internal argument, me and a buddy, over whether this could fail. He owes me lunch.
On a 64-bit CPU, most fixint values are above 9 quadrillion. :-)
Ken
@Nosredna: Quite right. It had been so long since I did this, I remembered roughly what the issue was, but not exactly. Fixed it, I hope.
T.E.D.
Any 32-bit integer can be exactly represented in an IEEE floating-point type. This is obviously not true for float in particular, but integer 3 can be exactly represented in every floating-point representation I've ever seen.
David Thornley
+1  A: 

For your specific example, "do some crazy stuff" will execute. 3.0 will not be 3.0000001 at run-time.

The other answers are more for general cases, but even a hardcoded epsilon is not the greatest idea in the world. A dynamic epsilon based on the actual numbers involved is much better, since the larger the numbers' magnitudes (positive or negative), the less relevant a hardcoded epsilon becomes.

Jim Buck
+4  A: 

No one else has cited it yet, and I haven't linked to it in a while, so here is the classic paper on the scary edges of floating point representation and arithmetic: What Every Computer Scientist Should Know About Floating-Point Arithmetic.

The paper is a challenging read for a non-mathematician, but the key points are well stated in between the heavy swaths of math backing them up.

For this discussion, the points made by the other answers here are all valid. Floating point arithmetic is inexact, and hence comparisons for exact equality are generally a bad idea. Hence, epsilon is your friend.

One exception to the exact comparison rule is a test for exactly zero. It is perfectly legal and often sensible to test for exactly zero before a division or logarithm since the answer is well defined for any non-zero value. Of course, in the presence of IEEE rules and NaN, you can let that slide and test for NaN or Inf later on.

RBerteig
+1  A: 

You may be interested in the Game Developers Conference lecture Numerical Robustness for Geometric Calculations (aka EPSILON is NOT 0.00001!). It details choosing good threshold / epsilon values for a variety of tasks.

(+1 on the mention of "What Every Computer Scientist Should Know About Floating Point" in another answer, too.)

leander