Floating point values are inexact, which is why we should rarely use strict numerical equality in comparisons. For example, in Java this prints false (as seen on ideone.com):

System.out.println(.1 + .2 == .3);
// false

Usually the correct way to compare results of floating point calculations is to check whether the absolute difference from some expected value is less than some tolerated epsilon.

System.out.println(Math.abs(.1 + .2 - .3) < .00000000000001);
// true

The question is whether some operations can yield an exact result. We know that for any non-finite floating point value x (i.e. either NaN or an infinity), x - x is ALWAYS NaN.

But if x is finite, is either of the following guaranteed?

  1. x * -1 == -x
  2. x - x == 0

(In particular I'm most interested in Java behavior, but discussion of other languages is also welcome.)
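
As a quick empirical check, here is a small Java sketch (not from the original post) that exercises the non-finite case and both identities:

// Non-finite case: x - x is NaN when x is an infinity (or NaN).
double inf = Double.POSITIVE_INFINITY;
System.out.println(Double.isNaN(inf - inf));
// true

// Finite cases: both identities from the question hold.
double[] samples = {0.1, -3.5, 1e-300, Double.MIN_VALUE, Double.MAX_VALUE};
for (double x : samples) {
    System.out.println((x * -1 == -x) + " " + (x - x == 0));
}
// prints "true true" five times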


For what it's worth, I think (and I may be wrong here) the answer is YES! I think it boils down to whether or not, for any finite IEEE-754 floating point value, its additive inverse is always computable exactly. Since e.g. float and double have one dedicated bit just for the sign, this seems to be the case: finding the additive inverse only requires flipping the sign bit (i.e. the significand is left intact).
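
That intuition can be checked directly in Java with Double.doubleToRawLongBits (a minimal illustrative sketch; the 1L << 63 mask is the standard IEEE-754 double sign-bit position):

double x = 0.1;
long bits = Double.doubleToRawLongBits(x);
long negBits = Double.doubleToRawLongBits(-x);
// The two bit patterns differ in exactly the sign bit (bit 63).
System.out.println(Long.toHexString(bits ^ negBits));
// 8000000000000000
// Exponent and significand are untouched, so negation is exact.
System.out.println((bits & ~(1L << 63)) == (negBits & ~(1L << 63)));
// true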


+2  A: 

Both equalities are guaranteed with IEEE 754 floating-point, because the results of both x-x and x * -1 are representable exactly as floating-point numbers of the same precision as x. In this case, regardless of the rounding mode, the exact values have to be returned by a compliant implementation.

EDIT: contrast this with the .1 + .2 example.

You can't actually add .1 and .2 in IEEE 754, because you can't represent them to pass to +. Addition, subtraction, multiplication, division and square root return the unique floating-point value which, depending on the rounding mode, is immediately below, immediately above, or nearest (with a rule to handle ties) to the exact result of the operation on the same arguments in R. Consequently, when the result (in R) happens to be representable as a floating-point number, this number is automatically the result, regardless of the rounding mode.

The fact that your compiler lets you write 0.1 as shorthand for a different, representable number without a warning is orthogonal to the definition of these operations. When you write - (0.1) for instance, the - is exact: it returns exactly the opposite of its argument. On the other hand, its argument is not 0.1, but the double that your compiler uses in its place.
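
One way to make that substitution visible is java.math.BigDecimal, whose double constructor preserves the argument's exact value (a small sketch):

import java.math.BigDecimal;

// The double written as 0.1 is not the real number 0.1:
System.out.println(new BigDecimal(0.1));
// 0.1000000000000000055511151231257827021181583404541015625

// .1 + .2 and .3 round to two different doubles:
System.out.println(new BigDecimal(.1 + .2));
System.out.println(new BigDecimal(.3));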

In short, another part of the reason why the operation x * (-1) is exact is that -1 can be represented as a double.

Pascal Cuoq
Part of the reason is that subtraction and multiplication, along with addition, division and square root, are among the basic operations for which "optimal" results are mandated by IEEE 754.
Pascal Cuoq
@Pascal: The definition of "optimal" needs clarification, then, because you could argue that perhaps it's "optimal" for `.1 + .2 == .3`, but I think this is `false` in all IEEE-754 `double` implementations (unless I'm missing some other issue).
polygenelubricants
+2  A: 

Although x - x may give you -0 rather than a true 0, -0 compares equal to 0, so you will be safe with your assumption that any finite number minus itself will compare equal to zero.
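
A short illustration of the signed-zero behavior (a minimal sketch):

System.out.println(0.0 == -0.0);
// true: == treats the two zeros as equal
System.out.println(Double.compare(0.0, -0.0));
// 1: Double.compare does distinguish them (-0.0 sorts below 0.0)

// So even if x - x produced -0.0, the == 0 test would still pass.
double x = 0.5;
System.out.println(x - x == 0);
// true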

See http://stackoverflow.com/questions/2686644/is-there-a-floating-point-value-of-x-for-which-x-x-0-is-false for more details.

Gabe