Why do comparisons of NaN values behave differently from all other values? That is, all comparisons with the operators ==, <=, >=, <, > where one or both values are NaN return false, contrary to the behaviour of all other values.

I suppose this simplifies numerical computations in some way, but I couldn't find an explicitly stated reason, not even in the Lecture Notes on the Status of IEEE 754 by Kahan which discusses other design decisions in detail.

This deviant behavior is causing trouble when doing simple data processing. For example, when sorting a list of records w.r.t. some real-valued field in a C program I need to write extra code to handle NaN as the maximal element, otherwise the sort algorithm could become confused.
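To make the "extra code" concrete, here is a minimal sketch of a qsort comparator that places NaN after every number. The struct record type, its field names, and compare_records are hypothetical and only illustrate the idea; they are not taken from any real program.

```c
#include <math.h>
#include <stdlib.h>

/* Hypothetical record type; the field names are illustrative only. */
struct record {
    int id;
    double value;
};

/* qsort comparator: orders by value and places every NaN after all
   numbers, so the ordering stays total and the sort stays consistent. */
static int compare_records(const void *a, const void *b)
{
    double x = ((const struct record *)a)->value;
    double y = ((const struct record *)b)->value;

    int x_nan = isnan(x) != 0;
    int y_nan = isnan(y) != 0;
    if (x_nan || y_nan)
        return x_nan - y_nan;       /* NaN sorts last; two NaNs tie */

    return (x > y) - (x < y);       /* ordinary numeric ordering */
}
```

It would be called as qsort(records, n, sizeof records[0], compare_records). Without the isnan branch, the comparator would report NaN as "equal" to everything, which is not a consistent ordering.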

Edit: The answers so far all argue that it is meaningless to compare NaNs.

I agree, but that doesn't mean that the correct answer is false; rather, it would be a Not-a-Boolean (NaB), which fortunately doesn't exist.

So the choice of returning true or false for comparisons is in my view arbitrary, and for general data processing it would be advantageous if it obeyed the usual laws (reflexivity of ==, trichotomy of <, ==, >), lest data structures which rely on these laws become confused.

So I'm asking for some concrete advantage of breaking these laws, not just philosophical reasoning.

Edit 2: I think I understand now why making NaN maximal would be a bad idea, it would mess up the computation of upper limits.

NaN != NaN might be desirable to avoid detecting convergence in a loop such as

while (x != oldX) {
    oldX = x;
    x = better_approximation(x);
}

which, however, would be better written by comparing the absolute difference with a small limit. So IMHO this is a relatively weak argument for breaking reflexivity at NaN.
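A rough sketch of the absolute-difference variant follows. The body of better_approximation here is a hypothetical stand-in (one Newton step for sqrt(2)), and the tol and max_iter parameters are assumptions added for illustration. Note that a test such as fabs(x - oldX) <= tol is false when x is NaN, so the NaN case still needs an explicit check.

```c
#include <math.h>

/* Hypothetical stand-in for better_approximation():
   one Newton step towards sqrt(2). */
static double better_approximation(double x)
{
    return 0.5 * (x + 2.0 / x);
}

/* Iterates until successive values differ by at most tol, a NaN
   appears, or max_iter steps have run. Returns the last value. */
static double iterate(double x, double tol, int max_iter)
{
    for (int i = 0; i < max_iter; i++) {
        double oldX = x;
        x = better_approximation(x);
        if (isnan(x))
            break;                  /* NaN: stop, but it is not convergence */
        if (fabs(x - oldX) <= tol)
            break;                  /* converged within tolerance */
    }
    return x;
}
```

The caller can tell the two exits apart by testing the result with isnan.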

A: 

You can't compare apples with philosophy. You can't compare a number with something which is not a number.

mouviciel
+1  A: 

I'm guessing that NaN (Not A Number) means exactly that: This is not a number and thus comparing it does not really make sense.

It's a bit like arithmetic in SQL with null operands: They all result in null.

The comparisons for floating point numbers compare numeric values. Thus, they can't be used for non numeric values. NaN therefore cannot be compared in a numeric sense.

Daren Thomas
"This is not a number and thus comparing it does not really make sense." Strings are not numbers but comparing them makes sense.
Jason
Yes, comparing a string to a string makes sense. But comparing a string to, say, apples, does not make much sense. Since apples and pears are not numbers, does it make sense to compare them? Which is greater?
Daren Thomas
+6  A: 

NaN can be thought of as an undefined state/number, similar to the concept of 0/0 being undefined, or sqrt(-3) (in the real number system, where floating point lives).

NaN is used as a sort of placeholder for this undefined state. Mathematically speaking, undefined is not equal to undefined. Neither can you say an undefined value is greater or less than another undefined value. Therefore all comparisons return false.

This behaviour is also advantageous in cases where you compare sqrt(-3) to sqrt(-2). Both return NaN, but they are not equivalent even though they return the same value. Therefore having equality always return false when dealing with NaN is the desired behaviour.

Chris
+3  A: 

To throw in yet another analogy: if I hand you two boxes, and tell you that neither of them contains an apple, would you tell me that the boxes contain the same thing?

NaN contains no information about what something is, just what it isn't. Therefore these elements can never definitely be said to be equal.

Jack Ryan
All empty sets are equal, by definition.
MSalters
The boxes you are given are NOT known to be empty.
John Smith
+4  A: 

From the Wikipedia article on NaN, the following practices may cause NaNs:

  • All mathematical operations with a NaN as at least one operand
  • The divisions 0/0, ∞/∞, ∞/-∞, -∞/∞, and -∞/-∞
  • The multiplications 0×∞ and 0×-∞
  • The additions ∞ + (-∞), (-∞) + ∞ and equivalent subtractions.
  • Applying a function to arguments outside its domain, including taking the square root of a negative number, taking the logarithm of a negative number, taking the tangent of an odd multiple of 90 degrees (or π/2 radians), or taking the inverse sine or cosine of a number which is less than -1 or greater than +1.

Since there is no way to know which of these operations created the NaN, there is no way to compare them that makes sense.
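As a small illustration, assuming an IEEE 754 platform, the purely arithmetic cases from the list can be reproduced in C. The function names below are made up for this sketch, and the library-function cases (sqrt, log and so on) are left out to keep it dependency-free.

```c
#include <math.h>

/* Counts how many of the four arithmetic cases yield NaN. */
static int count_nan_results(void)
{
    volatile double zero = 0.0;     /* volatile discourages constant folding */
    volatile double inf = INFINITY;
    double results[4];
    int count = 0;

    results[0] = zero / zero;       /* 0/0 */
    results[1] = inf / inf;         /* inf/inf */
    results[2] = zero * inf;        /* 0 * inf */
    results[3] = inf + (-inf);      /* inf + (-inf) */

    for (int i = 0; i < 4; i++)
        if (isnan(results[i]))
            count++;
    return count;
}

/* Compares two NaNs that came from different operations. */
static int nans_compare_equal(void)
{
    volatile double zero = 0.0;
    volatile double inf = INFINITY;
    double a = zero / zero;
    double b = inf - inf;
    return a == b;                  /* false for any pair of NaNs */
}
```

Nothing in the comparison looks at where a NaN came from; == is simply false whenever either operand is NaN.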

Stefan Rusek
Moreover, even if you knew which operation, it wouldn't help. I can construct any number of formulas which go to 0/0 at some point, which have (if we assume continuity) well-defined and different values at that point.
David Thornley
+1  A: 

It only looks peculiar because most programming environments that allow NaNs do not also allow 3-valued logic. If you throw 3-valued logic into the mix, it becomes consistent:

  • (2.7 == 2.7) = true
  • (2.7 == 2.6) = false
  • (2.7 == NaN) = unknown
  • (NaN == NaN) = unknown

Even .NET does not provide a bool? operator==(double v1, double v2) operator, so you are still stuck with the silly (NaN == NaN) = false result.
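For illustration only, the same three-valued equality could be sketched in C. The tribool enum and tri_equal are hypothetical names invented here, not part of any standard library.

```c
#include <math.h>

/* Hypothetical three-valued result type. */
enum tribool { TRI_FALSE, TRI_TRUE, TRI_UNKNOWN };

/* Equality that reports "unknown" instead of false when either
   operand is NaN. */
static enum tribool tri_equal(double a, double b)
{
    if (isnan(a) || isnan(b))
        return TRI_UNKNOWN;
    return a == b ? TRI_TRUE : TRI_FALSE;
}
```

With this helper, tri_equal(2.7, NAN) and tri_equal(NAN, NAN) both give TRI_UNKNOWN, matching the last two rows of the list above.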

Christian Hayter
+1  A: 

I don't know the design rationale, but here's an excerpt from the IEEE 754-1985 standard:

"It shall be possible to compare floating-point numbers in all supported formats, even if the operands' formats differ. Comparisons are exact and never overflow nor underflow. Four mutually exclusive relations are possible: less than, equal, greater than, and unordered. The last case arises when at least one operand is NaN. Every NaN shall compare unordered with everything, including itself."

Rick Regan
+20  A: 

I was a member of the IEEE-754 committee; I'll try to help clarify things a bit.

First off, floating-point numbers are not real numbers, and floating-point arithmetic does not satisfy the axioms of real arithmetic. Trichotomy is not the only property of real arithmetic that does not hold for floats, nor even the most important. For example:

  • Addition is not associative.
  • The distributive law does not hold.
  • There are floating-point numbers without inverses.
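The first item in the list is easy to check. The following is a minimal sketch, assuming IEEE 754 double arithmetic; the helper name addition_is_associative is invented for this example.

```c
/* Returns 1 if (a + b) + c equals a + (b + c) for these operands. */
static int addition_is_associative(double a, double b, double c)
{
    volatile double left = (a + b) + c;     /* volatile forces rounding to double */
    volatile double right = a + (b + c);
    return left == right;
}
```

For a = 1e100, b = -1e100, c = 1.0 the two groupings differ: the left side is 1.0, while on the right the 1.0 is absorbed into -1e100 before the large terms cancel, leaving 0.0.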

I could go on. It is not possible to specify a fixed-size arithmetic type that satisfies all of the properties of real arithmetic that we know and love. The 754 committee has to decide to bend or break some of them. This is guided by some pretty simple principles:

  1. When we can, we match the behavior of real arithmetic.
  2. When we can't, we try to make the violations as predictable and as easy to diagnose as possible.

Regarding your comment "that doesn't mean that the correct answer is false", this is wrong. The predicate (y < x) asks whether y is less than x. If y is NaN, then it is not less than any floating-point value x, so the answer is necessarily false.

I mentioned that trichotomy does not hold for floating-point values. However, there is a similar property that does hold. Clause 5.11, paragraph 2 of the 754-2008 standard:

Four mutually exclusive relations are possible: less than, equal, greater than, and unordered. The last case arises when at least one operand is NaN. Every NaN shall compare unordered with everything, including itself.
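As a sketch of how the four relations look from C, the function below classifies a pair of doubles using the standard isunordered macro from <math.h>. The relation enum and the classify name are made up for this illustration.

```c
#include <math.h>

/* Hypothetical names for the four relations in the quoted clause. */
enum relation { REL_LESS, REL_EQUAL, REL_GREATER, REL_UNORDERED };

/* Exactly one of the four relations holds for any pair of doubles. */
static enum relation classify(double x, double y)
{
    if (isunordered(x, y))          /* true when x or y (or both) is NaN */
        return REL_UNORDERED;
    if (x < y)
        return REL_LESS;
    if (x > y)
        return REL_GREATER;
    return REL_EQUAL;
}
```

Every pair lands in exactly one case, which is the property that replaces trichotomy once NaN is included.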

As far as writing extra code to handle NaNs goes, it is usually possible (though not always easy) to structure your code in such a way that NaNs fall through properly, but this is not always the case. When it isn't, some extra code may be necessary, but that's a small price to pay for the convenience that algebraic closure brought to floating-point arithmetic.

Stephen Canon
This is an outstandingly insightful post straight from the horse's mouth. Thank you.
Jason
+1  A: 

The over-simplified answer is that a NaN has no numeric value, so there is nothing in it to compare to anything else.

You might consider testing for and replacing your NaNs with +INF if you want them to act like +INF.
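A minimal sketch of that replacement in C, assuming the data lives in a plain array of doubles; replace_nans_with_inf is a hypothetical helper name.

```c
#include <math.h>
#include <stddef.h>

/* Replaces every NaN in values[0..n-1] with +infinity, in place.
   Returns the number of elements replaced. */
static size_t replace_nans_with_inf(double *values, size_t n)
{
    size_t replaced = 0;
    for (size_t i = 0; i < n; i++) {
        if (isnan(values[i])) {
            values[i] = INFINITY;
            replaced++;
        }
    }
    return replaced;
}
```

After this pass, the ordinary <, ==, > operators give a total order on the data, at the cost of no longer distinguishing a former NaN from a genuine +INF.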

Loadmaster