Is it a good idea to use IEEE754 floating point NaN (not-a-number) for values which are undefined for non-mathematical reasons?

In our case they are not yet set because the values have not been received from some other device. The context is an embedded system using IEC1131 REAL32 values. Edit: The programming language is C, so we would most likely use NAN and isnanf(x), which are from C99. Though we may need some extra contortions to get these into our OS compatibility layer.

The default in programming languages seems to be to initialize floating point variables with positive zero, whose internal representation is all zeros. That is not usable for us, because 0 is in the range of valid values.

It seems like a clean solution to use NaN, but maybe it is more hassle than it is worth and we should pick some other value?

+1  A: 

Can you use NULL?

glasnt
Yes, if the numbers were represented by objects. Primitive types can't be null, at least I'm not aware of such environments.
Joonas Pulakka
Null is zero, which is a valid number; they need something invalid.
dwelch
+2  A: 

I have used NaNs in similar situations just because of that: the usual default initialization value 0 is also a valid value. NaNs work fine so far.

It's a good question, by the way, why the default initialization value is usually (for instance, in Java primitive types) 0 and not NaN. Couldn't it as well be 42 or whatever? I wonder what's the rationale of zeros.

Joonas Pulakka
I think the rationale for using 0 is that memory is initialized with zero bytes regardless of the type, for example in the BSS segment of C.
starblue
Yep, probably it's something like that. But now that the language / compiler designers have made the effort to initialize memory, wouldn't it be almost as easy to initialize to any arbitrary value (other than zero)? Zeros are just bits among others :-)
Joonas Pulakka
@mad-j: you want to init all memory with the same bit pattern. So it couldn't be 42, because then you'd usually have to do something different for two adjacent shorts than what you do for an int. This leaves 0 and -1. But 0xffffffff isn't -1 as a float, so you'd have an inconsistency there. There's not much in it, but I think 0 is probably best. Also some hardware can efficiently 0 entire blocks of physical memory at once, for what that's worth.
Steve Jessop
A: 

If your basic need is to have a floating point value which doesn't represent any number which could possibly have been received from the device, and if the device guarantees it will never return NaN, then it seems reasonable to me.

Just remember that depending on your environment, you probably need a special way of detecting NaNs (don't just use if (x == float.NaN) or whatever your equivalent is.)
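Since the question is about C, here is a minimal sketch of what such a check might look like. It assumes a toolchain that provides the C99 NAN and isnan macros from math.h; the function names are made up for illustration.

```c
#include <math.h>      /* NAN and isnan are C99 */
#include <stdbool.h>

/* Hypothetical helper: true if the value has not been received yet,
 * assuming "not yet received" is encoded as NaN. */
bool is_unset(float x)
{
    return isnan(x);
}

/* Possible fallback where isnan is unavailable: NaN is the only value
 * that compares unequal to itself. The volatile copy is only there to
 * discourage an aggressive optimizer from folding the comparison. */
bool is_unset_portable(float x)
{
    volatile float v = x;
    return v != v;
}

/* The tempting version: always false, even when x is NaN. */
bool is_unset_naive(float x)
{
    return x == NAN;
}
```

Note that optimizer settings which assume no NaNs exist (for example fast-math style flags) can break both working variants, so that is worth checking on the target compiler.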

Jon Skeet
Don't believe this answer. All Jon Skeet has to do is think about the value and it will define itself.
Windows programmer
The value is defined before Skeet thinks of a variable name, right?
glasnt
+2  A: 

NaNs are a reasonable choice for a 'no value' sentinel (the D programming language uses them for uninitialized values, for instance), but because any comparisons involving them will be false, you can get a few surprises:

  • if (result == DEFAULT_VALUE), won't work as expected if DEFAULT_VALUE is NaN, as Jon mentioned.

  • They can also cause problems with range checking if you're not careful. Consider the function:

bool isOutsideRange(double x, double minValue, double maxValue)
{
    return x < minValue || x > maxValue;
}

If x is NaN, this function would incorrectly report that x is between minValue and maxValue.
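One way to make the NaN case explicit, sketched in C. The names classifyRange, RangeStatus and isOutsideRangeStrict are invented for this example; the idea is either to report a third state or to negate the "inside" test so that NaN never passes as in range.

```c
#include <math.h>      /* isnan (C99) */
#include <stdbool.h>

/* Three-way result: a NaN is neither inside nor outside. */
typedef enum { RANGE_INSIDE, RANGE_OUTSIDE, RANGE_UNDEFINED } RangeStatus;

RangeStatus classifyRange(double x, double minValue, double maxValue)
{
    if (isnan(x) || isnan(minValue) || isnan(maxValue))
        return RANGE_UNDEFINED;
    return (x < minValue || x > maxValue) ? RANGE_OUTSIDE : RANGE_INSIDE;
}

/* Two-way variant that errs on the safe side: every comparison with
 * NaN is false, so the negated "inside" test treats NaN as outside. */
bool isOutsideRangeStrict(double x, double minValue, double maxValue)
{
    return !(x >= minValue && x <= maxValue);
}
```

Which of the two fits better depends on whether callers can do something sensible with the third state.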

If you just want a magic value for users to test against, I'd recommend positive or negative infinity instead of NaN, as it doesn't come with the same traps. Use NaN when you want it for its property that any operations on a NaN result in a NaN: it's handy when you don't want to rely on callers checking the value, for example.

[Edit: I initially managed to type "any comparisons involving them will be true" above, which is not what I meant, and is wrong, they're all false, apart from NaN != NaN, which is true]

jskinner
Which language uses these comparison rules? Maybe D does. But at least C and C++ don't work with NaN this way. All the ordering comparisons will be false. x == NaN is false for any x, including NaN.
Igor Krivokon
No, your function only reports that it is not outside the range. It is neither inside nor outside, which may indeed confuse those using floating point numbers naively.
starblue
@Igor: We're saying the same thing here. isOutsideRange would return false if x is NaN, which implies it's inside the range, which it isn't.
jskinner
@jskinner No, it doesn't imply it's inside the range. Essentially NaN is nowhere.
starblue
@starblue: I realise this. 'isOutsideRange' is an example of an ill-defined function in the face of NaN inputs: NaNs are neither inside the range nor outside the range, so returning a bool is inappropriate. It's simply an example of how what looks fine on the surface is in fact incorrect when NaNs are introduced.
jskinner
IEEE needs to add NaB. Comparisons will yield true, false, or NaB. Any definitions of boolean that neglect NaB will be posted on thedailywtf.
Windows programmer
Also "sort" is probably a naive user of floating-point inputs in that if you sort an array of floats, any NaN values may cause even the rest of the values to be incorrectly sorted. E.g. in Python sorted([1,2,3,float('nan'),1,2,3]) returns [1,2,3,nan,1,2,3], and in Clojure (sort [1 2 3 (Float. "NaN") 1 2 3]) returns (1 2 3 NaN 1 2 3).
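The same hazard should apply to C's qsort if the comparator is built only from < and >, because it then reports NaN as "equal" to every other value. A rough sketch of a NaN-aware comparator that sorts NaNs to the end (compare_nan_last and sort_nan_last are hypothetical names):

```c
#include <math.h>      /* isnan (C99) */
#include <stdlib.h>    /* qsort */

/* Orders ordinary numbers ascending and places every NaN after them;
 * two NaNs compare as equal. */
int compare_nan_last(const void *pa, const void *pb)
{
    float a = *(const float *)pa;
    float b = *(const float *)pb;
    int a_nan = isnan(a) != 0;
    int b_nan = isnan(b) != 0;

    if (a_nan || b_nan)
        return a_nan - b_nan;
    return (a > b) - (a < b);
}

void sort_nan_last(float *values, size_t count)
{
    qsort(values, count, sizeof values[0], compare_nan_last);
}
```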
Jouni K. Seppänen
A: 

My feeling is that it's a bit hacky, but at least every operation involving this NaN value gives NaN as its result. When you see a NaN in a bug report, at least you know what kind of mistake you are hunting.

Szundi
+2  A: 

Be careful with NaNs... they can spread like wildfire if you are not careful.

They are a perfectly valid value for floats, but any arithmetic involving them also yields NaN, so they propagate through your code. This is quite good as a debugging tool if you catch it; however, it can also be a real nuisance if you are bringing something to release and there is a fringe case somewhere.

D uses this as rationale for giving floats NaN as default. (Which I'm not sure I agree with.)

Chris Burt-Brown
Err... Isn't it just the point of NaNs that they'll propagate? It's much better to have NaN as result, which indicates that there's something wrong, than to have an innocent-looking but totally incorrect number (which would result from accidental use of zero-initialized numbers).
Joonas Pulakka
Yes and no, because you spot NaN only by looking at the output or by explicitly checking for NaN. The consequence of that is that errors may get detected much later than they arise. On the other hand, if you use NULLs (if possible), you get a NPE/segmentation fault pretty fast. Brutal, but efficient.
quant_dev
+2  A: 

I think it is a bad idea in general. One thing to keep in mind is that most CPUs handle NaN much more slowly than "usual" floats. And it is hard to guarantee you will never have NaN in usual settings. My experience in numerical computing is that it often brings more trouble than it is worth.

The right solution is to avoid encoding "absence of value" in the float, but to signal it in another way. That's not always practical, though, depending on your codebase.
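As a rough illustration of signaling it "in another way" in C, one option is to carry an explicit validity flag next to the value. All names here (OptionalReal32 and the real32_* functions) are invented for the sketch.

```c
#include <stdbool.h>

/* A value plus a flag saying whether it has been received yet. */
typedef struct {
    float value;
    bool  valid;
} OptionalReal32;

/* State before anything has arrived from the other device. */
OptionalReal32 real32_unset(void)
{
    OptionalReal32 r = { 0.0f, false };
    return r;
}

/* State after a value has been received. */
OptionalReal32 real32_set(float v)
{
    OptionalReal32 r = { v, true };
    return r;
}

/* Read the value, or a caller-supplied fallback if nothing arrived. */
float real32_get_or(OptionalReal32 r, float fallback)
{
    return r.valid ? r.value : fallback;
}
```

The cost is extra memory per value and a check at every read, which is the "not always practical" part.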

David Cournapeau
A: 

This sounds like a good use for nans to me. Wish I had thought of it...

Sure, they are supposed to propagate like a virus, that is the point.

I think I would use NaN instead of one of the infinities. It might be nice to use a signaling NaN and have it cause an event on the first use, but by then it's too late; it should go quiet on the first use.

dwelch
A: 

Using NaN as a default value is reasonable.

Note that some expressions, such as (0.0 / 0.0), return NaN.

Joe Erickson
+3  A: 

Just noticed this question.

This is one of the uses of NaNs that the IEEE 754 committee has in mind (I was a committee member). The propagation rules for NaNs in arithmetic make this very attractive, because if you have a result from a long sequence of calculations that involve some uninitialized data, you will not mistake the result for a valid result. It can also make tracing back through your calculations to find where you are using the uninitialized data much more straightforward.

That said, there are a few pitfalls that are outside of the 754 committee's control: as others have noted, not all hardware supports NaN values at speed, which can result in performance hazards. Fortunately, one does not often do a lot of operations on uninitialized data in a performance-critical setting.

Stephen Canon