Python 3.1

I am doing some calculations on data that has missing values. Any calculation that involves a missing value should result in a missing value.

I am thinking of using float('nan') to represent missing values. Is that safe? At the end I would just check:

def is_missing(x):
    return x != x  # I hope it's safe to use this to check for NaN

It seems perfect, but I couldn't find a clear confirmation in the documentation.
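A minimal sketch of the behaviour in question, assuming the platform's floats follow IEEE 754 (which is the usual case for CPython on common hardware, but is an assumption here). `math.isnan` is the standard-library counterpart of the `x != x` test. The last two lines show cases where NaN does not propagate as "missing", which matters if such operations appear in the calculations.

```python
import math

nan = float('nan')

def is_missing(x):
    # NaN is the only float value that compares unequal to itself
    return x != x

# Ordinary arithmetic with NaN yields NaN, so "missing" propagates
results = [nan + 1.0, nan * 0.0, 2.0 - nan, nan / 3.0]
all_missing = all(is_missing(r) for r in results)

# math.isnan agrees with the self-inequality test on floats
agrees = all(math.isnan(r) == is_missing(r) for r in results + [1.0, 0.0])

# Caveat: a few operations do not propagate NaN
pow_result = nan ** 0    # 1.0, not NaN
cmp_result = nan > 1.0   # False, a plain bool rather than "missing"
```

So the `x != x` check itself works, but powers and comparisons need separate handling if they must also yield a missing value.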

I could use None, of course, but that would require wrapping every single calculation in try / except TypeError to detect it. I could also use Inf, but I am even less sure about how it works.
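To make the two alternatives concrete, here is a small sketch (again assuming IEEE 754 floats). It shows that None raises instead of propagating, and that Inf survives some operations but silently disappears or turns into NaN in others, so it is a poor marker for a missing value.

```python
inf = float('inf')

# None does not propagate: arithmetic on it raises TypeError
try:
    None + 1.0
    none_raises = False
except TypeError:
    none_raises = True

# Inf propagates through some operations...
still_inf = inf + 1.0        # inf

# ...but silently disappears in others
vanished = 1.0 / inf         # 0.0

# ...and turns into NaN in others
inf_minus_inf = inf - inf    # nan
```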

EDIT:

@MSN I understand that using NaN is slow. But if my choice is between:

# missing value represented by NaN
def f(a, b, c):
  return a + b * c

or

# missing value represented by None
def f(a, b, c):
  if a is None or b is None or c is None:
    return None
  else:
    return a + b * c

I would imagine the NaN option is still faster, isn't it?
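Rather than guessing, the two versions can be timed with the standard `timeit` module. The sketch below is only a rough measurement harness: the names `f_nan` and `f_none` are hypothetical renames of the two functions above, the inputs are arbitrary, and the numbers will differ between machines and Python versions, so no result is claimed here.

```python
import timeit

nan = float('nan')

def f_nan(a, b, c):
    # missing value represented by NaN; propagation is left to the FPU
    return a + b * c

def f_none(a, b, c):
    # missing value represented by None; checked explicitly
    if a is None or b is None or c is None:
        return None
    return a + b * c

n = 100000  # arbitrary repeat count

# Time each function with one missing argument and with none missing
t_nan_missing = timeit.timeit(lambda: f_nan(1.0, nan, 2.0), number=n)
t_nan_present = timeit.timeit(lambda: f_nan(1.0, 3.0, 2.0), number=n)
t_none_missing = timeit.timeit(lambda: f_none(1.0, None, 2.0), number=n)
t_none_present = timeit.timeit(lambda: f_none(1.0, 3.0, 2.0), number=n)

print("NaN,  missing input:", t_nan_missing)
print("NaN,  all present:  ", t_nan_present)
print("None, missing input:", t_none_missing)
print("None, all present:  ", t_none_present)
```

Timing the missing and non-missing cases separately matters, because the slowdown mentioned for NaN would only show up when a NaN actually reaches the arithmetic.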

+1  A: 

It's safe-ish, but if the FPU ever has to touch x, it can be extremely slow, because some hardware treats NaN as a special case. See: http://stackoverflow.com/questions/1036686/is-it-a-good-idea-to-use-ieee754-floating-point-nan-for-values-which-are-not-set

MSN
I guess it won't be any faster if I instead check every single expression for missing values using `if`?
max
@max, can you give me a code example of what you mean? I don't quite understand the question.
MSN