One of the things I deal with most in data cleaning is missing values. R handles this well with its built-in "NA" missing-data label. In Python, it appears I'll have to deal with masked arrays, which seem to be a major pain to set up and don't seem to be well documented. Any suggestions on making this process easier in Python? This is becoming a deal-breaker in moving to Python for data analysis. Thanks

Update It's obviously been a while since I last looked at the methods in the numpy.ma module. It appears that at least the basic analysis functions are available for masked arrays, and the examples provided helped me understand how to create masked arrays (thanks to the authors). I would like to see whether some of the newer statistical tools for Python (being developed in this year's GSoC) incorporate this aspect, and at least handle complete case analysis.
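For the complete case part, it looks like numpy.ma already covers the basics: np.ma.masked_values masks a sentinel value, and np.ma.compress_rows drops every row of a 2-D array that contains a masked entry. A minimal sketch (the values and the -999.0 sentinel are made up for illustration):

import numpy as np

# Made-up 2-D dataset; -999.0 stands in for a missing measurement
raw = np.array([[1.0,    2.0],
                [3.0, -999.0],
                [5.0,    6.0]])

data = np.ma.masked_values(raw, -999.0)  # mask the sentinel entries

# Complete case analysis: keep only the rows with no masked entries
complete = np.ma.compress_rows(data)
print(complete)
# [[ 1.  2.]
#  [ 5.  6.]]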

+1  A: 

I also question whether masked arrays are really as much of a problem as you describe. Here are a couple of examples:

import numpy as np
data = np.ma.masked_array(np.arange(10))  # 0 through 9, nothing masked yet
data[5] = np.ma.masked  # Mask a specific value

data[data>6] = np.ma.masked # Mask any value greater than 6

# Same thing done at initialization time
init_data = np.arange(10)
data = np.ma.masked_array(init_data, mask=(init_data > 6))
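One thing worth adding: reductions on masked arrays skip the masked entries automatically, which behaves much like R's na.rm=TRUE. A quick sketch, reusing the last data array above (where 7, 8, and 9 are masked):

print(data.mean())   # mean of 0..6 -> 3.0
print(data.sum())    # 0 + 1 + ... + 6 -> 21
print(data.count())  # number of unmasked entries -> 7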
DopplerShift
+1  A: 

Masked arrays are the answer, as DopplerShift describes. For quick-and-dirty use, you can use fancy indexing with boolean arrays:

>>> import numpy as np
>>> data = np.arange(10)
>>> valid_idx = data % 2 == 0  # pretend that the odd elements are missing

>>> # Get non-missing data
>>> data[valid_idx]
array([0, 2, 4, 6, 8])

You can now use valid_idx as a quick mask on other data as well:

>>> comparison = np.arange(10) + 10
>>> comparison[valid_idx]
array([10, 12, 14, 16, 18])
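If the missing values show up as NaN (the closest float analogue of R's NA), the same trick works with a mask built from np.isnan. A minimal sketch:

>>> raw = np.array([1.0, np.nan, 3.0, np.nan, 5.0])
>>> valid_idx = ~np.isnan(raw)  # NaN plays the role of R's NA here
>>> raw[valid_idx]
array([ 1.,  3.,  5.])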
Barry Wark