Hi all. I am new to Python and NumPy, and I have a problem: I cannot modify a numpy.recarray through masked views. I read a recarray from a file, create two masked views, and then try to modify the values in a for loop. Here is some example code.

import numpy as np
import matplotlib.mlab as mlab


dat = mlab.csv2rec(args[0], delimiter=' ')
m_Obsr = dat.is_observed == 1
m_ZeroScale = dat[m_Obsr].scale_mean < 0.01


for d in dat[m_Obsr][m_ZeroScale]:
    d.scale_mean = 1.0

But when I write out the result

newFile = args[0] + ".no-zero-scale"

mlab.rec2csv(dat[m_Obsr][m_ZeroScale], newFile, delimiter=' ')

all the scale_mean values in the file are still zero.

I must be doing something wrong. Is there a proper way of modifying values of the view? Is it because I am applying two views one by one?

Thank you.

+3  A: 

I think you have a misconception about the term "masked views" and should (re-)read The Book (now freely downloadable) to clarify your understanding.

I quote from section 3.4.2:

Advanced selection is triggered when the selection object, obj, is a non-tuple sequence object, an ndarray (of data type integer or bool), or a tuple with at least one sequence object or ndarray (of data type integer or bool). There are two types of advanced indexing: integer and Boolean. Advanced selection always returns a copy of the data (contrast with basic slicing that returns a view).

What you're doing here is advanced selection (of the Boolean kind) so you're getting a copy and never binding it anywhere -- you make your changes on the copy and then just let it go away, then write a new fresh copy from the original.
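To make the copy-versus-view distinction concrete, here is a minimal sketch. It builds a small stand-in record array by hand (the values are made up for illustration; only the field names is_observed and scale_mean come from the question) and checks with np.shares_memory whether each kind of selection still points at the original data.

```python
import numpy as np

# Hypothetical stand-in for the record array read from the file;
# the values are invented, only the field names come from the question.
dat = np.rec.fromarrays(
    [[1, 0, 1, 1], [0.0, 0.0, 0.5, 0.001]],
    names="is_observed,scale_mean",
)

m_obsr = dat.is_observed == 1

# Boolean (advanced) selection: the result owns fresh memory.
selected = dat[m_obsr]
print(np.shares_memory(selected, dat))  # False

# Basic slicing: the result is a view onto dat's memory.
sliced = dat[1:3]
print(np.shares_memory(sliced, dat))    # True

# Writing to the copy leaves the original untouched.
selected.scale_mean[:] = 1.0
print(dat.scale_mean.tolist())          # [0.0, 0.0, 0.5, 0.001]
```

Because dat[m_Obsr] is already a copy, anything selected from it or written to it never reaches dat.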

Once you understand the issue the solution should be simple: make your copy once, make your changes on that copy, and write that same copy. I.e.:

dat = mlab.csv2rec(args[0], delimiter=' ')
m_Obsr = dat.is_observed == 1
m_ZeroScale = dat[m_Obsr].scale_mean < 0.01
the_copy = dat[m_Obsr][m_ZeroScale]

for d in the_copy:
    d.scale_mean = 1.0

newFile = args[0] + ".no-zero-scale"
mlab.rec2csv(the_copy, newFile, delimiter=' ')
Alex Martelli
Yes, you were right: I did not realise I was doing an advanced selection, or that the latter returns a temporary copy. My understanding of "masked views" is indeed very hazy. Thank you for the quote. I was reading the same section 3 earlier, but did not get to this paragraph. Your fix will work. My remaining question, though, is whether it is possible to modify the original data in dat without taking a copy of part of the array. I need the order of the original to be preserved while modifying only a subset. Will a simple iterative approach work? Or is there anything nicer?
Denis C
To modify dat itself, I know nothing nicer than the "simple iterative approach" (generally on the .flat view, as that IS indeed a view!-). BTW, SO etiquette is that you upvote good answers, and accept one that does answer your question or fix your problem -- since you comment "your fix will work" by that etiquette you should upvote and accept (and probably open another question specifically about selective modification in the main array itself -- somebody might give you a better approach than iteration if they saw that question, but may not notice the question if you just ask it in a comment!)
Alex Martelli
I did accept the solution, because it is indeed the solution to the described problem. I can't up-vote, as I don't have enough rating yet :). Thank you for your interest in the problem.
Denis C
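
On the follow-up about changing dat itself while keeping row order: one possible approach, offered as a sketch and not as something proposed in this thread, is to assign through a single Boolean mask built over the whole array. Indexing with a Boolean mask on the right-hand side makes a copy, but assigning through one on the left-hand side writes into the existing array. The sample data below is made up; only the field names and the 0.01 threshold come from the question.

```python
import numpy as np

# Hypothetical stand-in for the record array read from the file.
dat = np.rec.fromarrays(
    [[1, 0, 1, 1], [0.0, 0.0, 0.5, 0.001]],
    names="is_observed,scale_mean",
)

# Build one mask over the whole array instead of chaining two selections.
mask = (dat.is_observed == 1) & (dat.scale_mean < 0.01)

# dat.scale_mean is a view of that field, so assigning through the mask
# writes into dat's own memory; the row order is unchanged.
dat.scale_mean[mask] = 1.0
print(dat.scale_mean.tolist())  # [1.0, 0.0, 0.5, 1.0]
```

Only observed rows with a near-zero scale_mean are changed; the unobserved row keeps its zero, and dat can then be written out whole.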