It appears you want to delete a row of your array in place. That is not possible with np.delete, because such an operation goes against the way that Python and NumPy manage memory.
I found an interesting post on the NumPy mailing list (Travis Oliphant, [Numpy-discussion] Deleting a row from a matrix) where the np.delete function is first discussed:
So, "in-place" deletion of array objects would not be particularly useful, because it would only work for arrays with no additional reference counts (i.e. simple b=a assignment would increase the reference count and make it impossible to say del a[obj]).
....
But, the problem with both of those approaches is that once you start removing arbitrary rows (or n-1 dimensional sub-spaces) from an array you very likely will no longer have a chunk of memory that can be described using the n-dimensional array memory model.
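To make the memory-model point concrete, here is a small sketch (the array and the variable names are made up for illustration). A regularly strided selection of rows can be described as a view on the original buffer, whereas removing an arbitrary row cannot, so NumPy has to copy:

```python
import numpy as np

a = np.arange(12).reshape(4, 3)

# Every other row: expressible with a start, a shape and strides, so it is a view.
every_other = a[::2]

# Dropping row 1 leaves rows 0, 2 and 3, which have no constant stride between
# them, so the result has to live in freshly allocated memory.
without_row_1 = np.delete(a, 1, axis=0)

view_shares = np.shares_memory(a, every_other)
copy_shares = np.shares_memory(a, without_row_1)
print(view_shares, copy_shares)
```

The first check should report shared memory and the second should not, which is the distinction the quoted post is describing.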
If you look at the documentation for np.delete (http://docs.scipy.org/doc/numpy/reference/generated/numpy.delete.html), you can see that the function returns a new array with the desired parts (not necessarily rows) deleted.
    Definition: np.delete(arr, obj, axis=None)
    Docstring:
    Return a new array with sub-arrays along an axis deleted.

    Parameters
    ----------
    arr : array_like
        Input array.
    obj : slice, int or array of ints
        Indicate which sub-arrays to remove.
    axis : int, optional
        The axis along which to delete the subarray defined by `obj`.
        If `axis` is None, `obj` is applied to the flattened array.

    Returns
    -------
    out : ndarray
        A copy of `arr` with the elements specified by `obj` removed. Note
        that `delete` does not occur in-place. If `axis` is None, `out` is
        a flattened array.
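A short sketch of what the docstring describes, using a made-up 3x3 array: the input is left untouched, and omitting axis gives a flattened result.

```python
import numpy as np

arr = np.arange(9).reshape(3, 3)

no_middle_row = np.delete(arr, 1, axis=0)  # drop row 1
no_axis = np.delete(arr, 1)                # axis=None: drop element 1 of the flattened array

print(arr.shape)            # the original array is unchanged
print(no_middle_row.shape)
print(no_axis.shape)
```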
So, in your case I think you'll want to do something like:
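    import numpy as np

    A = np.array([['id1', '1', '2', 'NaN'],
                  ['id2', '2', '0', 'NaN']])
    li = ['id1', 'id3', 'id6']
    # Collect the indices first: deleting inside the loop would shift the
    # remaining rows, so later indices would point at the wrong rows.
    rows_to_drop = [i for i, row in enumerate(A) if row[0] not in li]
    A = np.delete(A, rows_to_drop, axis=0)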
A is now cut down as you wanted, but remember that it occupies a new piece of memory. Each time np.delete is called, new memory is allocated, and the name A is rebound to it.
I'm sure there is a better, vectorized way (maybe using masked arrays?) to find out which rows to delete, but I couldn't get it together. If anyone has one, please comment!
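One candidate for such a vectorized approach is a plain boolean mask rather than masked arrays. This is only a sketch, assuming np.isin is available (NumPy 1.13 or later) and that the IDs sit in the first column; the name keep is made up for illustration:

```python
import numpy as np

A = np.array([['id1', '1', '2', 'NaN'],
              ['id2', '2', '0', 'NaN']])
li = ['id1', 'id3', 'id6']

# True for each row whose first column appears in li.
keep = np.isin(A[:, 0], li)

# Boolean indexing returns a new array holding only the kept rows.
A = A[keep]
print(A)
```

Like np.delete, this still allocates a new array; it just avoids the Python-level loop over the rows.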