Is there an easy way to add a record/row to a numpy recarray without creating a new recarray? Say I have a recarray that takes 1 GB of memory; I want to be able to add a row to it without Python temporarily taking up 2 GB.
A:
You can call yourrecarray.resize with a shape that has one more row, then assign to that new row. Of course, numpy might still have to allocate completely new memory if it just doesn't have room to grow the array in place, but at least you stand a chance!-)
Since an example was requested, here it comes, adapted from the canonical example list:
>>> import numpy
>>> mydescriptor = {'names': ('gender','age','weight'), 'formats': ('S1', 'f4', 'f4')}
>>> a = numpy.array([('M',64.0,75.0),('F',25.0,60.0)], dtype=mydescriptor)
>>> print a
[('M', 64.0, 75.0) ('F', 25.0, 60.0)]
>>> a.shape
(2,)
>>> a.resize(3)
>>> a.shape
(3,)
>>> print a
[('M', 64.0, 75.0) ('F', 25.0, 60.0) ('', 0.0, 0.0)]
>>> a[2] = ('X', 17.0, 61.5)
>>> print a
[('M', 64.0, 75.0) ('F', 25.0, 60.0) ('X', 17.0, 61.5)]
Alex Martelli
2009-10-21 01:32:54
Could you show some demonstration code? My attempt at calling arr.resize() ended with ValueError: cannot resize this array: it does not own its data
unutbu
2009-10-21 02:10:25
@unutbu, sure, I edited the answer to supply a simple example. You may be running into the issues discussed in this thread: http://aspn.activestate.com/ASPN/Mail/Message/numpy-discussion/3042521 -- if so, you can fix them, as Travis Oliphant mentions there, by adding the refcheck=0 argument to the resize call (unless you HAVE shared the data, in which case there can be no resizing in place any more). Note that what Travis mentions as a feature of the SVN head of numpy has been part of regularly released numpy for a long time by now -- that thread is 3+ years old;-).
Alex Martelli
2009-10-21 02:22:02
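One way the "does not own its data" error from the earlier comment can come up is when the array is a view onto another array's memory. The sketch below is only an illustration under that assumption; the names base, view, and owned are hypothetical, and the exact error text may differ between numpy versions:

```python
import numpy as np

base = np.zeros(4, dtype=[('age', 'f4'), ('weight', 'f4')])
view = base[:2]  # a slice is a view: it borrows base's memory

# A view cannot be resized to a different size, even with refcheck=False,
# because the memory belongs to base.
try:
    view.resize(3, refcheck=False)
    view_error = None
except ValueError as exc:
    view_error = str(exc)

# A copy owns its own buffer, so it can be resized. Note that making the
# copy costs extra memory, which is what the question wanted to avoid.
owned = view.copy()
owned.resize(3, refcheck=False)

print(view.flags.owndata, owned.flags.owndata)
print(view_error)
```

Checking arr.flags.owndata before calling resize shows which case applies: refcheck=0 helps only when the array owns its data and the extra references are harmless.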
Thank you! a.resize(3,refcheck=0) did the trick for me.
unutbu
2009-10-21 02:57:05
I wish the numpy developers had thought of a better way to add a row to a dataset. It is a very common operation, and I don't understand why it should be so inefficient.
dalloliogm
2010-08-03 10:30:30