ansaurus

Question

pythonic way to aggregate arrays (numpy or not)

Answer 1

A:

The dictionary `get` method described in this handout should help make your function a little prettier, more Pythonic, and possibly more efficient:

http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html#dictionary-get-method

Maybe you can edit the function with this in mind? Also see the next couple of sections of the handout. I'll come back later to check on your progress.
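A minimal sketch of how `dict.get` could replace an explicit membership check. The original function is not shown in this thread, so the `aggregate` signature is borrowed from the other answers, and the `sample` data is made up purely for illustration:

```python
def aggregate(data, key, value, func):
    """Group data[value] by data[key], then apply func to each group."""
    data_per_key = {}
    for k, v in zip(data[key], data[value]):
        # dict.get returns the list already stored for k, or a new empty list
        group = data_per_key.get(k, [])
        group.append(v)
        data_per_key[k] = group
    return [(k, func(vs)) for k, vs in data_per_key.items()]


# Hypothetical sample data: a dict of equal-length columns
sample = {
    'job': ['Digger', 'Planter', 'Waterer', 'Planter', 'Digger'],
    'income': [1, 2, 3, 3, 7],
}
totals = aggregate(sample, 'job', 'income', sum)
print(dict(totals))
```

The three-line `get`/`append`/store pattern can be shortened with `dict.setdefault`, which is one of the follow-up sections in the same handout.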

skyl 2009-12-01 22:51:01

Answer 2

+2 A:

Your `if k not in data_per_key.keys()` could be rewritten as `if k not in data_per_key`, but you can do even better with `defaultdict`. Here's a version that uses `defaultdict` to get rid of the existence check:

import collections

def aggregate(data, key, value, func):
    data_per_key = collections.defaultdict(list)
    for k,v in zip(data[key], data[value]):
        data_per_key[k].append(v)

    return [(k,func(data_per_key[k])) for k in data_per_key.keys()]

Hank Gay 2009-12-01 22:51:37

I'd change the last line to `return [(k,func(v)) for k,v in data_per_key.items()]`

gnibbler 2009-12-01 23:09:17

That's a good call, but I was trying to highlight the `defaultdict` stuff by making that the only change. Your return is definitely better, though.

Hank Gay 2009-12-02 11:54:36

Thanks for the `defaultdict` trick, and also for the final iteration!

Louis 2009-12-02 20:12:41
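For reference, the `defaultdict` version from this answer combined with the `items()` return suggested in the comments might look roughly like the sketch below. The `sample` data and the `mean` helper are hypothetical, added only to make the example runnable:

```python
import collections


def aggregate(data, key, value, func):
    # defaultdict(list) creates an empty list the first time a key is seen
    data_per_key = collections.defaultdict(list)
    for k, v in zip(data[key], data[value]):
        data_per_key[k].append(v)
    # Iterate over (key, values) pairs instead of indexing by key again
    return [(k, func(v)) for k, v in data_per_key.items()]


def mean(values):
    # Hypothetical helper: plain-Python average of a non-empty list
    return sum(values) / len(values)


# Hypothetical sample data: a dict of equal-length columns
sample = {
    'job': ['Digger', 'Planter', 'Waterer', 'Planter', 'Digger'],
    'income': [1, 2, 3, 3, 7],
}
averages = aggregate(sample, 'job', 'income', mean)
```

Any callable that accepts a list works as `func`, so `sum`, `max`, or `numpy.mean` could be passed in the same way.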

Answer 3

A:

Perhaps the function you are seeking is matplotlib.mlab.rec_groupby:

import matplotlib.mlab
import numpy as np

# Structured array with one record per person
data = np.array(
    [('Aaron','Digger',1),
     ('Bill','Planter',2),
     ('Carl','Waterer',3),
     ('Darlene','Planter',3),
     ('Earl','Digger',7)],
    dtype=[('name', np.str_,8), ('job', np.str_,8), ('income', np.uint32)])

# Group by 'job' and store the mean of 'income' in a new field 'avg_income'
result = matplotlib.mlab.rec_groupby(data, ('job',), (('income',np.mean,'avg_income'),))

yields

('Digger', 4.0)
('Planter', 2.5)
('Waterer', 3.0)

matplotlib.mlab.rec_groupby returns a recarray:

print(result.dtype)
# [('job', '|S7'), ('avg_income', '<f8')]
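In case `matplotlib.mlab.rec_groupby` is unavailable in the installed matplotlib, a similar grouping can be sketched with NumPy alone. This is not the `rec_groupby` implementation, just one possible substitute built on `np.unique` and `np.bincount`; the `'U8'` string dtype is an assumption chosen for Python 3:

```python
import numpy as np

data = np.array(
    [('Aaron', 'Digger', 1),
     ('Bill', 'Planter', 2),
     ('Carl', 'Waterer', 3),
     ('Darlene', 'Planter', 3),
     ('Earl', 'Digger', 7)],
    dtype=[('name', 'U8'), ('job', 'U8'), ('income', np.uint32)])

# Sorted distinct jobs, plus for every row the index of its job in that list
jobs, group_index = np.unique(data['job'], return_inverse=True)

# Per-group income totals and row counts; their ratio is the per-group mean
sums = np.bincount(group_index, weights=data['income'])
counts = np.bincount(group_index)
avg_income = sums / counts

# Pack the two columns into a record array with named fields
result = np.rec.fromarrays([jobs, avg_income], names='job,avg_income')
```

The `bincount` approach only covers sums, counts, and means; an arbitrary aggregation function would need a loop over the groups instead.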

unutbu 2009-12-02 00:09:47

That's exactly what I was looking for: the job done in one line! Moreover, it directly returns an array! Perfect!

Louis 2009-12-02 20:11:50
