ansaurus

Question

Answer 1

+4 A:

Some example data:

import numpy as np

lookup = np.array([[  1.     ,   3.14   ,   4.14   ],
                   [  2.     ,   2.71818,   3.7    ],
                   [  3.     ,  42.     ,  43.     ]])

a = np.array([[ 1, 11],
              [ 1, 12],
              [ 2, 21],
              [ 3, 31]])

Build a dictionary from key to row number in the lookup table:

mapping = dict(zip(lookup[:,0], range(len(lookup))))

Then you can use the dictionary to match up rows by key. For instance, if you just want to join the tables:

>>> np.hstack((a, np.array([lookup[mapping[key], 1:]
...                         for key in a[:, 0]])))
array([[  1.     ,  11.     ,   3.14   ,   4.14   ],
       [  1.     ,  12.     ,   3.14   ,   4.14   ],
       [  2.     ,  21.     ,   2.71818,   3.7    ],
       [  3.     ,  31.     ,  42.     ,  43.     ]])
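The same join can be wrapped in a small function. This is only a sketch: `join_on_first_column` is a made-up name, not a NumPy function, and it assumes the key sits in column 0 of both arrays and that every key in `a` exists in the lookup table.

```python
import numpy as np

def join_on_first_column(a, lookup):
    """Append the non-key columns of `lookup` to each row of `a`.

    Hypothetical helper; assumes the key is column 0 of both arrays.
    """
    # Map each key to its row number in the lookup table.
    mapping = {key: row for row, key in enumerate(lookup[:, 0].tolist())}
    # Translate every key in `a` to a row number; a missing key raises KeyError.
    rows = np.array([mapping[key] for key in a[:, 0].tolist()], dtype=np.intp)
    # Integer-array indexing pulls all matching lookup rows in one step.
    return np.hstack((a, lookup[rows, 1:]))

lookup = np.array([[1.0, 3.14, 4.14],
                   [2.0, 2.71818, 3.7],
                   [3.0, 42.0, 43.0]])
a = np.array([[1, 11],
              [1, 12],
              [2, 21],
              [3, 31]])

joined = join_on_first_column(a, lookup)
print(joined)
```

Collecting the row numbers first and indexing once avoids building a Python list of row slices, which should matter more as `a` grows.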

Vebjorn Ljosa 2010-08-19 15:22:26

+1, building an auxiliary dict is clearly the way to go.

Alex Martelli 2010-08-19 15:25:52

+1 for getting +1 from Alex Martelli ;) And for having a useful answer, of course.

Wayne Werner 2010-08-19 15:49:37

For whatever it's worth, there is a built-in numpy function to do this: `numpy.lib.recfunctions.join_by`. http://projects.scipy.org/numpy/browser/trunk/numpy/lib/recfunctions.py#L826 It's rather clunky if you're not already using structured arrays, though.

Joe Kington 2010-08-19 15:53:26
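A rough sketch of what the `join_by` route might look like. The field names (`key`, `value`, `x`, `y`) are invented for illustration. Note that the `join_by` docstring warns that neither input should contain duplicates along the key, so this uses a variant of `a` with unique keys rather than the original example, where key 1 appears twice.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Structured versions of the example tables. Field names are made up for
# this sketch, and the keys in each table are unique.
lookup_rec = np.array([(1, 3.14, 4.14), (2, 2.71818, 3.7), (3, 42.0, 43.0)],
                      dtype=[("key", "i8"), ("x", "f8"), ("y", "f8")])
a_rec = np.array([(1, 11), (2, 21), (3, 31)],
                 dtype=[("key", "i8"), ("value", "i8")])

# Inner join on the shared "key" field; usemask=False asks for a plain
# structured ndarray instead of a masked array.
joined = rfn.join_by("key", a_rec, lookup_rec, jointype="inner", usemask=False)
print(joined.dtype.names)
print(joined)
```

For data with repeated keys, such as the original `a`, the dictionary approach in the answer above is the safer choice.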

Answer 2

+1 A:

In the special case where the row index can be calculated directly from the key, the dictionary can be avoided. This is an advantage when you are free to choose the keys of the lookup table yourself.

For Vebjorn Ljosa's example:

lookup:

>>> lookup[a[:,0]-1, :]
array([[  1.     ,   3.14   ,   4.14   ],
       [  1.     ,   3.14   ,   4.14   ],
       [  2.     ,   2.71818,   3.7    ],
       [  3.     ,  42.     ,  43.     ]])

merge (the key column appears twice, because the whole lookup row is appended):

>>> np.hstack([a, lookup[a[:,0]-1, :]])
array([[  1.     ,  11.     ,   1.     ,   3.14   ,   4.14   ],
       [  1.     ,  12.     ,   1.     ,   3.14   ,   4.14   ],
       [  2.     ,  21.     ,   2.     ,   2.71818,   3.7    ],
       [  3.     ,  31.     ,   3.     ,  42.     ,  43.     ]])
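A slightly more defensive variant, as a sketch: it checks the assumption that the key in row `i` of the lookup table is `i + 1` before relying on it, and slices with `1:` so the key column is not repeated. The names `rows` and `merged` are just illustrative.

```python
import numpy as np

lookup = np.array([[1.0, 3.14, 4.14],
                   [2.0, 2.71818, 3.7],
                   [3.0, 42.0, 43.0]])
a = np.array([[1, 11],
              [1, 12],
              [2, 21],
              [3, 31]])

# The shortcut is only valid when row i of the lookup table holds key i + 1.
assert np.array_equal(lookup[:, 0], np.arange(1, len(lookup) + 1))

# Compute the row numbers directly from the keys; no dictionary needed.
rows = a[:, 0] - 1
# Append only the non-key columns so the key is not duplicated.
merged = np.hstack([a, lookup[rows, 1:]])
print(merged)
```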

2010-08-22 15:10:05


Using Numpy arrays as lookup tables
