I have a dictionary whose keys are strings and values are numpy arrays, e.g.:

data = {'a': array([1,2,3]), 'b': array([4,5,6]), 'c': array([7,8,9])}

I want to compute a statistic between all pairs of values in 'data' and build an n by n matrix that stores the results. Assume that I know the order of the keys, i.e. I have a list of "labels":

labels = ['a', 'b', 'c']

What's the most efficient way to compute this matrix?

I can compute the statistic for all pairs like this:

import itertools

result = []
for elt1, elt2 in itertools.product(labels, labels):
    result.append(compute_statistic(data[elt1], data[elt2]))

But I want result to be an n by n matrix, corresponding to "labels" by "labels". How can I record the results as this matrix? Thanks.

+2  A: 

You could use a nested loop, or a list comprehension like:

result = [[compute_statistic(data[row], data[col]) for col in labels]
          for row in labels]
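
For a numpy array rather than a list of lists, the nested comprehension can be passed to np.array. A minimal sketch, assuming a dot product as a stand-in for the real statistic (this compute_statistic is hypothetical, not the asker's function):

```python
import numpy as np

def compute_statistic(x, y):
    # Hypothetical placeholder statistic: dot product of the two arrays.
    return float(np.dot(x, y))

data = {'a': np.array([1, 2, 3]),
        'b': np.array([4, 5, 6]),
        'c': np.array([7, 8, 9])}
labels = ['a', 'b', 'c']

# The outer comprehension walks the rows, the inner one walks the columns.
nested = [[compute_statistic(data[row], data[col]) for col in labels]
          for row in labels]

# Convert the list of lists into an n-by-n numpy array.
matrix = np.array(nested)
```

Here matrix[i, j] holds the statistic for labels[i] and labels[j].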
Alex Martelli
+2  A: 

Convert the result list into a matrix and then adjust the shape.

myMatrix = array(result) # or use matrix(result)
myMatrix.shape = (len(labels), len(labels))
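
The reshape lines up with the labels because itertools.product varies its last argument fastest, so the flat list is already in row-major order. A small sketch to check this, assuming a deliberately asymmetric placeholder statistic (hypothetical, chosen only so that rows and columns are distinguishable):

```python
import itertools
import numpy as np

def compute_statistic(x, y):
    # Hypothetical asymmetric placeholder: difference of the sums.
    return float(x.sum() - y.sum())

data = {'a': np.array([1, 2, 3]),
        'b': np.array([4, 5, 6]),
        'c': np.array([7, 8, 9])}
labels = ['a', 'b', 'c']

# Flat list in the order ('a','a'), ('a','b'), ('a','c'), ('b','a'), ...
flat = [compute_statistic(data[r], data[c])
        for r, c in itertools.product(labels, labels)]

n = len(labels)
# Row index comes from the first element of each pair, column from the second.
my_matrix = np.array(flat).reshape(n, n)
```

With this statistic, my_matrix[0, 1] is the ('a', 'b') entry and my_matrix[1, 0] is the ('b', 'a') entry, so they have opposite signs.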

If you want to index the matrix with the labels you could do

myMatrix[labels.index('a'), labels.index('b')]

This returns the value for the ('a', 'b') pair. If you plan to index the matrix this way, it is better to store the indexes in a dictionary, since labels.index() scans the list on every lookup.

labelsIndex = {'a' : 0, 'b' : 1, 'c' : 2 }
myMatrix[labelsIndex['a'], labelsIndex['b']]
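
Rather than typing the dictionary by hand, it can be built from the labels list with enumerate. A short sketch, where the matrix contents are placeholder values used only for illustration:

```python
import numpy as np

labels = ['a', 'b', 'c']

# Map each label to its position in the list.
labels_index = {label: i for i, label in enumerate(labels)}

# Placeholder 3-by-3 matrix holding the values 0..8, for illustration only.
my_matrix = np.arange(9).reshape(3, 3)

# Look up the entry for the ('a', 'b') pair by label.
value = my_matrix[labels_index['a'], labels_index['b']]
```

This keeps the dictionary in sync with labels if the list changes.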

Hope this helps.

tdedecko