I have an array of points in numpy:

points = rand(dim, n_points)

And I want to:

  1. Calculate the L2 norm (Euclidean distance) between a given point and every other point.
  2. Calculate all pairwise distances.

Preferably all in NumPy, with no for loops. How can one do it?

+1  A: 

This might help with the second part:

import numpy as np
p = np.random.rand(3, 4)  # column-wise: each column is one point of length 3
# the two views broadcast to shape (3, 4, 4); summing over axis 0 collapses the coordinates
np.sqrt(np.sum((p[:, np.newaxis, :] - p[:, :, np.newaxis])**2, axis=0))

which gives

array([[ 0.        ,  0.37355868,  0.64896708,  1.14974483],
   [ 0.37355868,  0.        ,  0.6277216 ,  1.19625254],
   [ 0.64896708,  0.6277216 ,  0.        ,  0.77465192],
   [ 1.14974483,  1.19625254,  0.77465192,  0.        ]])

if p is

array([[ 0.46193242,  0.11934744,  0.3836483 ,  0.84897951],
   [ 0.19102709,  0.33050367,  0.36382587,  0.96880535],
   [ 0.84963349,  0.79740414,  0.22901247,  0.09652746]])

and you can check one of the entries via

np.sqrt(np.sum((p[:, 0] - p[:, 2])**2))
0.64896708223796884

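The trick is to insert np.newaxis so that the two views of p have shapes (3, 1, 4) and (3, 4, 1); broadcasting then forms the difference between every pair of columns.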
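
For the first part of the question, the same broadcasting idea might be sketched as follows. This is only an illustration: the seeded generator and the index i are assumptions chosen for the example, not something from the question.

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded only so the example is repeatable
p = rng.random((3, 4))           # same layout as above: each column is a point

i = 0                            # hypothetical index of the reference point
# p[:, [i]] keeps shape (3, 1), so it broadcasts against p's shape (3, 4)
dists = np.sqrt(np.sum((p - p[:, [i]])**2, axis=0))

# np.linalg.norm with axis=0 should give the same vector of distances
dists_norm = np.linalg.norm(p - p[:, [i]], axis=0)
```

Here dists has one entry per point, and the entry at position i is zero because it is the distance from the point to itself.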

Good luck!

reckoner
+2  A: 

If you're willing to use SciPy, the scipy.spatial.distance module (the functions cdist and/or pdist) does exactly what you want, with all the looping done in C. You can do it with broadcasting too, but there's some extra memory overhead, since the broadcast difference is materialized as a (dim, n_points, n_points) intermediate array.
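
A minimal sketch of how that could look, assuming the (dim, n_points) layout from the question. Both functions expect one point per row, so the array is transposed first; the seeded generator and the variable names are just for illustration.

```python
import numpy as np
from scipy.spatial.distance import cdist, pdist, squareform

rng = np.random.default_rng(0)   # seeded only so the example is repeatable
points = rng.random((3, 4))      # (dim, n_points), as in the question

X = points.T                     # SciPy expects one observation per row

# Part 2: pdist returns the condensed upper triangle; squareform expands it
pairwise = squareform(pdist(X))  # shape (n_points, n_points)

# Part 1: distances from point 0 to every point (Euclidean is the default metric)
to_first = cdist(X[:1], X)[0]    # shape (n_points,)
```

If only the unique pairs are needed, the condensed output of pdist can be used directly, which roughly halves the memory compared with the full square matrix.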

dwf