views:

191

answers:

4

I have a 2D numpy array of shape (N,2) which is holding N points (x and y coordinates). For example:

array([[3, 2],
       [6, 2],
       [3, 6],
       [3, 4],
       [5, 3]])

I'd like to sort it such that my points are ordered by x-coordinate, and then by y in cases where the x coordinate is the same. So the array above should look like this:

array([[3, 2],
       [3, 4],
       [3, 6],
       [5, 3],
       [6, 2]])

If this was a normal Python list, I would simply define a comparator to do what I want, but as far as I can tell, numpy's sort function doesn't accept user-defined comparators. Any ideas?


EDIT: Thanks for the ideas! I set up a quick test case with 1000000 random integer points, and benchmarked the ones that I could run (sorry, can't upgrade numpy at the moment).

Mine:   4.078 secs 
mtrw:   7.046 secs
unutbu: 0.453 secs
+1  A: 

EDIT: removed bad answer.

Here's one way to do it using an intermediate structured array:

from numpy import array

a = array([[3, 2], [6, 2], [3, 6], [3, 4], [5, 3]])

b = a.flatten()
b.dtype = [('x', '<i4'), ('y', '<i4')]
b.sort()
b.dtype = '<i4'
b.shape = a.shape

print b

which gives the desired output:

[[3 2]
 [3 4]
 [3 6]
 [5 3]
 [6 2]]

Not sure if this is quite the best way to go about it though.

ars
That doesn't quite work, because it loses the association between x and y for my points.
perimosocordiae
Oh, you're right; sorry. Updated my answer.
ars
Hm. When I run that, I get an error on the `b.shape = a.shape` line: "ValueError: total size of new array must be unchanged". I'm running Python 2.6.2, with numpy 1.2.1.
perimosocordiae
I'm running Python 2.5.4 with numpy 1.3.0. Try upgrading the version of numpy.
ars
A: 

I found one way to do it:

from numpy import array
a = array([(3,2),(6,2),(3,6),(3,4),(5,3)])
array(sorted(sorted(a,key=lambda e:e[1]),key=lambda e:e[0]))

It's pretty terrible to have to sort twice (and use the plain python sorted function instead of a faster numpy sort), but it does fit nicely on one line.

perimosocordiae
+5  A: 

Using lexsort:

ind=np.lexsort((a[:,1],a[:,0]))

a[ind]
# array([[3, 2],
#       [3, 4],
#       [3, 6],
#       [5, 3],
#       [6, 2]])
unutbu
Ah, I'd seen lexsort in the docs, but I couldn't figure out how it would apply to this problem. Thanks!
perimosocordiae
Yes, I often have difficulty understanding documentation. Examples tend to be far more illuminating. The trouble is, after playing with examples, I reread the docs and find out the docs were perfectly clear... :-)
unutbu
+1  A: 

You can use np.complex_sort. This has the side effect of changing your data to floating point, I hope that's not a problem:

>>> a = np.array([[3, 2], [6, 2], [3, 6], [3, 4], [5, 3]])
>>> atmp = np.sort_complex(a[:,0] + a[:,1]*1j)
>>> b = np.array([[np.real(x), np.imag(x)] for x in atmp])
>>> b
array([[ 3.,  2.],
       [ 3.,  4.],
       [ 3.,  6.],
       [ 5.,  3.],
       [ 6.,  2.]])
mtrw
I think you win the cleverness award; I wouldn't have thought of making the y-coordinates imaginary!
perimosocordiae
But dog slow! Sorry, I didn't really consider performance when I posted this.
mtrw