Given an array 'a', I would like to sort it by columns with "sort(a, axis=0)", do some work on the sorted array, and then undo the sort. By that I don't mean re-sorting, but reversing how each element was moved. I assume argsort() is what I need, but it is not clear to me how to sort an array with the results of argsort() or, more importantly, how to apply the inverse of argsort().

Here is a little more detail.

I have an array a with shape(a) = (r, c), and I need to sort each column:

aargsort = a.argsort(axis=0)  # May use this later
aSort = sort(a, axis=0)  # a.sort(axis=0) would sort in place and return None

Now average each row:

aSortRM = aSort.mean(axis=1)

Now replace every column entry in a row with that row's mean. Is there a better way than this?

aWithMeans = ones_like(a, dtype=float)  # float so the means are not truncated
for ind in range(r):  # r = number of rows
    aWithMeans[ind] *= aSortRM[ind]

Now I need to undo the sort I did in the first step. How?

A: 

I'm not sure how best to do it in numpy, but, in pure Python, the reasoning would be:

aargsort is holding a permutation of range(len(a)) telling you where the items of aSort came from -- much like, in pure Python:

>>> x = list('ciaobelu')
>>> r = sorted(range(len(x)), key=x.__getitem__)
>>> r
[2, 4, 0, 5, 1, 6, 3, 7]
>>> 

i.e., the first item of sorted(x) will be x[2], the second one x[4], and so forth.

So given the sorted version, you can reconstruct the original by "putting items back where they came from":

>>> s = sorted(x)
>>> s
['a', 'b', 'c', 'e', 'i', 'l', 'o', 'u']
>>> original = [None] * len(s)
>>> for i, c in zip(r, s): original[i] = c
... 
>>> original
['c', 'i', 'a', 'o', 'b', 'e', 'l', 'u']
>>> 

Of course there are going to be tighter and faster ways to express this in numpy (which unfortunately I don't know inside-out as much as I know Python itself;-), but I hope this helps by showing the underlying logic of the "putting things back in place" operation you need to perform.
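
For reference, the same "put items back where they came from" step can be sketched in NumPy with index assignment. This is only a minimal sketch and not part of the answer above: the names mirror the pure-Python session, and kind="stable" is an added choice that does not matter here because all letters are distinct.

```python
import numpy as np

x = np.array(list("ciaobelu"))

# r[k] is the position in x that the k-th sorted item came from
r = np.argsort(x, kind="stable")
s = x[r]  # sorted copy of x

# Put every sorted item back at the position it came from
original = np.empty_like(s)
original[r] = s
```

The assignment original[r] = s plays the role of the zip loop: item k of s is written to position r[k].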

Alex Martelli
A: 

I was not able to follow your example, but the more abstract problem--i.e., how to sort an array then reverse the sort--is straightforward.

import numpy as NP
# create a 10x6 array to work with
A = NP.random.randint(10, 99, 60).reshape(10, 6)
# for example, sort this array on the second-to-last column,
# breaking ties using the second column (lexsort treats the last
# key as the primary sort key, so keys are listed in "reverse" order)
keys = (A[:,1], A[:,4])
ndx = NP.lexsort(keys, axis=0)
A_sorted = NP.take(A, ndx, axis=0)

To "reconstruct" A from A_sorted is straightforward, because you used an index array ('ndx') to sort the array in the first place.

# ndx array for example above:  array([6, 9, 8, 0, 1, 2, 4, 7, 3, 5])

In other words, the 4th row in A_sorted was the 1st row in the original array, A, etc.
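
A minimal sketch of that reconstruction, assuming the same kind of setup as above. The seeded random generator and the name A_restored are illustrative additions, not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for repeatability
A = rng.integers(10, 99, size=(10, 6))

# Sort rows on column 4, breaking ties with column 1
ndx = np.lexsort((A[:, 1], A[:, 4]))
A_sorted = np.take(A, ndx, axis=0)

# Row k of A_sorted came from row ndx[k] of A, so write it back there
A_restored = np.empty_like(A_sorted)
A_restored[ndx] = A_sorted
```

An equivalent way to write the last step would be A_sorted[np.argsort(ndx)], since the argsort of a permutation is its inverse.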

doug
I actually want to sort each column individually. I corrected my code at the top, but I need to work with np.sort(A, axis=0), so the index would be np.argsort(A, axis=0).
Vincent
A: 

There are probably better solutions to the problem you are actually trying to solve than this (performing an argsort usually precludes the need to actually sort), but here you go:

>>> import numpy as np
>>> a = np.random.randint(0,10,10)
>>> aa = np.argsort(a)
>>> aaa = np.argsort(aa)
>>> a # original
array([6, 4, 4, 6, 2, 5, 4, 0, 7, 4])
>>> a[aa] # sorted
array([0, 2, 4, 4, 4, 4, 5, 6, 6, 7])
>>> a[aa][aaa] # original order restored
array([6, 4, 4, 6, 2, 5, 4, 0, 7, 4])
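
The same double-argsort idea might be extended to the column-wise case from the question roughly as follows. This is a sketch under assumptions: it relies on np.take_along_axis (available in NumPy 1.15 and later), and the variable names and the small random test array are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily
a = rng.integers(0, 10, size=(5, 3)).astype(float)

order = np.argsort(a, axis=0)                    # per-column sort order
a_sorted = np.take_along_axis(a, order, axis=0)  # same values as np.sort(a, axis=0)

# Replace every entry of a sorted row with that row's mean (no Python loop)
row_means = a_sorted.mean(axis=1, keepdims=True)
with_means = np.broadcast_to(row_means, a.shape)

# The argsort of the sort order is its inverse permutation, column by column
inverse = np.argsort(order, axis=0)
restored = np.take_along_axis(a_sorted, inverse, axis=0)          # equals a
unsorted_means = np.take_along_axis(with_means, inverse, axis=0)  # means in a's original layout
```

Plain fancy indexing such as a[order] would not do the right thing in 2-D, because it selects whole rows; take_along_axis applies each column's indices to that column only.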
bpowah