I've read this post and it hasn't ended up working for me.

Edit: the functionality I'm describing is just like the sorting function in Excel, if that makes it any clearer.

Here's my situation: I have a tab-delimited text document with about 125,000 lines and 6 columns per line (columns are separated by a tab character). I've split the document into a two-dimensional list.

I am trying to write a generic function to sort two-dimensional lists. Basically, I would like a function to which I can pass the big list and the indexes of one or more columns to sort it by. The first column passed should be the primary sort key, the second should break ties in the first, and so on.

Still confuzzled?

Here's an example of what I would like to do.

Joel    18 Orange 1
Anna    17 Blue 2
Ryan    18 Green 3
Luke    16 Blue 1
Katy    13 Pink 5
Tyler   22 Blue 6
Bob 22 Blue 10
Garrett 24 Red 7
Ryan    18 Green 8
Leland  18 Yellow 9

Say I passed this list to my magical function, like so:

sortByColumn(bigList, 0)

Anna    17 Blue 2
Bob 22 Blue 10
Garrett 24 Red 7
Joel    18 Orange 1
Katy    13 Pink 5
Leland  18 Yellow 9
Luke    16 Blue 1
Ryan    18 Green 3
Ryan    18 Green 8
Tyler   22 Blue 6

and...

sortByColumn(bigList, 2, 3)

Luke    16 Blue 1
Anna    17 Blue 2
Tyler   22 Blue 6
Bob 22 Blue 10
Ryan    18 Green 3
Ryan    18 Green 8
Joel    18 Orange 1
Katy    13 Pink 5
Garrett 24 Red 7
Leland  18 Yellow 9

Any clues?

+2  A: 

This will sort by columns 2 and 3:

import operator
a.sort(key=operator.itemgetter(2,3))
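
As a minimal sketch of what this does, here is the same call applied to the sample rows from the question. The list name `rows` is only for illustration, and the numeric columns are assumed to have been converted to ints already. itemgetter(2, 3) returns a (column 2, column 3) tuple for each row, so rows are ordered by column 2 first and ties are broken by column 3.

```python
import operator

# Sample rows from the question; `rows` is a made-up name for illustration.
# The numeric columns are assumed to be ints already.
rows = [
    ["Joel", 18, "Orange", 1],
    ["Anna", 17, "Blue", 2],
    ["Ryan", 18, "Green", 3],
    ["Luke", 16, "Blue", 1],
    ["Katy", 13, "Pink", 5],
    ["Tyler", 22, "Blue", 6],
    ["Bob", 22, "Blue", 10],
    ["Garrett", 24, "Red", 7],
    ["Ryan", 18, "Green", 8],
    ["Leland", 18, "Yellow", 9],
]

# itemgetter with several indexes returns a tuple of those items.
key = operator.itemgetter(2, 3)
print(key(rows[0]))  # ('Orange', 1)

# Sort in place by column 2, then by column 3.
rows.sort(key=key)
for row in rows:
    print(row)
```

The printed order should match the second expected listing in the question, starting with Luke, Anna, Tyler and Bob (the four Blue rows, ordered 1, 2, 6, 10).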
interjay
+1  A: 

Make sure you have converted the numbers to ints; otherwise they will sort alphabetically rather than numerically.
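
A small sketch of the difference, using a few values from the last column of the question's sample data (the variable names are only illustrative):

```python
# Values as read from a text file: still strings.
as_text = ["1", "2", "3", "10", "7"]
# The same values after conversion to ints.
as_ints = [int(s) for s in as_text]

# Strings compare character by character, so "10" lands before "2".
print(sorted(as_text))  # ['1', '10', '2', '3', '7']
# Ints compare by value.
print(sorted(as_ints))  # [1, 2, 3, 7, 10]
```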

# Sort the list in place
def sortByColumn(A,*args):
    import operator
    A.sort(key=operator.itemgetter(*args))
    return A

or

# Leave the original list alone and return a new sorted one
def sortByColumn(A,*args):
    import operator
    return sorted(A,key=operator.itemgetter(*args))
gnibbler
+5  A: 
import operator
def sortByColumn(bigList, *args):
    bigList.sort(key=operator.itemgetter(*args)) # sorts the list in place
Tendayi Mawushe
That's fabulous. I had never heard of itemgetter (or attrgetter, which I also now see).
Matt Anderson
That is Guido's time machine for you. http://catb.org/jargon/html/G/Guido.html
Tendayi Mawushe
This is exactly what I'm looking for. Thanks a lot!
Joel Verhagen
A: 

The key idea here (pun intended) is to use a key function that returns a tuple. Below, the key function is lambda x: tuple(x[idx] for idx in args). Here x is an element of aList, that is, a row of data. The key function returns a tuple of values, not just one value, so sort() orders the rows by the first element of the tuple, then breaks ties with the second, and so on. The expression has to be wrapped in tuple(); on its own it would be a generator, which sort() cannot compare by value. See http://wiki.python.org/moin/HowTo/Sorting#Sortingbykeys

#!/usr/bin/env python
import csv
def sortByColumn(aList,*args):
    aList.sort(key=lambda x: tuple(x[idx] for idx in args))
    return aList

filename='file.txt'
def convert_ints(astr):
    try:
        return int(astr)
    except ValueError:
        return astr    
biglist=[[convert_ints(elt) for elt in line]
         for line in csv.reader(open(filename,'r'),delimiter='\t')]

for row in sortByColumn(biglist,0):
    print(row)

for row in sortByColumn(biglist,2,3):
    print(row)
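
The tie-breaking described above is ordinary tuple comparison: Python compares tuples element by element and stops at the first difference. A minimal sketch with made-up rows (the helper `sort_key` and the list `rows` are hypothetical names, not part of the original answer):

```python
# Tuples compare element by element, left to right.
print((18, "Green") < (18, "Orange"))  # True: first elements tie, "Green" < "Orange"
print((17, "Pink") < (18, "Blue"))     # True: decided by the first element alone

def sort_key(row, *columns):
    # Hypothetical helper: build the tuple key for one row.
    return tuple(row[i] for i in columns)

rows = [["Ryan", 18, "Green", 8], ["Anna", 17, "Blue", 2], ["Ryan", 18, "Green", 3]]
# Sort by name (column 0), breaking ties on the last column (column 3).
rows.sort(key=lambda row: sort_key(row, 0, 3))
print(rows)
```

The two "Ryan" rows tie on column 0, so their order is decided by column 3.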
unutbu
The numbers have to be converted to ints
gnibbler
Good point, gnibbler. Fixed.
unutbu