views:

4323

answers:

6

I've got a list of Python objects that I'd like to sort by an attribute of the objects themselves. The list looks like:

>>> ut
[<Tag: 128>, <Tag: 2008>, <Tag: <>, <Tag: actionscript>, <Tag: addresses>, <Tag: aes>, <Tag: ajax> ...]

Each object has a count:

>>> ut[1].count
1L

I need to sort the list by number of counts descending.

I've seen several methods for this, but I'm looking for best practice in Python.

+35  A: 
Triptych
Perfect. Thanks!
Nick Sergeant
No problem. btw, if muhuk is right and it's a list of Django objects, you should consider his solution. However, for the general case of sorting objects, my solution is probably best practice.
Triptych
On large lists you will get better performance using operator.attrgetter('count') as your key. This is just an optimized (lower level) form of the lambda function in this answer.
David Eyk
+1  A: 

Add rich comparison operators to the object class, then use sort() method of the list.
See rich comparison in python.


Update: Although this method would work, I think solution from Triptych is better suited to your case because way simpler.

Roberto Liffredo
+1. Not the best solution in this case, but good to know this way is possible as well.
Triptych
+4  A: 

It looks much like a list of Django ORM model instances.

Why not sort them on query like this:

ut = Tag.objects.order_by('-count')
muhuk
It is, but using django-tagging, so I was using a built-in for grabbing a Tag set by usage for a particular query set, like so:Tag.objects.usage_for_queryset(QuerySet, counts=True)
Nick Sergeant
+9  A: 

A way that can be fastest, especially if your list has a lot of records, is to use operator.attrgetter("count"). However, this might run on an pre-operator version of Python, so it would be nice to have a fallback mechanism. You might want to do the following, then:

try: import operator
except ImportError: cmpfun= lambda x: x.count # use a lambda if no operator module
else: cmpfun= operator.attrgetter("count") # use operator since it's faster than lambda

ut.sort(key=cmpfun, reverse=True) # sort in-place
ΤΖΩΤΖΙΟΥ
Here I would use the variable name "keyfun" instead of "cmpfun" to avoid confusion. The sort() method does accept a comparison function through the cmp= argument as well.
akaihola
+1  A: 
from operator import attrgetter
ut.sort(key = attrgetter('count'), reverse = True)
+2  A: 

Readers should notice that the key= method:

ut.sort(key=lambda x: x.count, reverse=True)

is many times faster than adding rich comparison operators to the objects. I was surprised to read this (page 485 of "Python in a Nutshell"). You can confirm this by running tests on this little program:

#!/usr/bin/env python
import random

class C:
    def __init__(self,count):
        self.count = count

    def __cmp__(self,other):
        return cmp(self.count,other.count)

longList = [C(random.random()) for i in xrange(1000000)] #about 6.1 secs

longList.sort() #about 52 - 6.1 = 46 secs
longList.sort(key = lambda c: c.count) #about 9 - 6.1 = 3 secs

My, very minimal, tests show the first sort is more than 10 times slower, but the book says it is only about 5 times slower in general. The reason they say is due to the highly optimizes sort algorithm used in python (timsort).

Still, its very odd that .sort(lambda) is faster than plain old .sort(). I hope they fix that.

Jose M Vidal