views:

230

answers:

3

For the sake of simplicity, let's say I have a Person class in Python. This class has fields for firstname, lastname, and dob.

class Person:
  def __init__(self, firstname, lastname, dob):
    self.firstname = firstname;
    self.lastname = lastname;
    self.dob = dob;

In some situations I want to sort lists of Persons by lastname followed by firstname followed by dob. In other situations I want to sort first by dob, then by lastname and finally by firstname. And sometimes I just want to sort by firstname.

The naive solution for creating the first comparison function would be something like this:

def comparepeople(person1, person2):
  if cmp(person1.lastname, person2.lastname) == 0:
    if cmp(person1.firstname, person2.firstname) == 0:
      return cmp(person1.dob, person2.dob);
    return cmp(person1.firstname, person2.firstname);
  return cmp(person1.lastname, person2.lastname);

It would seem like there should be an easy way to define comparison functions like these using a meta-programming approach where all I would need to do is provide the field names in order of precedence instead of writing these very verbose, ugly comparison methods. But I've only started playing with Python recently and haven't found anything like what I'm describing.

So the question is, what is the most Pythonic way to write a comparison function for a class with multiple comparable constituent members?

+4  A: 

Here's one way (maybe not the fastest):

def compare_people_flexibly(p1, p2, attrs):
    """Compare `p1` and `p2` based on the attributes in `attrs`."""
    v1 = [getattr(p1, a) for a in attrs]
    v2 = [getattr(p2, a) for a in attrs]
    return cmp(v1, v2)

def compare_people_firstname(p1, p2):
    return compare_people_flexibly(p1, p2, ['firstname', 'lastname', 'dob'])

def compare_people_lastname(p1, p2):
    return compare_people_flexibly(p1, p2, ['lastname', 'firstname', 'dob'])

This works because getattr can be used to get attributes named by a string, and because Python compares lists as you'd expect, based on the comparison of the first non-equal items.

Another way:

def compare_people_flexibly(p1, p2, attrs):
    """Compare `p1` and `p2` based on the attributes in `attrs`."""
    for a in attrs:
        c = cmp(getattr(p1, a), getattr(p2, a))
        if c:
            return c
    return 0

This has the advantage that it doesn't build two complete lists of attributes, so may be faster if the attribute lists are long, or if many comparisons complete on the first attribute.

Lastly, as Martin mentions, you may need a key function rather than a comparison function:

def flexible_person_key(attrs):
    def key(p):
        return [getattr(p, a) for a in attrs]
    return key

l.sort(key=flexible_person_key('firstname', 'lastname', 'dob'))
Ned Batchelder
flexible_person_key = operator.attrgetter ;)
Roberto Bonvallet
Yep, since Python 2.5 attrgetter can be called with multiple arguments to get a tuple with multiple attributes.
Alex Martelli
I never knew that! Thanks for the lesson.
Ned Batchelder
+10  A: 

If you really want a comparison function, you can use

def comparepeople(p1, p2):
    o1 = p1.lastname, p1.firstname, p1.dob
    o2 = p2.lastname, p2.firstname, p2.dob
    return cmp(o1,o2)

This relies on tuple comparison. If you want to sort a list, you shouldn't write a comparison function, though, but a key function:

l.sort(key=lambda p:(p.lastname, p.firstname, p.dob))

This has the advantage that it is a) shorter and b) faster, because each key gets computed only once (rather than tons of tuples being created in the comparison function during sorting).

Martin v. Löwis
As Roberto pointed out commenting on Ned's answer, `l.sort(operator.attrgetter('lastname', 'firstname', 'dob'))` would be even (marginally) faster (though not shorter;-).
Alex Martelli
A: 

Can you not use the comparison methods on the class, see __cmp__ and the other rich comparison methods...

Mark Streatfield
The point is, I need the comparison function to be different in different situations and even if I did implement __cmp__ on Person, I'd still have to use something like Ned describes for the implementation.
Mike Deck