views:

39

answers:

3

I wish to hold a heap of objects, not just numbers. They will have an integer attribute in them that the heap can sort by. The easiest way to use heaps in python is heapq, but how do I tell it to sort by a specific attribute when using heapq?

A: 

Unfortunately, you can't, although this is an often requested feature.

One option would be to insert (key, value) tuples into the heap. However, that won't work if the values throw an exception when compared (they will be compared in the case of a tie between keys).

A second option would be to define a __lt__ (less-than) method in the class that will use the appropriate attribute to compare the elements for sorting. However, that might not be possible if the objects were created by another package or if you need them to compare differently elsewhere in the program.

A third option would be to use the sortedlist class from the blist module (disclaimer: I'm the author). The constructor for sortedlist takes a key parameter that lets you specify a function to return the sort key of an element, similar to the key parameter of list.sorted and sorted.

Daniel Stutzbach
I removed my previous comment since my issue with `blist` was probably a PEBCAK (again thanks for your module), so I only duplicate the first part of the previous comment: It's always possible to define a class with an `__lt__` either through subclassing or through encapsulation.
ΤΖΩΤΖΙΟΥ
+1  A: 

heapq sorts objects the same way list.sort does, so just define a method __cmp__() within your class definition, which will compare itself to another instance of the same class:

def __cmp__(self, other):
    return cmp(self.intAttribute, other.intAttribute)

Works in Python 2.x.

In 3.x use:

def __lt__(self, other):
    return self.intAttribute < other.intAttribute
eumiro
`__cmp__` is gone in 3.x. Use `__lt__` instead.
Daniel Stutzbach
@Daniel, thanks
eumiro
`__lt__` works in Python 2 also, so it's better to just avoid `__cmp__` altogether.
Daniel Stutzbach
Just as you can tell any sort to sort based on a criteria other than the object's natural sorting (eg. `cmp` and `key` for `sort`), you should be able to tell `heapq` to sort based on a different key. In other words, you shouldn't have to *redefine the object itself* to change a particular data structure holding it; you should be able to just tell the data structure itself. This is a notable fundamental piece missing from the `heapq` API.
Glenn Maynard
A: 

According to 8.4.1 of http://docs.python.org/library/heapq.html, you can use tuples, and it will sort by the first element of the tuple:

>>> h = []
>>> heappush(h, (5, 'write code'))
>>> heappush(h, (7, 'release product'))
>>> heappush(h, (1, 'write spec'))
>>> heappush(h, (3, 'create tests'))
>>> heappop(h)
(1, 'write spec')

So if you don't want to (or can't?) do a __cmp__ method, you can manually extract your sorting key at push time.

Jander