views:

1014

answers:

3

Still learning python (finally!) and can't quite wrap my head around this one yet. What I want to do is sort a dictionary of lists by value using the third item in the list. It's easy enough sorting a dictionary by value when the value is just a single number or string, but this list thing has me baffled.

Example:

myDict = { 'item1' : [ 7, 1, 9], 'item2' : [8, 2, 3], 'item3' : [ 9, 3, 11 ] }

I want to be able to iterate through the dictionary in order of the third value in each list, in this case 9, 3, 11.

Thanks much for any help!

+14  A: 

Here is one way to do this:

>>> sorted(myDict.items(), key=lambda e: e[1][2])
[('item2', [8, 2, 3]), ('item1', [7, 1, 9]), ('item3', [9, 3, 11])]

The key argument of the sorted function lets you derive a sorting key for each element of the list.

To iterate over the keys/values in this list, you can use something like:

>>> for key, value in sorted(myDict.items(), key=lambda e: e[1][2]):
...   print key, value
... 
item2 [8, 2, 3]
item1 [7, 1, 9]
item3 [9, 3, 11]
Ayman Hourieh
As soon as I asked the question I had an epiphany and basically came up with the same thing except for the lambda (haven't learned about them yet). Just wrote my own cmp function that takes in tupples from dict.items() and returns the result. Same thing, just a different way to write it. Thanks much for the quick reply!
jay
Great solution. I love the simplicity of `sorted()`.
Evan Fosmark
I think it is a little clearer this way: sorted(myDict.items(), key=lambda (k, v): v[2])
Roberto Bonvallet
@jay, key= is much better than cmp= performance-wise -- AND, SO etiquette suggests you should ACCEPT this answer rather than just expressing thanks for it verbally!!!
Alex Martelli
"except for the lambda (haven't learned about them yet" Good point. Avoid lambdas where possible. This can be done with an ordinary function def, which is usually much more clear than a lambda.
S.Lott
@jay: cmp= is no longer supported in Python 3
newacct
+1  A: 

You stated two quite different wants: "What I want to do is sort a dictionary of lists ..." and "I want to be able to iterate through the dictionary in order of ..."

The first of those is by definition impossible -- to sort something implies a rearrangement in some order. Python dictionaries are inherently unordered. The second would be vaguely possible but extremely unlikely to be implemented.

What you can do is (1) take a copy of the dictionary contents (which will be quite unordered) (2) sort that (3) iterate over the sorted results -- and you already have two solutions for that. By the way, the solution that uses "key" instead of "cmp" is better; see

http://docs.python.org/library/functions.html#sorted

"the third item in the list" smells like "the third item in a tuple" to me, and "e[1][2]" just smells :-) ... you may like to investigate using named tuples instead of lists; see

http://docs.python.org/library/collections.html#namedtuple-factory-function-for-tuples-with-named-fields

If you are going to be doing extract/sort/process often on large data sets, you might like to consider something like this, using the Python-supplied sqlite3 module:

create table ex_dict (k text primary key, v0 int, v1 int, v2 int);
insert into ex_dict values('item1', 7, 1, 9);
-- etc etc 
select * from ex_dict order by v2;
John Machin
A: 

As John Machlin said you can't actually sort a Python dictionary.

However, you can create an index of the keys which can be sorted in any order you like.

The preferred Python pattern (idiom) for sorting by any alternative criterium is called "decorate-sort-undecorate" (DSU). In this idiom you create a temporary list which contains tuples of your key(s) followed by your original data elements, then call the normal .sort() method on that list (or, in more recent versions of Python simply wrap your decoration in a called to the sorted() built-in function). Then you remove the "decorations."

The reason this is generally preferred over passing comparison function to the .sort() method is that Python's built-in default sorting code (compiled C in the normal C Python) is very fast and efficient in the default case, but much, much slower when it has to call Python object code many, many times in the non-default case. So it's usually far better to iterate over the data creating data structures which can be passed to the default sort routines.

In this case you should be able to use something like:

[y[1] for y in sorted([(myDict[x][2], x) for x in myDict.keys()])]

... that's a list comprehension doing the undecorate from the sorted list of tuples which is being returned by the inner list comprehension. The inner comprehension is creating a set of tuples, your desired sorting key (the 3rd element of the list) and the dictionary's key corresponding to the sorting key. myDict.keys() is, of course, a method of Python dictionaries which returns a list of all valid keys in whatever order the underlying implementation chooses --- presumably a simple iteration over the hashes.

A more verbose way of doing this might be easier to read:

temp = list()
for k, v in myDict.items():
    temp.append((v[2],))
temp.sort()
results = list()
for i in temp:
    results.append(i[1])

Usually you should built up such code iteratively, in the interpreter using small data samples. Build the "decorate" expression or function. Then wrap that in a call to sorted(). Then build the undecorate expression (which is usually as simple as what I've shown here).

Jim Dennis
(1) You compare decorate-sort-undecorate with using the cmp arg; introduction of the key arg chopped off a very large slice of DSU's territory. (2) Your solution leaves the OP with a list of the dict keys ... to get what he wants, he'll have to do yet another loop of the dict items (3) your verbose way has a typo: s/v[2],/v[2], k/
John Machin