tags:

views:

7656

answers:

7

I have a dictionary of 200,000 items (the keys are strings and the values are integers).

What is the best/most pythonic way to print the items sorted by descending value then ascending key (i.e. a 2 key sort)?

 a={ 'keyC':1, 'keyB':2, 'keyA':1 }
b = a.items()
b.sort( key=lambda a:a[0])
b.sort( key=lambda a:a[1], reverse=True )
print b
>>>[('keyB', 2), ('keyA', 1), ('keyC', 1)]
+6  A: 

You can't sort dictionaries. You have to sort the list of items.

Previous versions were wrong. When you have a numeric value, it's easy to sort in reverse order. These will do that. But this isn't general. This only works because the value is numeric.

a = { 'key':1, 'another':2, 'key2':1 }

b= a.items()
b.sort( key=lambda a:(-a[1],a[0]) )
print b

Here's an alternative, using an explicit function instead of a lambda and the cmp instead of the key option.

def valueKeyCmp( a, b ):
    return cmp( (-a[1], a[0]), (-b[1], b[0] ) )

b.sort( cmp= valueKeyCmp )
print b

The more general solution is actually two separate sorts

b.sort( key=lambda a:a[1], reverse=True )
b.sort( key=lambda a:a[0] )
print b
S.Lott
Thank you but this doesn't address the sort order or the 2 key nature of the sort.
Yep. Figured that out. Sorry. Reposted with corrections.
S.Lott
The third example doesn't do a 2 key sort, the second sort undoes the first one
Ricardo Reyes
@Ricardo Reyes: the python sort is a stable sort. When sorting first on lesser keys, the final result is achieved.
ΤΖΩΤΖΙΟΥ
@Ricardo Reyes: http://docs.python.org/lib/typesseq-mutable.html
S.Lott
@S.Lott Thank you. One point.. b.sort( key=lambda a:(-a[1],a[0]) ) gives me "TypeError: bad operand type for unary -: 'str'" whereas b.sort( key=lambda a:(a[1]*-1,a[0]) ) works fine. I don't understand why.
@monty: your values aren't numbers, they're strings. Check the repr() of your dictionary to see if you have { 'key':'1' } or not. You might be happier lambda a:(-int(a[1]), a[0]) which forces a conversion to int.
S.Lott
+1  A: 

The most pythonic way to do it would be to know a little more about the actual data -- specifically, the maximum value you can have -- and then do it like this:

def sortkey((k, v)): 
    return (maxval - v, k)

items = thedict.items()
items.sort(key=sortkey)

but unless you already know the maximum value, searching for the maximum value means looping through the dict an extra time (with max(thedict.itervalues())), which may be expensive. Alternatively, a keyfunc version of S.Lott's solution:

def sortkey((k, v)): 
    return (-v, k)

items = thedict.items()
items.sort(key=sortkey)

An alternative that doesn't care about the types would be a comparison function:

def sortcmp((ak, av), (bk, bv)):
    # compare values 'in reverse'  
    r = cmp(bv, av)
    if not r:
        # and then keys normally
        r = cmp(ak, bk)
    return r

items = thedict.items()
items.sort(cmp=sortcmp)

and this solution actually works for any type of key and value that you want to mix ascending and descending sorting with in the same key. If you value brevity you can write sortcmp as:

def sortcmp((ak, av), (bk, bv)):
    return cmp((bk, av), (ak, bv))
Thomas Wouters
Knowing the maximum value is useless; use any value as max value. 0 comes to mind instantly, as S.Lott suggested. ;)
ΤΖΩΤΖΙΟΥ
+1  A: 

You can use something like this:

dic = {'aaa':1, 'aab':3, 'aaf':3, 'aac':2, 'aad':2, 'aae':4}

def sort_compare(a, b):
 c = cmp(dic[b], dic[a])
 if c != 0:
  return c
 return cmp(a, b)

for k in sorted(dic.keys(), cmp=sort_compare):
 print k, dic[k]

Don't know how pythonic it is however :)

rslite
+5  A: 
data = { 'keyC':1, 'keyB':2, 'keyA':1 }

for key, value in sorted(data.items(), key=lambda x: (-1*x[1], x[0])):
    print key, value
Ricardo Reyes
This is the most pythonic solution in my opinion and the easiest to understand.
Jesse Shieh
A: 

Building on Thomas Wouters and Ricardo Reyes solutions:

def combine(*cmps):
    """Sequence comparisons."""
    def comparator(a, b):
        for cmp in cmps:
            result = cmp(a, b):
            if result:
                return result
        return 0
    return comparator

def reverse(cmp):
    """Invert a comparison."""
    def comparator(a, b):
        return cmp(b, a)
    return comparator

def compare_nth(cmp, n):
    """Compare the n'th item from two sequences."""
    def comparator(a, b):
        return cmp(a[n], b[n])
    return comparator

rev_val_key_cmp = combine(
        # compare values, decreasing
        reverse(compare_nth(1, cmp)),

        # compare keys, increasing
        compare_nth(0, cmp)
    )

data = { 'keyC':1, 'keyB':2, 'keyA':1 }

for key, value in sorted(data.items(), cmp=rev_val_key_cmp):
    print key, value
MizardX
A: 
>>> keys = sorted(a, key=lambda k: (-a[k], k))

or

>>> keys = sorted(a)
>>> keys.sort(key=a.get, reverse=True)

then

print [(key, a[key]) for key in keys]
[('keyB', 2), ('keyA', 1), ('keyC', 1)]
Coady
A: 

See my answer to a related question here.

hughdbrown