tags:

views:

1543

answers:

5

I have a list of dicts:

b = [{u'TOT_PTS_Misc': u'Utley, Alex', u'Total_Points': 96.0},
 {u'TOT_PTS_Misc': u'Russo, Brandon', u'Total_Points': 96.0},
 {u'TOT_PTS_Misc': u'Chappell, Justin', u'Total_Points': 96.0},
 {u'TOT_PTS_Misc': u'Foster, Toney', u'Total_Points': 80.0},
 {u'TOT_PTS_Misc': u'Lawson, Roman', u'Total_Points': 80.0},
 {u'TOT_PTS_Misc': u'Lempke, Sam', u'Total_Points': 80.0},
 {u'TOT_PTS_Misc': u'Gnezda, Alex', u'Total_Points': 78.0},
 {u'TOT_PTS_Misc': u'Kirks, Damien', u'Total_Points': 78.0},
 {u'TOT_PTS_Misc': u'Worden, Tom', u'Total_Points': 78.0},
 {u'TOT_PTS_Misc': u'Korecz, Mike', u'Total_Points': 78.0},
 {u'TOT_PTS_Misc': u'Swartz, Brian', u'Total_Points': 66.0},
 {u'TOT_PTS_Misc': u'Burgess, Randy', u'Total_Points': 66.0},
 {u'TOT_PTS_Misc': u'Smugala, Ryan', u'Total_Points': 66.0},
 {u'TOT_PTS_Misc': u'Harmon, Gary', u'Total_Points': 66.0},
 {u'TOT_PTS_Misc': u'Blasinsky, Scott', u'Total_Points': 60.0},
 {u'TOT_PTS_Misc': u'Carter III, Laymon', u'Total_Points': 60.0},
 {u'TOT_PTS_Misc': u'Coleman, Johnathan', u'Total_Points': 60.0},
 {u'TOT_PTS_Misc': u'Venditti, Nick', u'Total_Points': 60.0},
 {u'TOT_PTS_Misc': u'Blackwell, Devon', u'Total_Points': 60.0},
 {u'TOT_PTS_Misc': u'Kovach, Alex', u'Total_Points': 60.0},
 {u'TOT_PTS_Misc': u'Bolden, Antonio', u'Total_Points': 60.0},
 {u'TOT_PTS_Misc': u'Smith, Ryan', u'Total_Points': 60.0}]

and I need to use a multi key sort reversed by Total_Points, then not reversed by TOT_PTS_Misc.

This can be done at the command prompt like so:

a = sorted(b, key=lambda d: (-d['Total_Points'], d['TOT_PTS_Misc']))

But I have to run this through a function, where I pass in the list and the sort keys. For example, def multikeysort(dict_list, sortkeys):.

How can the lambda line be used which will sort the list, for an arbitrary number of keys that are passed in to the multikeysort function, and take into consideration that the sortkeys may have any number of keys and those that need reversed sorts will be identified with a '-' before it?

+6  A: 
def sortkeypicker(keynames):
    negate = set()
    for i, k in enumerate(keynames):
        if k[:1] == '-':
            keynames[i] = k[1:]
            negate.add(k[1:])
    def getit(adict):
       composite = [adict[k] for k in keynames]
       for i, (k, v) in enumerate(zip(keynames, composite)):
           if k in negate:
               composite[i] = -v
       return composite
    return getit

a = sorted(b, key=sortkeypicker(['-Total_Points', 'TOT_PTS_Misc']))
Alex Martelli
Wow! That is awesome. It works great. I am such a newbie that I feel I will never get to the point of knowing all this. That was fast too. Thank you very much.
simi
But, what if the keys sent to the sortkeypicker is a string, like so '-Total_Points,TOT_PTS_Misc'?
simi
Then you could split the string into an array first by calling `some_string.split(",")`
Jason Creighton
Thank you. I realized that I can do split of the string, after I already commented. DOH!
simi
A: 
from operator import itemgetter
from functools import partial

def _neg_itemgetter(key, d):
    return -d[key]

def key_getter(key_expr):
    keys = key_expr.split(",")
    getters = []
    for k in keys:
        k = k.strip()
        if k.startswith("-"):
           getters.append(partial(_neg_itemgetter, k[1:]))
        else:
           getters.append(itemgetter(k))

    def keyfunc(dct):
        return [kg(dct) for kg in getters]

    return keyfunc

def multikeysort(dict_list, sortkeys):
    return sorted(dict_list, key = key_getter(sortkeys)

Demonstration:

>>> multikeysort([{u'TOT_PTS_Misc': u'Utley, Alex', u'Total_Points': 60.0},
                 {u'TOT_PTS_Misc': u'Russo, Brandon', u'Total_Points': 96.0}, 
                 {u'TOT_PTS_Misc': u'Chappell, Justin', u'Total_Points': 96.0}],
                "-Total_Points,TOT_PTS_Misc")
[{u'Total_Points': 96.0, u'TOT_PTS_Misc': u'Chappell, Justin'}, 
 {u'Total_Points': 96.0, u'TOT_PTS_Misc': u'Russo, Brandon'}, 
 {u'Total_Points': 60.0, u'TOT_PTS_Misc': u'Utley, Alex'}]

The parsing is a bit fragile, but at least it allows for variable number of spaces between the keys.

Torsten Marek
Would I use this with sorted(dict_list, key=key_getter)? It is not giving me the correct sorted list. When I called multikeysort(dict_list, sortkeys), the dict_list is a list of dictionaries, similar to what is listed above, and the sortkeys is a string, like "-Total_Points,TOT_PTS_Misc". So the output needs to sort by the first key in reverse and then by second key.
simi
cf. the updated answer.
Torsten Marek
This works great! Thank you. I realized I could do a split of the keys after I commented.... DOH!
simi
But, when I have the second item in the string with a '-', it gives me a bad operand type for unary - error.
simi
You can't take the negative of a string.
Torsten Marek
Yes, I know, but this is how the parameters are passed in. Even if I do a split, one or the other will start with '-'. I think the sortkeys need to be split before calling key_getter, that way each item in the keys list will check the first character. Am I on the right track?
simi
+4  A: 

This answer works for any kind of column in the dictionary -- the negated column need not be a number.

def multikeysort(items, columns):
    from operator import itemgetter
    comparers = [ ((itemgetter(col[1:].strip()), -1) if col.startswith('-') else (itemgetter(col.strip()), 1)) for col in columns]  
    def comparer(left, right):
        for fn, mult in comparers:
            result = cmp(fn(left), fn(right))
            if result:
                return mult * result
        else:
            return 0
    return sorted(items, cmp=comparer)

You can call it like this:

b = [{u'TOT_PTS_Misc': u'Utley, Alex', u'Total_Points': 96.0},
 {u'TOT_PTS_Misc': u'Russo, Brandon', u'Total_Points': 96.0},
 {u'TOT_PTS_Misc': u'Chappell, Justin', u'Total_Points': 96.0},
 {u'TOT_PTS_Misc': u'Foster, Toney', u'Total_Points': 80.0},
 {u'TOT_PTS_Misc': u'Lawson, Roman', u'Total_Points': 80.0},
 {u'TOT_PTS_Misc': u'Lempke, Sam', u'Total_Points': 80.0},
 {u'TOT_PTS_Misc': u'Gnezda, Alex', u'Total_Points': 78.0},
 {u'TOT_PTS_Misc': u'Kirks, Damien', u'Total_Points': 78.0},
 {u'TOT_PTS_Misc': u'Worden, Tom', u'Total_Points': 78.0},
 {u'TOT_PTS_Misc': u'Korecz, Mike', u'Total_Points': 78.0},
 {u'TOT_PTS_Misc': u'Swartz, Brian', u'Total_Points': 66.0},
 {u'TOT_PTS_Misc': u'Burgess, Randy', u'Total_Points': 66.0},
 {u'TOT_PTS_Misc': u'Smugala, Ryan', u'Total_Points': 66.0},
 {u'TOT_PTS_Misc': u'Harmon, Gary', u'Total_Points': 66.0},
 {u'TOT_PTS_Misc': u'Blasinsky, Scott', u'Total_Points': 60.0},
 {u'TOT_PTS_Misc': u'Carter III, Laymon', u'Total_Points': 60.0},
 {u'TOT_PTS_Misc': u'Coleman, Johnathan', u'Total_Points': 60.0},
 {u'TOT_PTS_Misc': u'Venditti, Nick', u'Total_Points': 60.0},
 {u'TOT_PTS_Misc': u'Blackwell, Devon', u'Total_Points': 60.0},
 {u'TOT_PTS_Misc': u'Kovach, Alex', u'Total_Points': 60.0},
 {u'TOT_PTS_Misc': u'Bolden, Antonio', u'Total_Points': 60.0},
 {u'TOT_PTS_Misc': u'Smith, Ryan', u'Total_Points': 60.0}]

a = multikeysort(b, ['-Total_Points', 'TOT_PTS_Misc'])
for item in a:
    print item

Try it with either column negated. You will see the sort order reverse.

Next: change it so it does not use extra class....

hughdbrown
This works the best because I can use the reverse on any keys or columns. Thank you!
simi
So this does work well. I call my function with the list and string as parameters. I split the string first then call the multikeysort with the list and the list of keys from the split string. It does not matter which item in the string has the '-' at the start of the column name, because it will work with either item or all items. Awesome. Thank you.
simi
Let's say that there are items in the list of dicts (b) which do not have the keys that I would like to sort by. How would I test for them? I tried a try/except, but it's not returning anything.
simi
You have a list of dictionaries, some with all the columns required, some without? Or sometimes you want to use this on a list of dictionaries and you pass in a list of columns to sort on and the dictionaries are totally of the wrong type?
hughdbrown
In the example, I was only using the list of dictionaries that contained the required columns, or rather, the columns I want to sort by. But, the actual data, and I guess I should have made that part of the example, has dictionaries with various columns, some of which may not even part of the two columns I need to sort by. I use a filter to give me a list of dictionaries that have at least one of the columns I sort by, I just want to make sure I take into consideration all different types of scenarios. Some columns may even have None. Sorry if I am confusing everything.
simi
A: 

Since you're already comfortable with lambda, here's a less verbose solution.

>>> def itemgetter(*names):
    return lambda mapping: tuple(-mapping[name[1:]] if name.startswith('-') else mapping[name] for name in names)

>>> itemgetter('a', '-b')({'a': 1, 'b': 2})
(1, -2)
Coady
This does not work. I have:values = ['-Total_Points', 'TOT_PTS_Misc']then b as the list of dictsWhen I call g = itemgetter(values)(b) I get AttributeError: 'list' object has no attribute 'startswith'
simi
It takes a variable number of names, not a list of names. Call it like this: itemgetter(*values). Have a look at the similar builtin operator.itemgetter for another example.
Coady
A: 

I use the following for sorting a 2d array on a number of columns

def k(a,b):
    def _k(item):
        return (item[a],item[b])
    return _k

This could be extended to work on an arbitrary number of items. I tend to think finding a better access pattern to your sortable keys is better than writing a fancy comparator.

>>> data = [[0,1,2,3,4],[0,2,3,4,5],[1,0,2,3,4]]
>>> sorted(data, key=k(0,1))
[[0, 1, 2, 3, 4], [0, 2, 3, 4, 5], [1, 0, 2, 3, 4]]
>>> sorted(data, key=k(1,0))
[[1, 0, 2, 3, 4], [0, 1, 2, 3, 4], [0, 2, 3, 4, 5]]
>>> sorted(a, key=k(2,0))
[[0, 1, 2, 3, 4], [1, 0, 2, 3, 4], [0, 2, 3, 4, 5]]
mumrah