ansaurus

Question

Positional Rankings and Dealing with Ties in Python

Answer 1

+2 A:

The way to do this is not to calculate the element's position is some arbitrary sequence, but rather to calculate how many other elements have a better score.

EDIT:

By popular demand, O(n)'ed and everything:

positions = {}
cur_score = None # Score we're examining
cur_count = 0 # Number of others that we've seen with this score

for ix, (name, score) in enumerate(sorted_scores):
  if score == cur_score: # Same score for this player as previous
    cur_count += 1
  else: # Different score from before
    cur_score = score
    cur_count = 0
  positions[name] = ix - cur_count + 1 # Add 1 because ix is 0-based

print positions

Ignacio Vazquez-Abrams 2010-02-19 23:16:33

I don't know why this was voted down, as this suggestion is extremely reasonable and correct. It's an invariant that every rank value in a list of "ranks with ties" is exactly the number of better scores (+1, usually). Depending on the strategy for solving the problem, Ignacio's suggestion has extra value as it is perfectly parallelizable and--as such--wonderfully suited to multi-threaded/multi-process solutions.

Travis Bradshaw 2010-02-22 05:45:36

Well it wasn't me ... but the reasons may well have included "no code".

John Machin 2010-02-22 06:04:59

Answer 2

A:

I'm having to make a bunch of assumptions about what it is you want to do, but here's an attempt:

scores = {
    'lorem': 100,
    'ipsum': 200,
    'dolor': 300,
    'sit': 300,
    'amet': 300,
    'quia': 400,
    'consectetur': 500,
    'adipiscing': 500,
    'elit': 600,
    }

groups = {}
for (member, score) in scores.items():
    if score not in groups:
        groups[score] = [member]
    else:
        groups[score].append(member)

positions = {}
for (rank, (score, members)) in enumerate(groups.items()):
    for member in members:
        positions[member] = rank

Showing my working:

>>> import pprint
>>> scores = {
...     'lorem': 100,
...     'ipsum': 200,
...     'dolor': 300,
...     'sit': 300,
...     'amet': 300,
...     'quia': 400,
...     'consectetur': 500,
...     'adipiscing': 500,
...     'elit': 600,
...     }
>>> groups = {}
>>> for (member, score) in scores.items():
...     if score not in groups:
...         groups[score] = [member]
...     else:
...         groups[score].append(member)
...
>>> pprint.pprint(groups)
{100: ['lorem'],
 200: ['ipsum'],
 300: ['sit', 'dolor', 'amet'],
 400: ['quia'],
 500: ['consectetur', 'adipiscing'],
 600: ['elit']}
>>> positions = {}
>>> for (rank, (score, members)) in enumerate(groups.items()):
...     for member in members:
...         positions[member] = rank
...
>>> pprint.pprint(positions)
{'adipiscing': 4,
 'amet': 2,
 'consectetur': 4,
 'dolor': 2,
 'elit': 5,
 'ipsum': 1,
 'lorem': 0,
 'quia': 3,
 'sit': 2}
>>> pprint.pprint(sorted(positions.items(), key=lambda i: i[1]))
[('lorem', 0),
 ('ipsum', 1),
 ('sit', 2),
 ('dolor', 2),
 ('amet', 2),
 ('quia', 3),
 ('consectetur', 4),
 ('adipiscing', 4),
 ('elit', 5)]

bignose 2010-02-22 01:03:09

You import `itertools` but never use it. Try putting stand-alone code in a file and running it from the command line.

John Machin 2010-02-22 03:17:06

Much worse: It doesn't work. Your single correct result is an accident. Change the score of elit from 600 to 6000 (still running last on the low-score-is-better regime) and see what happens. You are assuming incorrectly that groups.items() is sorted on score.

John Machin 2010-02-22 03:35:20

Answer 3

+2 A:

=== Update after change/clarification of specs ===

# coding: ascii

def ranks_from_scores(sorted_scores):
    """sorted_scores: a list of tuples (object_id, score), sorted by score DESCENDING
       return a mapping of object IDs to ranks
    """
    ranks = {}
    previous_score = object()
    for index, (obj_id, score) in enumerate(sorted_scores):
        if score != previous_score:
            previous_score = score
            rank = index + 1
        ranks[obj_id] = rank
    return ranks

from operator import itemgetter
import pprint

scores0 = dict([
    ('Apolo Ohno', 0),
    ('Shanie Davis', -1),
    ('Bodie Miller', -2),
    ('Lindsay Vohn', -3),
    ('Shawn White', -3)
    ])

scores1 = {
    'lorem': 100,
    'ipsum': 200,
    'dolor': 300,
    'sit': 300,
    'amet': 300,
    'quia': 400,
    'consectetur': 500,
    'adipiscing': 500,
    'elit': 600,
    }

scores2 = {
    'lorem': 100,
    'ipsum': 200,
    'dolor': 300,
    'sit': 300,
    'amet': 300,
    'quia': 400,
    'consectetur': 500,
    'adipiscing': 500,
    'elit': 6000,
    }

import pprint
funcs = (ranks_from_scores, ) # Watch this space!
tests = (scores0, scores1, scores2)

for test in tests:
    print
    test_list = sorted(test.items(), key=itemgetter(1), reverse=True)
    print "Input:", test_list
    for func in funcs:
        result = func(test_list)
        print "%s ->" % func.__name__
        pprint.pprint(result)

Results:

Input: [('Apolo Ohno', 0), ('Shanie Davis', -1), ('Bodie Miller', -2), ('Lindsay
 Vohn', -3), ('Shawn White', -3)]
ranks_from_scores ->
{'Apolo Ohno': 1,
 'Bodie Miller': 3,
 'Lindsay Vohn': 4,
 'Shanie Davis': 2,
 'Shawn White': 4}

Input: [('elit', 600), ('consectetur', 500), ('adipiscing', 500), ('quia', 400),
 ('dolor', 300), ('sit', 300), ('amet', 300), ('ipsum', 200), ('lorem', 100)]
ranks_from_scores ->
{'adipiscing': 2,
 'amet': 5,
 'consectetur': 2,
 'dolor': 5,
 'elit': 1,
 'ipsum': 8,
 'lorem': 9,
 'quia': 4,
 'sit': 5}

Input: [('elit', 6000), ('consectetur', 500), ('adipiscing', 500), ('quia', 400)
, ('dolor', 300), ('sit', 300), ('amet', 300), ('ipsum', 200), ('lorem', 100)]
ranks_from_scores ->
{'adipiscing': 2,
 'amet': 5,
 'consectetur': 2,
 'dolor': 5,
 'elit': 1,
 'ipsum': 8,
 'lorem': 9,
 'quia': 4,
 'sit': 5}

=== original submission ===

This code assumes that you really want the highest score to get rank 1, not the lowest score getting rank 1 (or 0!).

# coding: ascii

def ranks_from_scores(scores, debug=False):
    """scores (a mapping of object IDs to scores)
       return a mapping of object IDs to ranks
    """
    alist = [(v, k) for k, v in scores.items()]
    alist.sort(reverse=True)
    if debug: print 'alist:', alist
    bdict = {}
    previous_score = object()
    for posn, (score, obj_id) in enumerate(alist):
        if score != previous_score:
            previous_score = score
            rank = posn + 1
        bdict[obj_id] = rank
    if debug:
        print 'bdict:', bdict
        blist = [(v, k) for k, v in bdict.items()]
        print 'blist:', sorted(blist)
    return bdict

ranks_from_scores(
    {'q': 10, 'w': 20, 'e': 20, 'r': 20, 't': 30},
    debug=True,
    )

Output:

alist: [(30, 't'), (20, 'w'), (20, 'r'), (20, 'e'), (10, 'q')]
bdict: {'q': 5, 'r': 2, 'e': 2, 't': 1, 'w': 2}
blist: [(1, 't'), (2, 'e'), (2, 'r'), (2, 'w'), (5, 'q')]

John Machin 2010-02-22 03:35:52

Answer 4

+1 A:

Looks like you can use the sorted and enumerate builtins, the groupby method from itertools and the itemgetter method from operator. Assumes higher scores are better... (if lower scores are better, change reverse=True to reverse=False)

>>> from itertools import groupby
>>> from operator import itemgetter
>>> scores = {
...     'lorem': 100,
...     'ipsum': 200,
...     'dolor': 300,
...     'sit': 300,
...     'amet': 300,
...     'quia': 400,
...     'consectetur': 500,
...     'adipiscing': 500,
...     'elit': 600,
...     }
>>> sorted_items = sorted(scores.items(), key=itemgetter(1), reverse=True)
>>> groups = groupby(sorted_items, itemgetter(1))
>>> for rank, (score, items) in enumerate(groups):
...     print rank+1, map(itemgetter(0), items)
... 
1 ['elit']
2 ['consectetur', 'adipiscing']
3 ['quia']
4 ['dolor', 'sit', 'amet']
5 ['ipsum']
6 ['lorem']

jgeewax 2010-02-22 03:36:51

Doesn't appear to do what the OP wants: note his example of "more sensible" output `positions = {'1': 1, '2': 2, '3': 2, '4': 2, '5': 5, '6': 6}` i.e. one gets first prize, three share second prize, the fifth is rank FIVE; you would have the fifth as rank THREE.

John Machin 2010-02-22 04:26:32

Answer 5

A:

Here is a simple way to do it

last = None
position = 0
delta = 1
for key, value in sorted_list:
    if value is not None:
        if value != last:
            position += delta
            delta = 1
        else:
            delta += 1
        # i believe this is supposed to be [key] not [value] in OP's code
        positions[key] = position
        last = value

vishvananda 2010-02-22 05:20:34

If there is a two way tie for 4th place, I think the next place should be 6th

gnibbler 2010-02-22 06:10:40

@gnibbler: What you say has a serendipitous combination of two attributes (a) it's sensible (b) it's what the OP expressly wants.

John Machin 2010-02-22 06:24:52

@gnibbler: Good point. I fixed the code according to your suggestion

vishvananda 2010-02-22 08:30:30

Answer 6

A:

>>> sorted_scores = [
...     ('Apolo Ohno', 0),
...     ('Shanie Davis', -1),
...     ('Bodie Miller', -2),
...     ('Lindsay Vohn', -3),  
...     ('Shawn White', -3),
...     ('Bryan Veloso',-4)
... ]
>>> 
>>> from itertools import groupby
>>> from operator import itemgetter
>>> 
>>> place=1
>>> res={}
>>> for _,items in groupby(sorted_scores,key=itemgetter(1)):
...     for i,item in enumerate(items):
...         res[item[0]]= place
...     place+=i+1
... 
>>> print res
{'Apolo Ohno': 1, 'Bryan Veloso': 6, 'Shanie Davis': 2, 'Lindsay Vohn': 4, 'Bodie Miller': 3, 'Shawn White': 4}

Remember that dicts are unordered, so to iterate in order of place, you need to do this

>>> print sorted(res.items(),key=itemgetter(1))
[('Apolo Ohno', 1), ('Shanie Davis', 2), ('Bodie Miller', 3), ('Lindsay Vohn', 4), ('Shawn White', 4), ('Bryan Veloso', 6)]

gnibbler 2010-02-22 05:59:57

Answer 7

+1 A:

Solution

Here's one simple way to do it by modifying your code a little rather than importing modules:

prev = None
rank = 0
incr = 1
positions = {}
for key, value in sorted_list:
    if value is not None:
        if value != prev:
            rank += incr
            incr = 1
        else:
            incr += 1
        positions[key] = rank
        prev = value

A Test

For

sorted_list = [
    ('Apolo Ohno', 0),
    ('Shanie Davis', -1),
    ('Bodie Miller', -2),
    ('Lindsay Vohn', -3),  
    ('Shawn White', -3),
    ('Bryan Veloso',-4)
]

I get positions as:

{'Apolo Ohno': 1, 
'Shanie Davis': 2,
 'Bodie Miller': 3,
 'Lindsay Vohn': 4,
 'Shawn White': 4,
 'Bryan Veloso': 6}

which I think is what you are going for even though you aren't quite clear about whether there should be a 6 after the two 4's.

Justin Peel 2010-02-22 06:01:23

Indeed, there should be a 6. I edited my question to reflect that.

Bryan Veloso 2010-02-22 06:52:33

This appears to be an exact duplicate of my answer. I guess great minds think alike?

vishvananda 2010-02-23 08:02:15

@vishvananda: It seems to be endemic :-(

John Machin 2010-02-23 09:16:10

@vishvananda Yours was incorrect when I posted mine, but then you edited yours.

Justin Peel 2010-02-23 15:12:47

@Justin: I see. I guess I missed it. :)

vishvananda 2010-02-23 17:43:12

Answer 8

+3 A:

>>> sorted_scores = [
...     ('Apolo Ohno', 0),
...     ('Shanie Davis', -1),
...     ('Bodie Miller', -2),
...     ('Lindsay Vohn', -3),  
...     ('Shawn White', -3),
...     ('Bryan Veloso',-4)
... ]
>>> 
>>> res = {}
>>> prev = None
>>> for i,(k,v) in enumerate(sorted_scores):
...     if v!=prev:
...         place,prev = i+1,v
...     res[k] = place
... 
>>> print res
{'Apolo Ohno': 1, 'Bryan Veloso': 6, 'Shanie Davis': 2, 'Lindsay Vohn': 4, 'Bodie Miller': 3, 'Shawn White': 4}

Remember that dicts are unordered, so to iterate in order of place, you need to do this

>>> from operator import itemgetter
>>> print sorted(res.items(),key=itemgetter(1))
[('Apolo Ohno', 1), ('Shanie Davis', 2), ('Bodie Miller', 3), ('Lindsay Vohn', 4), ('Shawn White', 4), ('Bryan Veloso', 6)]

gnibbler 2010-02-22 06:06:54

much shorter than my solution. Good call using enumerate.

vishvananda 2010-02-22 08:35:23

@vishvananda: Please also do a critique on my (earlier) answer :-)

John Machin 2010-02-22 13:04:47

@John: Sorry john, I missed the fact that your answer uses the same technique.

vishvananda 2010-02-23 08:11:36

ansaurus

tags:

views:

answers:

Positional Rankings and Dealing with Ties in Python

related questions