I have a result tuple of dictionaries.

result = ({'name': 'xxx', 'score': 120L }, {'name': 'xxx', 'score': 100L}, {'name': 'yyy', 'score': 10L})

I want to uniquify it. After the operation, result = ({'name': 'xxx', 'score': 120L }, {'name': 'yyy', 'score': 10L})

The result should contain only one dictionary per name, and that dictionary should have the maximum score for that name. The final result should be in the same format, i.e. a tuple of dictionaries.

+2  A: 
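from operator import itemgetter

# Collect the distinct names, then keep the highest-scoring dict for each one.
names = set(d['name'] for d in result)
uniq = []
for name in names:
    scores = [res for res in result if res['name'] == name]
    uniq.append(max(scores, key=itemgetter('score')))
uniq = tuple(uniq)  # back to a tuple of dicts, as the question asks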

I'm sure there is a shorter solution, but you won't be able to avoid filtering the scores by name in some way first and then finding the maximum for each name.

Storing scores in a dictionary with names as keys would definitely be preferable here.
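
One way to read the dictionary suggestion is to key an intermediate dict by name and keep the best-scoring record for each name in a single pass. The following is only a sketch under assumptions: `best_by_name` is a hypothetical name, and plain ints stand in for the Python 2 longs (`120L`) so the snippet also runs on Python 3.

```python
# Sample data from the question; plain ints replace the Python 2 longs (120L).
result = (
    {'name': 'xxx', 'score': 120},
    {'name': 'xxx', 'score': 100},
    {'name': 'yyy', 'score': 10},
)

# best_by_name is a hypothetical helper dict, not part of the original answer.
best_by_name = {}
for record in result:
    current = best_by_name.get(record['name'])
    # Keep this record if the name is new or it beats the stored score.
    if current is None or record['score'] > current['score']:
        best_by_name[record['name']] = record

# Convert back to the tuple-of-dicts format the question asks for.
uniq = tuple(best_by_name.values())
```

This visits each record once instead of rescanning the tuple for every name, which may matter for larger inputs.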

jellybean
I think you're finding the highest score, not the highest score for each unique name.
Gabe
@Gabe: No, I am finding the highest score for each unique name. I have slightly modified the question; check it out.
alis
@Gabe: You're right ... I was misled by the existence of only one name.
jellybean
`operator.itemgetter`
Tony Veijalainen
@Tony: No, `itemgetter` is imported properly.
jellybean
@jellybean: Edit delay; I saw the unedited first version.
Tony Veijalainen
+2  A: 

I would create an intermediate dictionary mapping each name to the maximum score for that name, then turn it back to a tuple of dicts afterwards:

>>> result = ({'name': 'xxx', 'score': 120L }, {'name': 'xxx', 'score': 100L}, {'name': 'xxx', 'score': 10L}, {'name':'yyy', 'score':20})
>>> from collections import defaultdict
>>> max_scores = defaultdict(int)
>>> for d in result: 
...     max_scores[d['name']] = max(d['score'], max_scores[d['name']])
... 
>>> max_scores 
defaultdict(<type 'int'>, {'xxx': 120L, 'yyy': 20})
>>> tuple({name: score} for (name, score) in max_scores.iteritems()) 
({'xxx': 120L}, {'yyy': 20})

Notes:

1) I have added {'name': 'yyy', 'score': 20} to your example data to show it working with a tuple that contains more than one name.

2) I use defaultdict(int), which assumes that the minimum score is zero. If scores can be negative, you will need to replace the int argument of defaultdict(int) with a function that returns a number smaller than the minimum possible score.
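
If negative scores are possible, one option is a default factory that returns negative infinity. This is a minimal sketch, not the only way to do it: the sample data is invented for illustration, plain ints replace the Python 2 longs, and `.items()` is used instead of `.iteritems()` so it runs on Python 3 as well. It also rebuilds the dicts in the question's `{'name': ..., 'score': ...}` shape.

```python
from collections import defaultdict

# Hypothetical data with negative scores; not taken from the question.
result = (
    {'name': 'xxx', 'score': -5},
    {'name': 'xxx', 'score': -20},
    {'name': 'yyy', 'score': 20},
)

# Start every name at negative infinity so any real score replaces it.
max_scores = defaultdict(lambda: float('-inf'))
for d in result:
    max_scores[d['name']] = max(d['score'], max_scores[d['name']])

# Rebuild dicts with 'name' and 'score' keys, sorted by name for a stable order.
uniq = tuple({'name': name, 'score': score}
             for name, score in sorted(max_scores.items()))
```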

Incidentally I suspect that having a tuple of dictionaries is not the best data structure for what you want to do. Have you considered alternatives, such as having a single dict, perhaps with a list of scores for each name?
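
As a rough illustration of that alternative, the scores could live in a single dict that maps each name to a list of scores. The name `scores_by_name` and the way the data is filled in are assumptions made for this sketch only.

```python
from collections import defaultdict

# scores_by_name is a hypothetical name for the alternative structure.
scores_by_name = defaultdict(list)
scores_by_name['xxx'].extend([120, 100, 10])
scores_by_name['yyy'].append(20)

# The best score per name is then a single max() call per entry.
best = {name: max(scores) for name, scores in scores_by_name.items()}
```

With this layout there is nothing to deduplicate, because each name appears exactly once as a key.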

Dave Kirby
+1 for the data structure criticism
Tony Veijalainen
+1  A: 

I would reconsider the data structure to fit your needs better (for example, a dict keyed by name with a list of scores as the value), but with the current structure I would do it like this:

import operator as op
import itertools as it

result = ({'name': 'xxx', 'score': 120L },
          {'name': 'xxx', 'score': 100L},
          {'name': 'xxx', 'score': 10L},
          {'name':'yyy', 'score':20})
# groupby only merges adjacent items, so sort by name first
by_name = op.itemgetter('name')

highscores = tuple(max(namegroup, key=op.itemgetter('score'))
                   for name, namegroup in it.groupby(sorted(result, key=by_name),
                                                     key=by_name))
print highscores
Tony Veijalainen
A: 

How about...

inp  = ({'name': 'xxx', 'score': 120L }, {'name': 'xxx', 'score': 100L}, {'name': 'yyy', 'score': 10L})

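temp = {}
for dct in inp:
    name, score = dct['name'], dct['score']
    # Store the score if the name is new or it beats the current maximum.
    if name not in temp or score > temp[name]:
        temp[name] = score

# Turn the name -> max score mapping back into a tuple of dicts.
result = tuple({'name': name, 'score': score} for name, score in temp.iteritems())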
Lord British