views:

597

answers:

4

Hi,

I'm new Python and trying to implement code in a more Pythonic and efficient fashion. Given a dictionary with numeric keys and values, what is the best way to find the largest key with a non-zero value?

Thanks

+6  A: 

Something like this should be reasonably fast:

>>> x = {0: 5, 1: 7, 2: 0}
>>> max(k for k, v in x.iteritems() if v != 0)
1

(removing the != 0 will be slightly faster still, but obscures the meaning somewhat.)

Nicholas Riley
Since the OP is new, a description of what is happening might be helpful as well.
Brian C. Lane
Note that in Python 3.x `.iteritems` no longer exists and `.items` returns an iterator. (Unlike in Python 2.x, where `.items` returns a list and `.iteritems` returns an iterator.)
Stephan202
What is happening here? We are calling max() to find the largest key. What we pass to max() is a "generator expression", very similar to a "list comprehension". max() will repeatedly get values for k, and it will pick the largest. The generator expression will only return k values when the v value is not zero. The k and v values come from x.iteritems(), which returns key, value pairs. This code will work in Python 2.4 and newer, but as Stephan202 noted, for Python 3.x you need to replace "iteritems" with just "items".
steveha
The nice thing about generator expressions (and iterators) is that they give values one at a time, rather than building some large list object that will then be immediately destroyed. It would be equally correct, but somewhat wasteful, to build a list of key, value tuples, and then compute the max() from that.
steveha
A: 

Python's max function takes a key= parameter for a "measure" function.

data = {1: 25, 0: 75}
def keymeasure(key):
    return data[key] and key

print max(data, key=keymeasure)

Using an inline lambda to the same effect and same binding of local variables:

print max(data, key=(lambda k: data[k] and k))

last alternative to bind in the local var into the anonymous key function

print max(data, key=(lambda k, mapping=data: mapping[k] and k))
kaizer.se
That function depends on access to the global. Bad idea.
Brian C. Lane
No it doesn't. That only depends on having access to the same scope. All of that can be inside of a function scope and it would still work.
Andrew Dalke
@dalke, the point remains that the function should take the dictionary as an argument, rather than hard-coding the name of the dict.
steveha
the lambda I edited in has exactly the same behavior
kaizer.se
@steveha: there are other reasons not to use this solution. For one, it doesn't handle {0:0, -1:-1} because it's doesn't actually ignore zero values. But saying it "depends on access to the global" isn't one of them.
Andrew Dalke
+4  A: 

To get the largest key, you can use the max function and inspect the keys like this:

max(x.iterkeys())

To filter out ones where the value is 0, you can use a generator expression:

(k for k, v in x.iteritems() if v != 0)

You can combine these to get what you are looking for (since max takes only one argument, the parentheses around the generator expression can be dropped):

max(k for k, v in x.iteritems() if v != 0)
Corey Goldberg
Almost there! Finally, you remove the square braces and you are left with the best solution. The square braces make a list comprehension, which builds the whole list, and then the whole list is passed to max(). Leaving off the square braces, you get a generator expression, which passes values one at a time to max(). For a small number of items it's no big deal, but for very large dictionaries, the extra effort to build a list and then destroy it can be considerable.
steveha
I just updated my answer... switched from lists to generators/iterators
Corey Goldberg
Just FYI, you don't need the extra parens. The parens of max() can do double duty: they can be the parens for the function call to max() and they can also be the parens around the generator expression. Try it! :-)
steveha
A: 

If I were you and speed was a big concern, I'd probably create a new container class "DictMax" that'd keep track of it's largest non-zero value elements by having an internal stack of indexes, where the top element of the stack is always the key of the largest element in the dictionary. That way you'd get the largest element in constant time everytime.

Gubatron