ansaurus

Question

Efficient way to find the largest key in a dictionary with non-zero value

Answer 1

+6 A:

Something like this should be reasonably fast:

>>> x = {0: 5, 1: 7, 2: 0}
>>> max(k for k, v in x.iteritems() if v != 0)
1

(removing the != 0 will be slightly faster still, but obscures the meaning somewhat.)

Nicholas Riley 2009-10-12 17:51:55

Since the OP is new, a description of what is happening might be helpful as well.

Brian C. Lane 2009-10-12 18:03:06

Note that in Python 3.x `.iteritems` no longer exists and `.items` returns an iterator. (Unlike in Python 2.x, where `.items` returns a list and `.iteritems` returns an iterator.)

Stephan202 2009-10-12 18:22:30

What is happening here? We are calling max() to find the largest key. What we pass to max() is a "generator expression", very similar to a "list comprehension". max() will repeatedly get values for k, and it will pick the largest. The generator expression will only return k values when the v value is not zero. The k and v values come from x.iteritems(), which returns key, value pairs. This code will work in Python 2.4 and newer, but as Stephan202 noted, for Python 3.x you need to replace "iteritems" with just "items".

steveha 2009-10-12 18:46:29

The nice thing about generator expressions (and iterators) is that they give values one at a time, rather than building some large list object that will then be immediately destroyed. It would be equally correct, but somewhat wasteful, to build a list of key, value tuples, and then compute the max() from that.

steveha 2009-10-12 18:48:05

Answer 2

A:

Python's max function takes a key= parameter for a "measure" function.

data = {1: 25, 0: 75}
def keymeasure(key):
    return data[key] and key

print max(data, key=keymeasure)

Using an inline lambda to the same effect and same binding of local variables:

print max(data, key=(lambda k: data[k] and k))

last alternative to bind in the local var into the anonymous key function

print max(data, key=(lambda k, mapping=data: mapping[k] and k))

kaizer.se 2009-10-12 17:52:35

That function depends on access to the global. Bad idea.

Brian C. Lane 2009-10-12 18:00:55

No it doesn't. That only depends on having access to the same scope. All of that can be inside of a function scope and it would still work.

Andrew Dalke 2009-10-12 18:28:11

@dalke, the point remains that the function should take the dictionary as an argument, rather than hard-coding the name of the dict.

steveha 2009-10-12 18:42:11

the lambda I edited in has exactly the same behavior

kaizer.se 2009-10-12 19:15:40

@steveha: there are other reasons not to use this solution. For one, it doesn't handle {0:0, -1:-1} because it's doesn't actually ignore zero values. But saying it "depends on access to the global" isn't one of them.

Andrew Dalke 2009-10-12 19:41:14

Answer 3

+4 A:

To get the largest key, you can use the max function and inspect the keys like this:

max(x.iterkeys())

To filter out ones where the value is 0, you can use a generator expression:

(k for k, v in x.iteritems() if v != 0)

You can combine these to get what you are looking for (since max takes only one argument, the parentheses around the generator expression can be dropped):

max(k for k, v in x.iteritems() if v != 0)

Corey Goldberg 2009-10-12 18:09:11

Almost there! Finally, you remove the square braces and you are left with the best solution. The square braces make a list comprehension, which builds the whole list, and then the whole list is passed to max(). Leaving off the square braces, you get a generator expression, which passes values one at a time to max(). For a small number of items it's no big deal, but for very large dictionaries, the extra effort to build a list and then destroy it can be considerable.

steveha 2009-10-12 18:20:03

I just updated my answer... switched from lists to generators/iterators

Corey Goldberg 2009-10-12 18:28:28

Just FYI, you don't need the extra parens. The parens of max() can do double duty: they can be the parens for the function call to max() and they can also be the parens around the generator expression. Try it! :-)

steveha 2009-10-12 18:41:28

Answer 4

A:

If I were you and speed was a big concern, I'd probably create a new container class "DictMax" that'd keep track of it's largest non-zero value elements by having an internal stack of indexes, where the top element of the stack is always the key of the largest element in the dictionary. That way you'd get the largest element in constant time everytime.

Gubatron 2009-10-12 19:25:51

ansaurus

tags:

views:

answers:

Efficient way to find the largest key in a dictionary with non-zero value

related questions