Hi everyone,

I'm currently re-engaging with Python after a long absence and loving it. However, I find myself coming across a pattern over and over. I keep thinking that there must be a better way to express what I want and that I'm probably doing it the wrong way.

The code that I'm writing is in the following form:

# foo is a dictionary
if foo.has_key(bar):
  foo[bar] += 1
else:
  foo[bar] = 1

I'm writing this a lot in my programs. My first reaction is to push it out to a helper function, but so often the Python libraries already supply things like this.

Is there some simple little syntax trick that I'm missing? Or is this the way that it should be done?

+16  A: 

Use a defaultdict:

from collections import defaultdict

foo = defaultdict(int)
foo[bar] += 1

In Python >= 2.7, you also have a separate Counter class for these purposes. For Python 2.5 and 2.6, you can use its backported version.
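As a rough sketch of what the Counter approach looks like (the `words` list is made-up sample data, not something from the question):

```python
from collections import Counter

# Hypothetical sample data, used only for illustration.
words = ["spam", "eggs", "spam", "ham", "spam"]

# Incrementing works without an existence check: missing keys count as 0.
counts = Counter()
for word in words:
    counts[word] += 1

# Passing the iterable to the constructor builds the same tally in one step.
assert counts == Counter(words)

print(counts["spam"])     # 3
print(counts["missing"])  # 0; reading a missing key does not add it
```

If all you need is a tally of an existing sequence, the constructor form is probably the shortest option.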

Tamás
`collections.Counter` is in `2.7` http://docs.python.org/dev/whatsnew/2.7.html#new-and-improved-modules
J.F. Sebastian
Thanks, I'm fixing the answer.
Tamás
Thanks! I didn't know about defaultdict. That's exactly what I was looking for.
cursa
A: 

For Python >= 2.5 you can do the following:

foo[bar] = 1 if bar not in foo else foo[bar]+1
thetaiko
While valid, this is not any more concise or readable than the OP's code.
musicfreak
+2  A: 

You can also use exception handling as the control structure. A dictionary raises a KeyError when you try to read a non-existent key, which is what `+=` has to do before it can assign the new value:

my_dict = {}
try:
    my_dict['a'] += 1
except KeyError:    # key not seen yet, so start its count at 1
    my_dict['a'] = 1
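A small sketch of where the KeyError in the snippet above comes from (the key 'b' and the `raised` flag are only here for illustration): plain assignment to a missing key succeeds, while the read hidden inside `+=` is what fails.

```python
my_dict = {}

# Plain assignment to a missing key just creates the entry.
my_dict['a'] = 1

# An augmented assignment has to read the old value first, and that
# read is what raises KeyError when the key is absent.
try:
    my_dict['b'] += 1
    raised = False
except KeyError:
    raised = True
    my_dict['b'] = 1

print(raised)   # True
print(my_dict)  # {'a': 1, 'b': 1}
```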
Santa
Just because exception handling *can* be used for control flow doesn't mean that it should.
Corey Porter
AFAIK, doing something like dict.has_key(key) actually tries to access the key and returns False if an exception is caught.
detly
+5  A: 

The dict's get() method takes an optional second parameter that can be used to provide a default value if the requested key is not found:

foo[bar] = foo.get(bar, 0) + 1
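If you still want the helper function mentioned in the question, the get() idiom could be wrapped like this. It is only a sketch, and `count_items` is a made-up name, not something from the standard library:

```python
def count_items(items):
    """Tally how often each item occurs (hypothetical helper)."""
    counts = {}
    for item in items:
        # get() returns 0 for an unseen item, so no existence check is needed.
        counts[item] = counts.get(item, 0) + 1
    return counts


letters = count_items("abracadabra")
print(letters["a"])  # 5
print(letters["b"])  # 2
```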
sth
Why the downvote? It's valid and readable
Wallacoloo
I didn't vote it down, but I guess the original downvoter did that because it violates the DRY (Don't Repeat Yourself) principle: "foo" and "bar" are both mentioned twice.
Tamás
@Tamas: Well, the OP's version mentions each of those three times :)
truppo
Yeah, that's even worse :)
Tamás
A: 

I ran some timing comparisons. The three approaches are pretty much equal, though the one-line .get() call is the fastest.

Output (seconds for 100 runs of each function):

get 0.543551800627
exception 0.587318710994
haskey 0.598421703081

Code:

import timeit
import random

RANDLIST = [random.randint(0, 1000) for i in range(10000)]

def get():
    foo = {}
    for bar in RANDLIST:
        foo[bar] = foo.get(bar, 0) + 1


def exception():
    foo = {}
    for bar in RANDLIST:
        try:
            foo[bar] += 1
        except KeyError:
            foo[bar] = 1


def haskey():
    foo = {}
    for bar in RANDLIST:
        if foo.has_key(bar):
            foo[bar] += 1
        else:
            foo[bar] = 1


def main():
    print 'get', timeit.timeit('get()', 'from __main__ import get', number=100)
    print 'exception', timeit.timeit('exception()', 'from __main__ import exception', number=100)
    print 'haskey', timeit.timeit('haskey()', 'from __main__ import haskey', number=100)


if __name__ == '__main__':
    main()
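The script above is Python 2 only (print statements, has_key). For anyone who wants to repeat the comparison on Python 3, here is a rough equivalent. It is a sketch, not the code that produced the numbers above: it swaps has_key for the `in` operator, adds a defaultdict variant, and the names `membership` and `with_defaultdict` are my own. No timings are quoted because they will vary by machine and version.

```python
import random
import timeit
from collections import defaultdict

RANDLIST = [random.randint(0, 1000) for _ in range(10000)]


def get():
    foo = {}
    for bar in RANDLIST:
        foo[bar] = foo.get(bar, 0) + 1
    return foo


def membership():
    # Python 3 replacement for the has_key() version.
    foo = {}
    for bar in RANDLIST:
        if bar in foo:
            foo[bar] += 1
        else:
            foo[bar] = 1
    return foo


def with_defaultdict():
    foo = defaultdict(int)
    for bar in RANDLIST:
        foo[bar] += 1
    return foo


if __name__ == '__main__':
    for name in ('get', 'membership', 'with_defaultdict'):
        # globals=globals() lets timeit see the functions defined here.
        seconds = timeit.timeit(name + '()', globals=globals(), number=100)
        print(name, seconds)
```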
Brendan Abel