views:

122

answers:

5

How does:

dict = {}
if key not in dict:
 dict[key] = foo

Compare to:

try:
 dict[key]
except KeyError:
 dict[key] = foo 

ie, is the look up of a key in anyway faster than the linear search through dict.keys(), that I assume the first form will do?

A: 

my_dict.get(key, foo) returns foo if key isn't in my_dict. The default value is None, so my_dict.get(key) will return None if key isn't in my_dict. The first of your options is better if you want to just add key to your dictionary. Don't worry about speed here. If you find that populating your dictionary is a hot spot in your program, then think about it. But it isn't. So don't.

Nathon
+1 - Very Pythonic.
duffymo
That doesn't set the value if it's not set by looking at his code it appears he's checking if the key exists and setting it otherwise.
GWW
@GWW: True. You could use `dict[key] = dict.get(key, foo)` though.
Nathon
I suppose so, but then why not just use dict.setdefault(key,value)
GWW
@Nathon, that's entirely incorrect. Did you ever try doing a thing called "reading the documentation"?
Aaron Gallagher
+4  A: 

Try: my_dict.setdefault(key, default). It's slightly slower than the other options, though.

If key is in the dictionary, return its value. If not, insert key with a value of default and return default. default defaults to None.

#!/usr/bin/env python

example_dict = dict(zip(range(10), range(10)))

def kn(key, d):
    if key not in d:
        d[key] = 'foo'

def te(key, d):
    try:
        d[key]
    except KeyError:
        d[key] = 'foo'

def sd(key, d):
    d.setdefault(key, 'foo')

if __name__ == '__main__':
    from timeit import Timer

    t = Timer("kn(2, example_dict)", "from __main__ import kn, example_dict")
    print t.timeit()
    t = Timer("te(2, example_dict)", "from __main__ import te, example_dict")
    print t.timeit()
    t = Timer("sd(2, example_dict)", "from __main__ import sd, example_dict")
    print t.timeit()

    # kn: 0.249855041504
    # te: 0.244259119034
    # sd: 0.375113964081
The MYYN
It's quite interesting that the python built in method is so much slower.
GWW
Function call overhead, I suppose.
The MYYN
And it's interesting that with `psyco.full()`, all three variants only take about 10% of the time.
AndiDog
+1 for being the only person to actually answer the question
aaronasterling
Thanks for the timing data! I also tested collections.defaultdict and found it to perform better than setdefault, but worse than kn and te.
Aaron Croyle
+4  A: 

You're looking for the setdefault method:

>>> r = {}
>>> r.setdefault('a', 'b')
'b'
>>> r
{'a': 'b'}
>>> r.setdefault('a', 'e')
'b'
>>> r
{'a': 'b'}
ChrisAdams
+1 for being the first to read the question properly ;)
delnan
+1  A: 

Just to clarify one point: if key not in d doesn't do a linear search through d's keys. It uses the dict's hash table to quickly find the key.

Ned Batchelder
+1  A: 

The answer depends on how often the key is already in the dict (BTW, has anyone mentioned to you how bad an idea it is to hide a builtin such as dict behind a variable?)

if key not in dct:
 dct[key] = foo

If the key is in the dictionary this does one dictionary lookup. If the key is in the dictionary it looks up the dictionary twice.

try:
 dct[key]
except KeyError:
 dct[key] = foo 

This may be slightly faster for the case where the key is in the dictionary, but throwing an exception has quite a big overhead, so it is almost always not the best option.

dct.setdefault(key, foo)

This one is slightly tricky: it always involves two dictionary lookups: the first one is to find the setdefault method in the dict class, the second is to look for key in the dct object. Also if foo is an expression it will be evaluated every time whereas the earlier options only evaluate it when they have to.

Also look at collections.defaultdict. That is the most appropriate solution for a large class of situations like this.

Duncan
Good point on using 'dict' I changed the variable name when typing the example and didn't think about it. The key key is usually not in the dict.
Aaron Croyle
I'll be going with collections.defaultdict, thanks for pointing that out. It seems pythonic, and a hair faster than dict.setdefault()
Aaron Croyle