Several times (even several times in a row) I've been bitten by the defaultdict bug: forgetting that something is actually a defaultdict and treating it like a regular dictionary.

from collections import defaultdict

d = defaultdict(list)

...

try:
    v = d["key"]
except KeyError:
    print("Sorry, no dice!")

For those who have been bitten too, the problem is evident: when d has no key 'key', v = d["key"] silently creates an empty list, stores it under d["key"], and binds it to v instead of raising an exception. This can be quite a pain to track down if d comes from some module whose details one doesn't remember very well.
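The effect is easy to reproduce in isolation. A minimal sketch (the variable names are only for illustration):

```python
from collections import defaultdict

d = defaultdict(list)

# A plain read through d[...] triggers the default factory.
try:
    v = d["key"]
    raised = False
except KeyError:
    raised = True

print(raised)         # False: no exception was raised
print(v)              # []
print("key" in d)     # True: the read itself inserted the key
print(v is d["key"])  # True: v and d["key"] are the same list object
```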

I'm looking for a way to take the sting out of this bug. For me, the best solution would be to somehow disable a defaultdict's magic before returning it to the client.

+4  A: 

Use a different idiom:

if 'key' not in d:
    print("Sorry, no dice!")
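A short sketch of why this works: the default factory is only invoked by d[key] lookups, so neither a membership test nor .get() inserts anything. The .get() variant is an addition here, shown because it needs only one lookup:

```python
from collections import defaultdict

d = defaultdict(list)

# Membership tests do not call the default factory.
found = "key" in d

# Neither does .get(); it returns None (or a supplied default) instead.
v = d.get("key")

print(found)   # False
print(v)       # None
print(len(d))  # 0: nothing was inserted
```

Note that .get() cannot distinguish a missing key from a stored None unless you pass a dedicated sentinel as the default.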
SilentGhost
That's asking for permission, however.
badp
@bp: so, what??
SilentGhost
+1 I think testing with `in` is much better style than `except KeyError`, not just for defaultdicts.
THC4k
I would like to agree, but using `in` does mean you end up with two dictionary lookups, one for the key check and one for the actual data retrieval. Also, the check gets more and more elaborate the more levels of nesting there are: `v = d[k1][k2][k3]` vs. `if k1 in d:` \ ` tmp1 = d[k1]` \ ` if k2 in tmp1:` \ ` tmp2 = tmp1[k2]` \ ` if k3 in tmp2:` etcetera. Finally, having clients that only want to read from d deal with implementation details like whether d is a defaultdict or not does not seem very appealing to me.
@bp - I could be wrong, but doesn't `in` make the attempt and catch the exception behind the scenes?
detly
+5  A: 

You can still convert it to a normal dict.

import collections

d = collections.defaultdict(list)
d = dict(d)
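As a sketch of what the conversion does (the name plain is only for illustration): dict(d) makes a shallow copy, so the copy raises KeyError on missing keys, but the value objects are still shared with the original.

```python
import collections

d = collections.defaultdict(list)
d["a"].append(1)

plain = dict(d)  # shallow copy into a regular dict

try:
    plain["missing"]
    raised = False
except KeyError:
    raised = True

plain["a"].append(2)   # values are shared, not copied

print(raised)          # True
print(d["a"])          # [1, 2]
print("missing" in d)  # False: the failed lookup inserted nothing
```

The copy visits every key once, so the cost grows with the size of the mapping.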
evilpie
+1  A: 

You can prevent the creation of default values by assigning d.default_factory = None. However, I don't quite like the idea of an object suddenly changing behavior. I'd prefer copying the values to a new dict unless that imposes a severe performance penalty.
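A minimal sketch of the first option, assuming a hypothetical producer function build_index that fills a defaultdict and then disables the factory before returning it:

```python
from collections import defaultdict

def build_index(words):
    """Hypothetical producer: groups words by their first letter."""
    index = defaultdict(list)
    for word in words:
        index[word[0]].append(word)
    # Disable the default factory before handing the mapping to callers.
    index.default_factory = None
    return index

index = build_index(["apple", "avocado", "banana"])

try:
    index["c"]
    raised = False
except KeyError:
    raised = True

print(raised)        # True
print(index["a"])    # ['apple', 'avocado']
print("c" in index)  # False
```

The object is still a defaultdict, but with default_factory set to None a missing key raises KeyError just as it would for a regular dict.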

rkhayrov
Thanks a lot for putting me on the right track with this. I don't have any particular problem with changing the behavior of this kind of object, because I believe that checking whether an object supports a particular protocol is much better than checking its type. In this case, simply checking whether it supports the mapping protocol is far better than defensively checking whether a dictionary could potentially be a defaultdict and changing client behavior accordingly.
+2  A: 

That is exactly the behavior you want from a defaultdict, not a bug. If you don't want it, don't use a defaultdict.

If you keep forgetting what type variables have, then name them appropriately - for example suffix your defaultdict names with "_ddict".

THC4k
I was hoping that I could forget about this Hungarian twist by using Python :-) I'd rather like to name a potential duck if it quacks than propagate a name change to lots of code if I decide at one point to turn d into a shelve for example...
A: 

Using rkhayrov's idea of resetting self.default_factory, here is a toggleable subclass of defaultdict:

import collections

class ToggleableDefaultdict(collections.defaultdict):
    def __init__(self, default_factory):
        # Remember the factory so it can be restored by on().
        self._default_factory = default_factory
        super(ToggleableDefaultdict, self).__init__(default_factory)

    def off(self):
        # With no factory, missing keys raise KeyError.
        self.default_factory = None

    def on(self):
        self.default_factory = self._default_factory

For example:

d = ToggleableDefaultdict(list)
d['key'].append(1)
print(d)
# defaultdict(<type 'list'>, {'key': [1]})

d.off()
try:
    d['newkey'].append(2)
except KeyError as e:
    print("KeyError: %s" % e)
# KeyError: 'newkey'

d.on()
d['newkey'].append(2)
print(d)
# defaultdict(<type 'list'>, {'newkey': [2], 'key': [1]})
unutbu