I hit an interesting Python bug today: instantiating a class repeatedly appears to hold state between instances. On later instantiations, the attribute is already populated.

I boiled the issue down to the following class and shell session. I realize this is not the best way to initialize an instance attribute, but it surely should not behave like this. Is this a true bug or is it a "feature"? :D

tester.py:

class Tester():
        def __init__(self):
                self.mydict = self.test()

        def test(self,out={}):
                key = "key"
                for i in ['a','b','c','d']:
                        if key in out:
                                out[key] += ','+i
                        else:   
                                out[key] = i 
                return out

Python prompt:

Python 2.6.6 (r266:84292, Oct  6 2010, 00:44:09) 
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
>>> import tester
>>> t = tester.Tester()
>>> print t.mydict
{'key': 'a,b,c,d'}
>>> t2 = tester.Tester()
>>> print t2.mydict
{'key': 'a,b,c,d,a,b,c,d'}
A: 

Could it be treating out as a global variable, so that it keeps adding to it every time you call test?

Assaf Lavie
+1  A: 

In general, default method arguments shouldn't be mutable. Instead do:

def test(self, out=None):
   out = out or {}
   # other code goes here.

See http://stackoverflow.com/questions/1132941/least-astonishment-in-python-the-mutable-default-argument and http://effbot.org/zone/default-values.htm for more details on why this is necessary and why it's a "feature" of the Python language rather than a bug.

sdolan
`out = out or {}` not only has a perlish whiff about it but also will fail if the caller passes in their own empty mapping object. `if out is None: out = {}` is preferable.
John Machin
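
To make the point in the comment above concrete, here is a minimal sketch comparing the two idioms. The function names with_or and with_is_none are made up for illustration and are not from the original answers.

```python
def with_or(out=None):
    # Hypothetical helper: an empty dict is falsy, so a caller's
    # empty mapping is silently replaced by a fresh one.
    out = out or {}
    out["key"] = "value"
    return out


def with_is_none(out=None):
    # Hypothetical helper: only a missing argument is replaced.
    if out is None:
        out = {}
    out["key"] = "value"
    return out


caller_a = {}
with_or(caller_a)
print(caller_a)   # {} : the caller's dict was never filled in

caller_b = {}
with_is_none(caller_b)
print(caller_b)   # {'key': 'value'}
```

Both versions behave the same when no argument is passed; they differ only when the caller supplies an empty (falsy) container.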
A: 

You are mutating the default value of the keyword parameter out inside your method.

This blog post explains it succinctly:

expressions in default arguments are calculated when the function is defined, not when it’s called.

The function is defined when the class is created, not for each instance. If you modify it like so, the problem goes away:

def test(self,out=None):
        if out is None:
                out = {}
        key = "key"
        for i in ['a','b','c','d']:
                if key in out:
                        out[key] += ','+i
                else:   
                        out[key] = i 
        return out
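
As a rough illustration of "calculated when the function is defined", the default object can be inspected through the function's __defaults__ attribute. The function append_item below is a hypothetical stand-in, not part of the question's code.

```python
def append_item(item, bucket=[]):
    # Hypothetical function: 'bucket' is created once, when this
    # def statement runs, and reused on every call that omits it.
    bucket.append(item)
    return bucket


defaults_before = repr(append_item.__defaults__)
print(defaults_before)              # ([],)

first = append_item('a')
second = append_item('b')

print(first is second)              # True: both calls returned the same list
print(append_item.__defaults__)     # (['a', 'b'],)
```

The same list object is stored on the function itself, which is why every new Tester instance in the question sees the entries left behind by earlier ones.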
detly
+1  A: 

It is a feature that pretty much all Python users run into once or twice. The main use is for caches and the like, to avoid repeating lengthy calculations (simple memoization, really), although I am sure people have found other uses for it.

The reason for this is that the def statement only gets executed once, which is when the function is defined. Thus the default value only gets created once. For a mutable object like a list or a dictionary, this ends up as a visible and surprising pitfall, whereas for immutable values such as numbers or strings, it goes unnoticed.

Usually, people work around it like this:

def Test(a=None):
    if a is None:
        a = {}
    # ... etc.
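
For the caching use mentioned at the top of this answer, a minimal sketch might look like the following. The name slow_square and the _cache parameter are invented for the example, and the multiplication merely stands in for an expensive computation.

```python
def slow_square(n, _cache={}):
    # Hypothetical memoized function: _cache is the one dict created
    # at definition time, so it persists across calls on purpose.
    if n not in _cache:
        _cache[n] = n * n   # stand-in for a lengthy calculation
    return _cache[n]


slow_square(4)   # computed and stored
slow_square(4)   # served from the cache
print(slow_square.__defaults__[0])   # {4: 16}
```

This is the deliberate counterpart of the pitfall: the shared default is wanted here, whereas in the question it was not. In current Python, functools.lru_cache is the more usual way to get the same effect.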
Stigma