tags:

views:

128

answers:

5
class Foo(object):
    def __init__(self,x):
        self.x = x
        self.is_bar = False
    def __repr__(self): return str(self.x)

class Bar(object):
    def __init__(self,l = []):
        self.l = l
    def add(self,o):
        self.l += [o]
    def __repr__(self): return str(self.l)

def foo_plus_foo(f1,f2):
    t = Bar()
    if not (f1.is_bar and f2.is_bar):
        f1.is_bar = True
        f2.is_bar = True
        t.add(f1)
        t.add(f2)
        print 'HERE'
    return t

if __name__ == '__main__':
    li = [Foo(1), Foo(2)]
    print foo_plus_foo(li[0],li[1])
    print foo_plus_foo(li[0],li[1])

UNEXPECTED OUTPUT:

HERE
[1, 2]
[1, 2]

EXPECTED OUTPUT:

HERE
[1, 2]
[]

What is happening? What did I do wrong? Why is python using old value? What do I do to avoid this?

Thanks!

A: 

Is it mutable ?

Ankur Gupta
what are you talking about?
TheMachineCharmer
He is talking about you using a mutable default parameter, which is not initialized every time the function is called, but only once when it's declared - this is where your error comes from. Do not use mutables as default parameter, instead, pass None for example and create a new list each time the function is called.
Jim Brissom
@Jim Brissom: But if I use `None` I will have to check for `None` everywhere before doing anything with `self.l`. And actual code is very big. Any other solution?
TheMachineCharmer
@Jim Thanks for explaining better ... TheMachineCharmer This is one mistake that's not intuitive and pretty much everyone makes once in a while ...
Ankur Gupta
@Ankur Gupta: "pretty much everyone makes once". Anyone who makes this mistake "once in a while" needs to find a new career.
S.Lott
@S.Lott uhmm to each his own I guess ... but yes once one knows the difference between mutable and immutable and still makes the same mistake ... yeah I guess they should find a new career.
Ankur Gupta
Python is used by lots of people who are not full-time programmers but who write scripts to accomplish work tasks. Don't assume everyone who is writing a Python program is, or aspires to be, an expert.
kindall
+4  A: 

You are defining l as having a default value of []. This is a classic Python pitfall.

class Bar(object):
    def __init__(self,l = []):

Default values are evaluated at definition time not run-time. It is evaluated only once.

So t=Bar() sets t.l to the very same list every time.

To fix this, change Bar to

class Bar(object):
    def __init__(self,l = None):
        if l is None:
            l=[]
        self.l = l
    def add(self,o):
        self.l += [o]
    def __repr__(self): return str(self.l)
unutbu
but I am also creating new `t = Bar()` everytime.
TheMachineCharmer
Yes, but since you don't supply an argument to `Bar`, the same old list is set every time. This list is getting mutated with each run of `foo_plus_foo`.
unutbu
Thanks!! I fell in the "pit". :D
TheMachineCharmer
+5  A: 

Never. Do. This.

def __init__(self,l = []):

Never.

One list object is reused. And it's Mutable, so that each time it's reused, the one and only [] created in your method definition is updated.

Always. Do. This.

def __init__( self, l= None ):
    if l is None: l = []

That creates a fresh, new, unique list instance.

S.Lott
how about `self.l = [] if l is None else l`? will this create any problems?
TheMachineCharmer
@TheMachineCharmer: When you actually tried it, what did you observe?
S.Lott
It worked fine. :-)
TheMachineCharmer
@TheMachineCharmer: Then why ask?
S.Lott
I thought there might be some pitfalls I don't know of. Till date I was considering use of `[]` as default value perfectly fine. Thanks! :)
TheMachineCharmer
@S. Lott: Why never do this? If one knows the semantics of the `l = []` parameter, I don't see why it should not be used on purpose.
EOL
@EOL: If one "knows" the semantics, one is free to break the rule. I **always** write absolutes. I find that **always** stating absolutes forces people to think more clearly. But, for questions like this, I find that **always** stating things as absolutes **always** helps. **Always**.
S.Lott
@TheMachineCharmer: Here's my point. You had a test case that demonstrated the problem with `[]` as a default value. You now have a solution that passes the unit test. This is the way to write software: find a problem; create test case that shows the problem; solve the problem. You did exactly that.
S.Lott
+1  A: 

The culprit is the l=[] defined in the Bar class definition. This list is instantiated once during class definition and is used as the default. Super dangerous!! I and many others have been burned by this one, trust me the scarring is deep.

Problematic use of mutable.

class Bar(object):
    def __init__(self,l = []):
        self.l = l
    def add(self,o):
        self.l += [o]
    def __repr__(self): return str(self.l)

Try using an immutable:

class Bar(object):
    def __init__(self,l = None):
        if l is None:
            self.l = []
        else:
            self.l = l
    def add(self,o):
        self.l += [o]
    def __repr__(self): return str(self.l)
kevpie
+1  A: 

Others have explained the problem and suggested using l=None and an explicit test for it. I'd like to suggest a different approach:

class Bar(object):
    def __init__(self, l=[]):
        self.l = list(l)
        # and so on...

This guarantees a new blank list each time by default, and also assures that if a caller passes in a list of their own, you get a copy of that list rather than a reference to the caller's list. As an added benefit, callers can pass in anything that the list constructor can consume, such as a tuple or a string, and your code doesn't have to worry about that; it can just deal with a list.

If you just save a reference to the list the caller gives you, and later change the list named self.l, you may be inadvertently changing the list that was passed in, too (since both your class and the caller now have a reference to the same object). Similarly, if they change the list after calling your constructor, your "copy" will be changed too (since it's not actually a copy). Of course, it could be that this is behavior you want, but probably not in this case.

Although if you never manipulate the list (i.e. add, replace, or delete items), but only refer to it or replace it wholesale, copying it is usually a waste of time and memory.

The copy made using the list() constructor is shallow. That is, the list itself is a new object, but if the original list contains references to mutable objects (such as other lists or dictionaries, or instances of most other classes), the same problem can arise if you change those objects, because they are still shared between the lists. This is an issue less frequently than you might think, but if it is, you can perform a deep copy using the deepcopy() function in the copy module.

kindall
I like this. In addition to receiving tuple or strings. Iterables and even generator expressions can be used. Alternatively self.l = iter(l) if you are simply going to loop through the input but don't want to walk the input immediately. This is not applicable to this specific question but still an interesting use of the pattern.
kevpie
Nice note on the iter().
kindall