"Learning Python, 4th Ed." mentions that "the enclosing scope variable is looked up when the nested functions are later called.." However, I thought that when a function exits, all of its local references disappear.

def makeActions():
    acts = []
    for i in range(5): # Tries to remember each i
        acts.append(lambda x: i ** x) # All remember same last i!
    return acts

makeActions()[n] is the same for every n because the variable i is somehow looked up at call time. How does Python look up this variable? Shouldn't it not exist at all because makeActions has already exited? Why doesn't Python do what the code intuitively suggests, and define each function by replacing i with its current value within the for loop as the loop is running?

From comment to THC4k:
I think I was mistaken in the way Python builds functions in memory. I was thinking that when encountering a def or a lambda, Python would generate all the necessary machine instructions correlating to that function and save it somewhere in memory. Now I think it's more like Python saves the function as a text string (and bundles it with references needed for the closure) and re-parses it each time the function is called.

+1  A: 

The local references persist because they're contained in the local scope, which the closure keeps a reference to.
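A minimal sketch of how this can be observed in CPython, using the question's function under the hypothetical name make_actions. Each lambda exposes what it closed over through its `__closure__` attribute, a tuple of cell objects:

```python
def make_actions():
    acts = []
    for i in range(5):
        acts.append(lambda x: i ** x)
    return acts

acts = make_actions()

# Each lambda carries a tuple of "cells", one per closed-over name.
cells = [f.__closure__[0] for f in acts]

# All five lambdas share the very same cell for i ...
print(all(cell is cells[0] for cell in cells))  # True
# ... and that cell still holds the last value the loop assigned.
print(cells[0].cell_contents)                   # 4
print([f(2) for f in acts])                     # [16, 16, 16, 16, 16]
```

The cell outlives the call to make_actions because the lambdas still refer to it.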

Ignacio Vazquez-Abrams
A: 

Intuitively one might think i would be captured in its current state, but that is not the case. Think of each layer as a dictionary of name-value pairs.

    Level 1:
        acts
        i
    Level 2:
        x

Every time you create a closure for the inner lambda you are capturing a reference to level one. I can only assume that the run-time will perform a look-up of the variable i, starting in level 2 and making its way to level 1. Since you are not executing these functions immediately they will all use the final value of i.
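The lookup order described above can be sketched with a small example (the names outer, inner, and value are made up for illustration). The inner function does not store the value it saw when it was defined; it looks the name up in the enclosing level each time it runs:

```python
def outer():
    value = "first"

    def inner():
        # value is not local to inner, so it is looked up in outer's
        # scope at the moment inner runs.
        return value

    before = inner()
    value = "second"   # rebind the name after inner was defined
    after = inner()
    return before, after

print(outer())  # ('first', 'second')
```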

Experts?

ChaosPandion
that's correct -- each lambda function keeps a reference to the same `i`.
Igor
+8  A: 

I think it's pretty obvious what happens when you think of i as a name not some sort of value. Your lambda function does something like "take x: look up the value of i, calculate i**x" ... so when you actually run the function, it looks up i just then so i is 4.

You can also use the current number, but you have to make Python bind it to another name:

def makeActions():
    def make_lambda( j ):
        return lambda x: j ** x # the j here is still a name, but now it won't change anymore

    acts = []
    for i in range(5):
        # now you're pushing the current i as a value to another scope and
        # binding it there, under a new name
        acts.append(make_lambda(i))
    return acts
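Another common idiom, not part of the answer above and shown here only as a sketch, is to bind the current value through a default argument. Default values are evaluated once, when the lambda is created, so each lambda gets its own i:

```python
def make_actions():
    acts = []
    for i in range(5):
        # i=i is evaluated right now, while the loop is running,
        # so every lambda stores its own value as a default argument.
        acts.append(lambda x, i=i: i ** x)
    return acts

print([f(2) for f in make_actions()])  # [0, 1, 4, 9, 16]
```

One trade-off of this variant is that a caller can accidentally override i by passing a second argument, which the helper-function version avoids.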

It might seem confusing, because you often get taught that a variable and its value are the same thing -- which is true, but only in languages that actually use variables. Python has no variables, but names instead.

About your comment, actually I can illustrate the point a bit better:

i = 5 
myList = [i, i, i] 
i = 6
print(myList) # myList is still [5, 5, 5].

You said you changed i to 6, but that is not what actually happened: i=6 means "I have a value, 6, and I want to name it i". The fact that you already used i as a name matters nothing to Python; it will just reassign the name, not change its value (that only works with variables).

You could say that in myList = [i, i, i], whatever value i currently points to (the number 5) gets three new names: myList[0], myList[1], myList[2]. That's the same thing that happens when you call a function: the arguments are given new names. But that is probably going against any intuition about lists ...

This explains the behavior in the example: you assign myList[0]=5, myList[1]=5, myList[2]=5, so it's no wonder they don't change when you reassign i. If i were something mutable, for example a list, then modifying that object in place (not reassigning i) would show up in all entries of myList too, because you just have different names for the same value!
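A short sketch of the difference between rebinding a name and mutating the object behind it (the names item and shared are made up for illustration):

```python
i = 5
my_list = [i, i, i]
i = 6                      # rebinds the name i; the list is untouched
print(my_list)             # [5, 5, 5]

item = [5]
shared = [item, item, item]
item.append(6)             # mutates the one list object all three slots refer to
print(shared)              # [[5, 6], [5, 6], [5, 6]]

item = [0]                 # rebinding the name again leaves shared alone
print(shared)              # [[5, 6], [5, 6], [5, 6]]
```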

The simple fact that you can use myList[0] on the left hand side of a = proves that it is indeed a name. I like to call = the assign name operator: it takes a name on the left and an expression on the right, then evaluates the expression (calls functions, looks up the values behind names) until it has a value, and finally gives the name to the value. It does not change anything.

For Mark's comment about compiling functions:

Well, references (and pointers) only make sense when we have some sort of addressable memory. The values are stored somewhere in memory and references lead you to that place. Using a reference means going to that place in memory and doing something with it. The problem is that none of these concepts are used by Python!

The Python VM has no concept of memory - values float somewhere in space and names are little tags connected to them (by a little red string). Names and values exist in separate worlds!

This makes a big difference when you compile a function. If you have references, you know the memory location of the object you refer to. Then you can simply replace the reference with this location. Names, on the other hand, have no location, so what you have to do (during runtime) is follow that little red string and use whatever is on the other end. That is the way Python compiles functions: wherever there is a name in the code, it adds an instruction that will figure out what that name stands for.

So basically Python does fully compile functions, but names are compiled as lookups in the nesting namespaces, not as some sort of reference to memory.

When you use a name, the Python compiler will try to figure out which namespace it belongs to. This results in an instruction to load that name from the namespace it found.

Which brings you back to your original problem: in lambda x: i ** x, the i is compiled as a lookup in the makeActions namespace (because i was used there). Python has no idea, nor does it care, about the value behind it (it does not even have to be a valid name). Once that code runs, the i gets looked up in its original namespace and gives the more or less expected value.
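This can be checked in CPython with a rough sketch like the following (make_actions is the question's function under a hypothetical name). The function body is kept as compiled bytecode rather than source text, and i is recorded as a free variable to be resolved when the lambda is called:

```python
import dis

def make_actions():
    acts = []
    for i in range(5):
        acts.append(lambda x: i ** x)
    return acts

f = make_actions()[0]

# The function body is stored as compiled bytecode, not as source text.
print(type(f.__code__.co_code))   # <class 'bytes'>

# i is listed as a free variable: a name resolved through the
# closure when the lambda runs, not a value baked into the code.
print(f.__code__.co_freevars)     # ('i',)

# Disassembly of the lambda; the exact opcode names vary between
# Python versions, so no particular output is assumed here.
dis.dis(f)
```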

THC4k
`i` will be 4, not 6 by then.
carl
I think this way of thinking of references is inconsistent with the way references usually work in Python. Consider: i = 5; myList = [i, i, i]; i = 6; print(myList); Although i has changed to 6, myList is still [5, 5, 5]. Creation of a function, to me, should be similar to creation of a list. The index references of the list point to a spot in memory upon creation of the list. Why don't the variables of a lambda function point to spots in memory upon creation of a lambda function? (Sorry I can't seem to do markdown in a comment.)
Mark
@Mark - In your example `i = 5; myList = [i, i, i]; i = 6;` the value of `i` will be resolved and assigned to a new name.
ChaosPandion
@ChaosPandion - Shouldn't the case with lambda functions / closures do the same thing? Whenever I define an object (including functions) using the reference i, Python should look up where i points to in memory right?
Mark
@Mark - The problem is that you are not immediately calling the lambda so the name `i` will remain unresolved until you do. By that point the value of `i` is already 4.
ChaosPandion
@THC4k - When I was writing an implementation of ECMAScript I was surprised to find out that lists work exactly as you said. Basically the list object would have 3 named values `0`, `1`, `2`. Now it could be different in Python but I get the feeling that it isn't.
ChaosPandion
@THC4k - I understand the difference between references and objects and I realize that myList[0] is a reference. That wasn't the source of my confusion. However I think I was mistaken in the way Python builds functions in memory. I was thinking that when encountering a def or a lambda, Python would generate all the necessary machine instructions correlating to that function and save it somewhere in memory. Now I think it's more like Python saves the function as a text string (and bundles it with references needed for the closure) and re-parses it each time the function is called.
Mark
@ChaosPandion - I'm sure Python lists are not just Python dictionaries with integer keys, as is the case with Javascript. I think I read that they are implemented like Java's ArrayList. I think THC4k was trying to make the point that myList[n] is a reference, not that n is the name of a key in a Python dictionary.
Mark
@Mark - Big surprise that I am wrong, take notice of the lack of Python tags under my belt. :)
ChaosPandion
Good work on this answer. It has improved the clarity of my existing knowledge on the subject.
ChaosPandion
This bites the hardest when you try, for example, to make a list of empty lists. For example, `decks = [[]]*len(players)`; `for deck in decks:` `for i in range(3):` `deck.append(random.choice(cards))` does _not_ result in three cards given to each player, because every entry of `decks` is the same list.
badp
@ChaosPandion: Thanks, it's a complicated topic and I hoped this explains it a bit. The bit about names for list items is not the way it's actually implemented, but the behavior is the same. So thinking about lists this way can be useful sometimes. @Mark You were not far off with your idea of how Python compiles functions, but the subtle difference between names and references led you the wrong way. Maybe the bit about how Python compiles in contrast with the way static languages compile was useful?
THC4k
@THC4k - It's a little clearer to me now. It seems that the namespace of makeActions is not entirely garbage collected after execution because Python sees that a lambda function has a reference to the name i.
Mark
+3  A: 

What happens when you create a closure:

  • The closure is constructed with a pointer to the frame (or roughly, block) that it was created in: in this case, the for block.
  • The closure actually assumes shared ownership of that frame, by incrementing the frame's ref count and stashing the pointer to that frame in the closure. That frame, in turn, keeps around references to the frames it was enclosed in, for variables that were captured further up the stack.
  • The value of i in that frame keeps changing as long as the for loop is running – each assignment to i updates the binding of i in that frame.
  • Once the for loop exits, the frame is popped off the stack, but it isn't thrown away as it might usually be! Instead, it's kept around because the closure's reference to the frame is still active. At this point, though, the value of i is no longer updated.
  • When the closure is invoked, it picks up whatever value of i is in the parent frame at the time of invocation. Since in the for loop you create closures, but don't actually invoke them, the value of i upon invocation will be the last value it had after all the looping was done.
  • Future calls to makeActions will create different frames. You won't reuse the for loop's previous frame, or update that previous frame's i value, in that case.

In short: frames are garbage-collected just like other Python objects, and in this case, an extra reference is kept around to the frame corresponding to the for block so it doesn't get destroyed when the for loop goes out of scope.

To get the effect you want, you need to have a new frame created for each value of i you want to capture, and each lambda needs to be created with a reference to that new frame. You won't get that from the for block itself, but you could get that from a call to a helper function which will establish the new frame. See THC4k's answer for one possible solution along these lines.
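A rough way to check this model against CPython, assuming the question's function under the hypothetical name make_actions. As far as this sketch shows, the for loop does not get a scope of its own: the loop variable belongs to the call to make_actions, and what each lambda holds on to is a cell for i that is shared within one call and fresh for each new call:

```python
def make_actions():
    acts = []
    for i in range(5):
        acts.append(lambda x: i ** x)
    # The loop variable is still visible here: the for loop did not
    # introduce a scope of its own.
    return acts, i

first, last_i = make_actions()
second, _ = make_actions()

print(last_i)  # 4
# Within one call, every lambda shares the same cell for i.
print(first[0].__closure__[0] is first[4].__closure__[0])   # True
# A separate call to make_actions gets a separate cell.
print(first[0].__closure__[0] is second[0].__closure__[0])  # False
```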

Owen S.
So for loops are actually objects in Python? Could you please recommend some resources where I could learn about frames? This answer is more along the lines of what I was looking for, thanks.
Mark
Owen S.
A: 

I thought that when a function exits, all of its local references disappear.

Except for those locals which are closed over in a closure. Those do not disappear, even when the function to which they are local has returned.
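A minimal sketch of a closed-over local surviving its function's return (make_counter and increment are made-up names for illustration):

```python
def make_counter():
    count = 0  # local to make_counter

    def increment():
        nonlocal count   # refer to make_counter's count, not a new local
        count += 1
        return count

    return increment

counter = make_counter()   # make_counter has already returned here
print(counter())  # 1
print(counter())  # 2
# The closed-over local is still reachable through the closure's cell.
print(counter.__closure__[0].cell_contents)  # 2
```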

Justice