I got an unexpected closure when creating a nested class. I suspect that this is related to metaclasses, super, or both. It is definitely related to how closures get created. I am using Python 2.7.

Here are five simplified examples that demonstrate the same problem that I am seeing (they all build off the first):

EXAMPLE 1:

class Metaclass(type): 
    def __init__(self, name, bases, dict): 
        self.CONST = 5 

class Base(object): 
    __metaclass__=Metaclass 
    def __init__(self): 
        "Set things up." 

class Subclass(Base):
    def __init__(self, name):
        super(Subclass, self).__init__(self)
        self.name = name
    def other(self, something): pass

class Test(object):
    def setup(self):
        class Subclass(Base):
            def __init__(self, name):
                super(Subclass, self).__init__(self)
                self.name = name
            def other(self, something): pass
        self.subclass = Subclass
        class Subclass2(Base):
            def __init__(self, name):
                super(Subclass, self).__init__(self)
        self.subclass2 = Subclass2

"0x%x" % id(Metaclass)
# '0x8257f74'
"0x%x" % id(Base)
# '0x825814c'
t=Test()
t.setup()
"0x%x" % id(t.subclass)
# '0x8258e8c'
"0x%x" % id(t.subclass2)
# '0x825907c'
t.subclass.__init__.__func__.__closure__
# (<cell at 0xb7d33d4c: Metaclass object at 0x8258e8c>,)
t.subclass.other.__func__.__closure__
# None
t.subclass2.__init__.__func__.__closure__
# (<cell at 0xb7d33d4c: Metaclass object at 0x8258e8c>,)
Subclass.__init__.__func__.__closure__
# None

EXAMPLE 2:

class Test2(object):
    def setup(self):
        class Subclass(Base):
            def __init__(self, name):
                self.name = name
            def other(self, something): pass
        self.subclass = Subclass

t2=Test2()
t2.setup()
t2.subclass.__init__.__func__.__closure__
# None

EXAMPLE 3:

class Test3(object):
    def setup(self):
        class Other(object):
            def __init__(self): 
                super(Other, self).__init__()
        self.other = Other
        class Other2(object):
            def __init__(self): pass
        self.other2 = Other2

t3=Test3()
t3.setup()
"0x%x" % id(t3.other)
# '0x8259734'
t3.other.__init__.__func__.__closure__
# (<cell at 0xb7d33e54: type object at 0x8259734>,)
t3.other2.__init__.__func__.__closure__
# None

EXAMPLE 4:

class Metaclass2(type): pass

class Base2(object): 
    __metaclass__=Metaclass2 
    def __init__(self): 
        "Set things up." 

class Base3(object): 
    __metaclass__=Metaclass2 

class Test4(object):
    def setup(self):
        class Subclass2(Base2):
            def __init__(self, name):
                super(Subclass2, self).__init__(self)
        self.subclass2 = Subclass2
        class Subclass3(Base3):
            def __init__(self, name):
                super(Subclass3, self).__init__(self)
        self.subclass3 = Subclass3
        class Subclass4(Base3):
            def __init__(self, name):
                super(Subclass4, self).__init__(self)
        self.subclass4 = Subclass4

"0x%x" % id(Metaclass2)
# '0x8259d9c'
"0x%x" % id(Base2)
# '0x825ac9c'
"0x%x" % id(Base3)
# '0x825affc'
t4=Test4()
t4.setup()
"0x%x" % id(t4.subclass2)
# '0x825b964'
"0x%x" % id(t4.subclass3)
# '0x825bcac'
"0x%x" % id(t4.subclass4)
# '0x825bff4'
t4.subclass2.__init__.__func__.__closure__
# (<cell at 0xb7d33d04: Metaclass2 object at 0x825b964>,)
t4.subclass3.__init__.__func__.__closure__
# (<cell at 0xb7d33e9c: Metaclass2 object at 0x825bcac>,)
t4.subclass4.__init__.__func__.__closure__
# (<cell at 0xb7d33ddc: Metaclass2 object at 0x825bff4>,)

EXAMPLE 5:

class Test5(object):
    def setup(self):
        class Subclass(Base):
            def __init__(self, name):
                Base.__init__(self)
        self.subclass = Subclass

t5=Test5()
t5.setup()
"0x%x" % id(t5.subclass)
# '0x8260374'
t5.subclass.__init__.__func__.__closure__
# None

Here is what I understand (referencing examples):

  • Metaclasses are inherited, so Subclass gets Base’s metaclass.
  • Only __init__ is affected; the Subclass.other method is not (#1).
  • Removing Subclass.other does not make a difference (#1).
  • Removing self.name=name from Subclass.__init__ does not make a difference (#1).
  • The object in the closure cell is not a function.
  • The object is not Metaclass or Base, but some object of type Metaclass, just like Base is (#1).
  • The object is actually an object of the type of the nested Subclass (#1).
  • The closure cells for t.subclass.__init__ and t.subclass2.__init__ are the same, even though they are from two different classes (#1).
  • When I do not nest the creation of Subclass (#1) then there is no closure created.
  • When I do not call super(...).__init__ in Subclass.__init__ no closure is created (#2).
  • If I assign no __metaclass__ and inherit from object then the same behavior shows up (#3).
  • The object in the closure cell for t3.other.__init__ is t3.other (#3).
  • The same behavior happens if the metaclass has no __init__ (#4).
  • The same behavior happens if the Base has no __init__ (#4).
  • The closure cells for the three subclasses in example 4 are all different and each matches the corresponding class (#4).
  • When super(...).__init__ is replaced with Base.__init__(self), the closure disappears (#5).

Here is what I do not understand:

  • Why does a closure get set for __init__?
  • Why doesn't the closure get set for other?
  • Why is the object in the closure cell set to the class to which __init__ belongs?
  • Why does this only happen when super(...).__init__ is called?
  • Why doesn't this happen when Base.__init__(self) is called?
  • Does this actually have anything at all to do with using metaclasses (probably, since the default metaclass is type)?

Thanks for the help!

-eric

(Update) Here is what I found next (based on Jason's insight):

def something1():
    print "0x%x" % id(something1)
    def something2():
        def something3():
            print "0x%x" % id(something1)
            print "0x%x" % id(something2)
            print "0x%x" % id(something3)
        return something3
    return something2

something1.__closure__
# None
something1().__closure__
# 0xb7d4056c
# (<cell at 0xb7d33eb4: function object at 0xb7d40df4>,)
something1()().__closure__
# 0xb7d4056c
# (<cell at 0xb7d33fec: function object at 0xb7d40e64>, <cell at 0xb7d33efc: function object at 0xb7d40e2c>)
something1()()()
# 0xb7d4056c
# 0xb7d4056c
# 0xb7d40e9c
# 0xb7d40ed4

First, a function's name is in scope within its own body. Second, a nested function gets a closure cell for every name it references that is local to an enclosing function, and that includes the names of functions defined there.

I hadn't realized that the function name was in scope like that. The same goes for classes. When a class is defined within a function's scope, any reference to that class name inside the class's methods causes the class to be bound in a closure on that method's function, like so:

def test():
    class Test(object):
        def something(self):
            print Test
    return Test

test()
# <class '__main__.Test'>
test().something.__func__.__closure__
# (<cell at 0xb7d33c2c: type object at 0x825e304>,)
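
To confirm what the cell holds, here is a minimal sketch. The names make_class, Local, and describe are made up for illustration, and the getattr fallback is only there so the same snippet runs on Python 2.7 (where the class attribute is an unbound method) and on later versions (where it is a plain function).

```python
def make_class():
    class Local(object):
        def describe(self):
            # Referring to Local makes it a free variable of describe().
            return Local
    return Local

cls = make_class()

# Unwrap the unbound method on Python 2; on Python 3 this is already a function.
func = getattr(cls.describe, "__func__", cls.describe)

# Names the function pulls from enclosing function scopes.
print(func.__code__.co_freevars)

# The single closure cell holds the class object itself.
print(func.__closure__[0].cell_contents is cls)
```

The cell is filled in when the class statement finishes and the name is bound in the enclosing function, which is why the method sees the finished class.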

However, the class body is executed before the name Test is bound in the enclosing scope, so referring to Test directly in the class body fails:

def test():
    class Test(object):
        SELF=Test
        def something(self):
            print Test
    return Test

# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
#   File "<stdin>", line 2, in test
#   File "<stdin>", line 3, in Test
# NameError: free variable 'Test' referenced before assignment in enclosing scope
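
The traceback can be read as a timing problem: the class body runs before the class statement binds the name in the enclosing function, while method bodies only run later. A small sketch of the failure next to one possible workaround (setting the attribute after the class statement) follows. The function names broken and working are made up for this example.

```python
def broken():
    class Test(object):
        # The class body runs now, before the name Test is bound in broken().
        SELF = Test
    return Test

def working():
    class Test(object):
        def something(self):
            # Runs only when called, long after Test has been bound.
            return Test
    # The class statement has finished, so Test is bound and can be used.
    Test.SELF = Test
    return Test

try:
    broken()
    failed = False
except NameError as exc:
    failed = True
    print("NameError: %s" % exc)

cls = working()
print(cls.SELF is cls)
print(cls().something() is cls)
```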

Good stuff!

+3  A: 

Why does a closure get set for __init__?

It refers to a local variable (namely Subclass) in the enclosing function (namely setup).

Why doesn't the closure get set for other?

Because it doesn't refer to any local variables (or parameters) in any enclosing functions.

Why is the object in the closure cell set to the class to which __init__ belongs?

That is the value of the enclosing variable being referred to.

Why does this only happen when super(...).__init__ is called?

Why doesn't this happen when Base.__init__(self) is called?

Because Base is not a local variable in any enclosing function.

Does this actually have anything at all to do with using metaclasses?

No.
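
A rough way to check these answers, with no metaclass involved, is to compare the two call styles side by side. This is only a sketch: the names UsesSuper, UsesBase, and raw are invented here, and the getattr fallback exists so the snippet runs on Python 2.7 as well as later versions.

```python
class Base(object):
    def __init__(self):
        pass

def setup():
    class UsesSuper(Base):
        def __init__(self):
            # UsesSuper is a local variable of setup(), so it is a free
            # variable here and ends up in a closure cell.
            super(UsesSuper, self).__init__()

    class UsesBase(Base):
        def __init__(self):
            # Base is a module-level name, so it is looked up as a global.
            Base.__init__(self)

    return UsesSuper, UsesBase

def raw(method):
    # Unwrap the unbound method on Python 2; a no-op on Python 3.
    return getattr(method, "__func__", method)

uses_super, uses_base = setup()

# Membership test rather than equality, since the interpreter may add
# other free variables of its own.
print("UsesSuper" in raw(uses_super.__init__).__code__.co_freevars)
print(raw(uses_base.__init__).__closure__)
```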

Jason Orendorff
Spot on. That would explain why the closure for `Test.subclass2.__init__` is the same as the one for `Test.subclass.__init__`. I have a typo in `Test.subclass2.__init__` calling `super(Subclass, self).__init__()` when it should be `super(Subclass2, self).__init__()`. So I see how by referencing Subclass (or similar) in the super call I am pulling it into the closure. Seems funny then that the same does not happen when I call `Base.__init__(self)` in example 5. Is that because Base is a global?
Eric Snow
It's odd to me that it works like this with class definitions. Unexpected.
Eric Snow
"[...] Is that because Base is a global?" Yes.
Jason Orendorff
"It's odd to me that it works like this with class definitions. Unexpected." Class names and function names in Python are just like any other variable name, whether they're local or global. You can think of `def` and `class` as just fancy assignment statements.
Jason Orendorff
Interesting. Guess I had never realized that a function's name is in scope to the function's body.
Eric Snow
Something is funny though. When you define a class (or function) in the global scope no closure gets created, even though you are referring to the class name inside one of its methods and it is a free variable in that method's local scope.
Eric Snow
I presume that the class name is put into the scope in which it is defined (and the class object is bound to it after the definition body is executed). If it is some function's scope then a closure gets created on the method. If it is the global scope then no closure gets created. I would have expected no closure to get created in either instance.
Eric Snow
I would have thought that class and function names would have been special cases for the generation of closures (or non-generation rather).
Eric Snow
Globals go through `func_globals` rather than `func_closure`. The global namespace is quite different from local namespaces in Python. Not just as an implementation detail.
Jason Orendorff
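
For what it's worth, the difference is easy to see on a module-level class. This is a sketch that assumes it is run at module level. Top and who are made-up names, and `__globals__` is the spelling of `func_globals` that also works on newer versions.

```python
class Top(object):
    def who(self):
        # Top is a module-level name here, so this is a global lookup.
        return Top

# Unwrap the unbound method on Python 2; a no-op on Python 3.
func = getattr(Top.who, "__func__", Top.who)

# No closure cells at all for a module-level class.
print(func.__closure__)

# The name is listed among the global/attribute names the code uses...
print("Top" in func.__code__.co_names)

# ...and is resolved through the function's globals dictionary at call time.
print(func.__globals__["Top"] is Top)
```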
"I would have thought that class and function names would have been special cases for the generation of closures" I don't understand. What did you expect? Different language semantics? Or just a different implementation technique?
Jason Orendorff
Yeah, I see it now. The only way the `__init__` method knows about its own class is through either self, or through a closure. The function does not know anything about the class in which the function is wrapped as a method. So in this case the closure is the only way for it to know about the class. I was forgetting the distinction between the method and the function. In the end `super` caused the situation that led me on this trip.
Eric Snow