I am working on a library that loads files (HDF5, via PyTables) into an object structure. The actual classes used for the structure are stored as a source string in the HDF5 file and then loaded in this fashion:

class NamespaceHolder(dict):
    # stmt is the source code holding all the class defs
    def execute(self, stmt):
        exec stmt in self

The problem is that loading multiple classes like this causes objects, namely the actual class definitions, to appear in the uncollectable part of the garbage collection. I can also load this into a global dictionary, but the problem of orphaned classes remains. Is there any way to unload the classes?

The main problem is the class.__mro__ attribute, a tuple that contains a reference back to the class itself, causing circular references that the garbage collector can't handle.
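The self-reference described above can be inspected directly. This is a minimal sketch in Python 3 syntax (the question's own code is Python 2); the variable names mro, self_ref and mro_is_referent are only illustrative. It shows the class appearing inside its own __mro__ tuple and checks whether that tuple is among the objects the class refers to.

```python
import gc

class DummyA(object):
    pass

# The method resolution order is a tuple that starts with the class itself.
mro = DummyA.__mro__
self_ref = mro[0] is DummyA
print(mro == (DummyA, object), self_ref)

# gc.get_referents() returns the objects a given object refers to directly.
# If the __mro__ tuple is among them, class -> tuple -> class forms a cycle.
mro_is_referent = any(r is mro for r in gc.get_referents(DummyA))
print(mro_is_referent)
```

A cycle like this only tells you that reference counting alone will not free the class; whether the cyclic collector frees it is a separate question.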

Here is a small test case to see for yourselves:

import gc

if __name__ == "__main__":
    gc.enable()
    gc.set_debug(gc.DEBUG_LEAK)

    code = """
class DummyA(object):
    pass
"""
    context = {}

    exec code in context
    exec code in context

    gc.collect()
    print len(gc.garbage)

Just a note: I have already argued against parsing text from a file to create classes, but apparently they are set on using it here and see some benefits I don't, so moving away from this solution isn't feasible now.

+1  A: 

I think the GC can cope with circular references; what you'll need to do, however, is remove the reference from the globals() dict:

try:
    del globals()['DummyA']
except KeyError:
    pass

otherwise there will be a non-circular reference to the class object that will stop it being cleaned up.
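One way to check this claim is to watch the class through a weak reference, which does not keep it alive. The following is a sketch in Python 3 syntax, where exec is a function, so exec(code, context) stands in for the question's exec code in context. It uses a plain dict as the namespace instead of globals(), and the names first, second, first_collected and second_collected are only illustrative.

```python
import gc
import weakref

code = """
class DummyA(object):
    pass
"""

context = {}
exec(code, context)
# Watch the first class without holding a strong reference to it.
first = weakref.ref(context["DummyA"])

# Running the code again rebinds the name, orphaning the first class.
exec(code, context)
gc.collect()
first_collected = first() is None
print(first_collected)

# The second class is still reachable through the namespace dict.
second = weakref.ref(context["DummyA"])
del context["DummyA"]  # drop the last strong reference
gc.collect()
second_collected = second() is None
print(second_collected)
```

If both weak references come back dead, the cyclic collector has reclaimed the class objects despite their internal cycles, provided no debug flag is asking it to keep them.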

workmad3
Take a look at http://stackoverflow.staale.org/919924.png. This is the cyclic graph that comes from running the above code. The tuple referred to here contains (DummyA, object) and can't be removed or altered, so the DummyA class can't be collected.
Staale
+1  A: 

The gc.set_debug(gc.DEBUG_LEAK) call is what causes the leak. Try this:

import gc

def foo():
    code = """
class DummyA(object):
    pass
"""
    context = {}
    exec code in context
    exec code in context

    gc.collect()
    print len(gc.garbage), len(gc.get_objects())

gc.enable()
foo(); foo() # amount of objects doesn't increase
gc.set_debug(gc.DEBUG_LEAK)
foo() # leaks
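The reason the debug flag changes the result is that gc.DEBUG_LEAK includes gc.DEBUG_SAVEALL, which tells the collector to append every unreachable object it finds to gc.garbage instead of freeing it. The sketch below, in Python 3 syntax, isolates that one flag so the other DEBUG_LEAK flags don't print to stderr; load_twice and garbage_after_load are hypothetical helper names, not part of the original code.

```python
import gc

code = """
class DummyA(object):
    pass
"""

def load_twice():
    # Define the class twice in a throwaway namespace, as in the question.
    context = {}
    exec(code, context)
    exec(code, context)

def garbage_after_load():
    load_twice()
    gc.collect()
    return len(gc.garbage)

# Start from a clean slate so earlier garbage does not skew the counts.
gc.collect()
gc.garbage.clear()

without_flag = garbage_after_load()

gc.set_debug(gc.DEBUG_SAVEALL)
with_flag = garbage_after_load()

# Restore normal behaviour and release the saved objects.
gc.set_debug(0)
gc.garbage.clear()
gc.collect()

print(without_flag, with_flag)
```

With the flag set, the orphaned classes and the objects in their cycles are parked in gc.garbage, so its length grows even though nothing is truly uncollectable.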
Ants Aasma
Thanks a lot. I did, however, notice that all of my leaks went away when I turned off gc.DEBUG_LEAK.
Staale