ansaurus

Question

Why am I leaking memory with this python loop?

Answer 1

+2 A:

I cannot reproduce any actual leak on my system, but I think your "every 100th iteration, 100 objects are freed" is you hitting the cache for compiled regular expressions (via the glob module). If you peek at re.py you'll see _MAXCACHE defaults to 100, and by default the entire cache is blown away once you hit that (in _compile). If you call re.purge() before your gc calls you will probably see that effect go away.

(note I'm only suggesting re.purge() here to check that cache is affecting your gc results. It should not be necessary to have that in your actual code.)

I doubt that fixes your massive memory increase problem though.

mzz 2010-02-02 12:55:47

Thanks for this - when I did what you suggested, the effect indeed went away, and the new objects per loop changed to 2. It doesn't fix the memory increase problem, but it's certainly going to help understand what's going on.

Andy 2010-02-02 17:21:10

Answer 2

+4 A:

I tracked this down to the fnmatch module. glob.glob calls fnmatch to actually perform the globbing, and fnmatch has a cache of regular expressions which is never cleared. So in this usage, the cache was growing continuously and unchecked. I've filed a bug against the fnmatch library [1].

[1]: http://bugs.python.org/issue7846 Python Bug

Andy 2010-02-03 16:01:09

I wonder how I managed to spot the similar cache in the re module but not this one! Perhaps I should subtract a point from my own answer for that...

mzz 2010-02-04 00:53:34

ansaurus

tags:

views:

answers:

Why am I leaking memory with this python loop?

related questions