I have a Python program that runs a series of experiments, with no data intended to be stored from one test to another. My code contains a memory leak which I am completely unable to find (I've looked at the other threads on memory leaks). Due to time constraints, I have had to give up on finding the leak, but if I were able to isolate each experiment, the program would probably run long enough to produce the results I need.

  • Would running each test in a separate thread help?
  • Are there any other methods of isolating the effects of a leak?

Detail on the specific situation

  • My code has two parts: an experiment runner and the actual experiment code.
  • Although no globals are shared between the code for running all the experiments and the code used by each experiment, some classes/functions are necessarily shared.
  • The experiment runner isn't just a simple for loop that can be easily put into a shell script. It first decides which tests need to be run given the configuration parameters, then runs the tests, then outputs the data in a particular way.
  • I tried manually calling the garbage collector in case the issue was simply that garbage collection wasn't being run, but this did not work.

Update

Gnibbler's answer has actually allowed me to find out that my ClosenessCalculation objects, which store all of the data used during each calculation, are not being killed off. I then used that to manually delete some links, which seems to have fixed the memory issues.

+2  A: 

I would simply refactor the experiments into individual functions (if not like that already) then accept an experiment number from the command line which calls the single experiment function.

Then just bodge up a shell script as follows:

#!/bin/bash

for expnum in 1 2 3 4 5 6 7 8 9 10 11 ; do
    python yourProgram ${expnum} otherParams
done

That way, you can leave most of your code as-is and this will clear out any memory leaks you think you have in between each experiment.

Of course, the best solution is always to find and fix the root cause of a problem but, as you've already stated, that's not an option for you.

Although it's hard to imagine a memory leak in Python, I'll take your word on that one - you may want to at least consider the possibility that you're mistaken there, however. Consider raising that in a separate question, something that we can work on at low priority (as opposed to this quick-fix version).

Update: Making community wiki since the question has changed somewhat from the original. I'd delete the answer but for the fact I still think it's useful - you could do the same in your experiment runner as I proposed for the bash script; you just need to ensure that the experiments run as separate processes so that whatever one experiment leaks is released when its process exits (if the memory leaks are in the runner itself, you're going to have to do root cause analysis and fix the bug properly).
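
For a runner that is itself written in Python, the same isolation might be sketched with the standard subprocess module. Everything named here is an assumption for illustration: CHILD_CODE stands in for "python yourProgram <expnum>", and the JSON-on-stdout convention for handing results back is just one possible choice.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for "python yourProgram <expnum>": a tiny child
# program that runs one experiment and prints its result as JSON.
CHILD_CODE = """
import json, sys
expnum = int(sys.argv[1])
result = {"experiment": expnum, "value": expnum * expnum}  # placeholder work
print(json.dumps(result))
"""


def run_experiment_isolated(expnum):
    """Run one experiment in a fresh interpreter and return its parsed result."""
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_CODE, str(expnum)],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the child exits non-zero
    )
    return json.loads(completed.stdout)


def run_all(expnums):
    """The runner keeps its own logic; only the experiments are isolated."""
    return [run_experiment_isolated(n) for n in expnums]
```

Anything a child process leaks is returned to the operating system when that child exits, so the runner's own memory use should stay flat as long as the leak is in the experiment code rather than in the runner.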

paxdiablo
I did consider just writing a shell script, but unfortunately my experimental code is much more complex than that
Casebash
@paxdiablo: I agree that this answer should be left as it could be helpful for anyone else who visits this question
Casebash
+3  A: 

Threads would not help. If you must give up on finding the leak, then the only solution to contain its effect is running a new process once in a while (e.g., when a test has left overall memory consumption too high for your liking -- you can determine VM size easily by reading /proc/self/status in Linux, and other similar approaches on other OS's).
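
As a rough, Linux-only sketch of that memory check: the VmSize line of /proc/self/status can be parsed with a regular expression. The function names and the 2 GiB threshold below are made up for illustration, not taken from the question.

```python
import re

MEMORY_LIMIT_KB = 2 * 1024 * 1024  # hypothetical threshold: 2 GiB


def parse_vm_size_kb(status_text):
    """Return the VmSize value in kB from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmSize:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def current_vm_size_kb():
    """Linux only: read this process's current virtual memory size."""
    with open("/proc/self/status") as status_file:
        return parse_vm_size_kb(status_file.read())


def should_restart():
    """True once this process has grown past the (assumed) limit."""
    size = current_vm_size_kb()
    return size is not None and size > MEMORY_LIMIT_KB
```

The runner could call should_restart() after each test and hand over to a fresh process when it returns True.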

Make sure the overall script takes an optional parameter to tell it what test number (or other test identification) to start from, so that when one instance of the script decides it's taking up too much memory, it can tell its successor where to restart from.

Or, more solidly, make sure that as each test is completed its identification is appended to some file with a well-known name. When the program starts, it begins by reading that file and thus knows which tests have already been run. This architecture is more solid because it also covers the case where the program crashes during a test; of course, to fully automate recovery from such crashes, you'll want a separate watchdog program and process to be in charge of starting a fresh instance of the test program when it determines the previous one has crashed (it could use subprocess for the purpose -- it also needs a way to tell when the sequence is finished, e.g. a normal exit from the test program could mean that, while any crash or exit with a status != 0 signifies the need to start a fresh instance).
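
A minimal sketch of that architecture, under stated assumptions: the file name completed_tests.txt, the function names, and the max_restarts cap are all hypothetical, and the test program is assumed to exit with status 0 only when every test has been run.

```python
import os
import subprocess
import sys

DONE_FILE = "completed_tests.txt"  # hypothetical well-known file name


def load_completed(path=DONE_FILE):
    """Return the set of test ids already recorded as finished."""
    if not os.path.exists(path):
        return set()
    with open(path) as done_file:
        return {line.strip() for line in done_file if line.strip()}


def mark_completed(test_id, path=DONE_FILE):
    """Append one finished test id and force it to disk before moving on."""
    with open(path, "a") as done_file:
        done_file.write(f"{test_id}\n")
        done_file.flush()
        os.fsync(done_file.fileno())


def pending_tests(all_tests, path=DONE_FILE):
    """Tests still to run, in their original order."""
    done = load_completed(path)
    return [test for test in all_tests if test not in done]


def watchdog(command, max_restarts=20):
    """Re-run `command` until it exits with status 0; return the attempt count."""
    for attempt in range(1, max_restarts + 1):
        if subprocess.call(command) == 0:
            return attempt
    raise RuntimeError("test program kept failing; giving up")
```

The watchdog would be started as something like watchdog([sys.executable, "run_tests.py"]), where run_tests.py is a placeholder for the real test program. That program would loop over pending_tests(...), call mark_completed(...) after each test, and exit with a non-zero status when it wants to be replaced by a fresh process.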

If these architectures appeal but you need further help implementing them, just comment to this answer and I'll be happy to supply example code -- I don't want to do it "preemptively" in case there are as-yet-unexpressed issues that make the architectures unsuitable for you. (It might also help to know what platforms you need to run on).

Alex Martelli
Thanks heaps for the offer, but I managed to find the leak
Casebash
+9  A: 

You can use something like this to help track down memory leaks:

>>> from collections import defaultdict
>>> from gc import get_objects
>>> before=defaultdict(int)
>>> after=defaultdict(int)
>>> for i in get_objects():before[type(i)]+=1
...

Now suppose the test leaks some memory:

>>> leaked_things=[[x] for x in range(10)]
>>> for i in get_objects():after[type(i)]+=1
... 
>>> print [(k,after[k]-before[k]) for k in after if after[k]-before[k]]
[(<type 'list'>, 11)]

The count is 11 because we have leaked one list containing 10 more lists.
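
The session above is Python 2. A rough Python 3 variant of the same idea, wrapped in functions and using collections.Counter, might look like the following. The function names are made up for illustration, and calling gc.collect() before each snapshot is an added assumption, meant to keep unreachable cycles that simply have not been collected yet from showing up as leaks.

```python
import gc
from collections import Counter


def count_live_objects():
    """Count GC-tracked objects by type, after collecting unreachable cycles."""
    gc.collect()
    return Counter(type(obj) for obj in gc.get_objects())


def growth(before, after):
    """Types whose live count increased, largest increase first."""
    # Counter subtraction keeps only the positive differences.
    return (after - before).most_common()


# Typical use around one experiment:
#   before = count_live_objects()
#   ... run the experiment ...
#   after = count_live_objects()
#   for obj_type, extra in growth(before, after):
#       print(obj_type, extra)
```

Expect a little noise in the result: the before snapshot is itself a live object by the time the after snapshot is taken, so it is the types that keep growing from one experiment to the next that are worth chasing.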

gnibbler
Wow, that is pretty useful. Although this really does belong on one of the "how do I find memory leaks" threads.
Casebash
Is it worth doing a garbage collection before comparing the objects?
Casebash
Thanks, I managed to use this to solve the problem.
Casebash
That's great. Fixing is better than working around it
gnibbler