views:

167

answers:

2

Hi,

I have a problem with my python application, and I think it's related to the python garbage collection, even if I'm not sure...

The problem is that my application takes a lot of time to exit and to switch to one function to the next one.

In my application I handle very large dictionaries, containing thousands of large objects which are instantiated from wrapped C++ classes.

I put some timestamp outputs in my program, and I saw that at the end of each function, when objects created inside the function should go out of scope, a lot of time is spent by the interpreter before calling the next function. And I observe the same problem at the end of the application, when the program should exit: a lot of time (~ hours!) is spent between the last timestamp on screen and the appearance of the fresh prompt.

The memory usage is stable, so I don't really have memory leaks.

Any suggestions?

Can be the garbage collection of thousands of large C++ objects that slow?

Is there a method to speed up that?

Many thanks for your help,

Ric.

UPDATE:

Thanks a lot for all your answers, you gave me a lot of hints to debug my code :-)

I use Python 2.6.5 on Scientific Linux 5, a customized distribution based on Red Hat Enterprise 5. And actually I'm not using SWIG to get Python bindings for our C++ code, but the Reflex/PyROOT framework. I know, it's not very known outside particle physics (but still open source and freely available) and I have to use it because it's the default for our main framework.

And in this context the DEL command from the Python side does not work, I had already tried it. DEL only deletes the python variable linked to the C++ object, not the object itself in memory, which is still owned by the C++ side...

...I know, it's not-standard I guess, and a bit complicated, sorry :-P

But following your hints, I'll profile my code and I'll come back to you with more details, as you suggested.

Thanks a lot again for all your help! :-)

Ric.

ADDITIONAL UPDATE:

Ok, following your suggestions, I instrumented my code with cProfile, and I discovered that actually the gc.collect() function is the function taking the most of the running time!!

Here the output from cProfile + pstats print_stats():



    >>> p.sort_stats("time").print_stats(20)
Wed Oct 20 17:46:02 2010    mainProgram.profile

         547303 function calls (542629 primitive calls) in 548.060 CPU seconds

   Ordered by: internal time
   List reduced from 727 to 20 due to restriction 

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        4  345.701   86.425  345.704   86.426 {gc.collect}
        1  167.115  167.115  200.946  200.946 PlotD3PD_v3.2.py:2041(PlotSamplesBranches)
       28   12.817    0.458   13.345    0.477 PlotROOTUtils.py:205(SaveItems)
     9900   10.425    0.001   10.426    0.001 PlotD3PD_v3.2.py:1973(HistoStyle)
     6622    5.188    0.001    5.278    0.001 PlotROOTUtils.py:403(__init__)
       57    0.625    0.011    0.625    0.011 {built-in method load}
      103    0.625    0.006    0.792    0.008 dbutils.py:41(DeadlockWrap)
       14    0.475    0.034    0.475    0.034 {method 'dump' of 'cPickle.Pickler' objects}
     6622    0.453    0.000    5.908    0.001 PlotROOTUtils.py:421(CreateCanvas)
    26455    0.434    0.000    0.508    0.000 /opt/root/lib/ROOT.py:215(__getattr__)
[...]




>>> p.sort_stats("cumulative").print_stats(20)
Wed Oct 20 17:46:02 2010    mainProgram.profile

         547303 function calls (542629 primitive calls) in 548.060 CPU seconds

   Ordered by: cumulative time
   List reduced from 727 to 20 due to restriction 

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.001    0.001  548.068  548.068 PlotD3PD_v3.2.py:2492(main)
        4    0.000    0.000  346.756   86.689 /usr/lib//lib/python2.5/site-packages/guppy/heapy/Use.py:171(heap)
        4    0.005    0.001  346.752   86.688 /usr/lib//lib/python2.5/site-packages/guppy/heapy/View.py:344(heap)
        1    0.002    0.002  346.147  346.147 PlotD3PD_v3.2.py:2537(LogAndFinalize)
        4  345.701   86.425  345.704   86.426 {gc.collect}
        1  167.115  167.115  200.946  200.946 PlotD3PD_v3.2.py:2041(PlotBranches)
       28   12.817    0.458   13.345    0.477 PlotROOTUtils.py:205(SaveItems)
     9900   10.425    0.001   10.426    0.001 PlotD3PD_v3.2.py:1973(HistoStyle)
    13202    0.336    0.000    6.818    0.001 PlotROOTUtils.py:431(PlottingCanvases)
     6622    0.453    0.000    5.908    0.001 /root/svn_co/rbianchi/SoftwareDevelopment

[...]

>>>

So, in both outputs, sorted by "time" and by "cumulative" time respectively, gc.collect() is the function consuming the most of the running time of my program! :-P

And this is the output of the memory profiler Heapy, just before returning the main() program.


memory usage before return:
Partition of a set of 65901 objects. Total size = 4765572 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0  25437  39  1452444  30   1452444  30 str
     1   6622  10   900592  19   2353036  49 dict of PlotROOTUtils.Canvas
     2    109   0   567016  12   2920052  61 dict of module
     3   7312  11   280644   6   3200696  67 tuple
     4   6622  10   238392   5   3439088  72 0xa4ab74c
     5   6622  10   185416   4   3624504  76 PlotROOTUtils.Canvas
     6   2024   3   137632   3   3762136  79 types.CodeType
     7    263   0   129080   3   3891216  82 dict (no owner)
     8    254   0   119024   2   4010240  84 dict of type
     9    254   0   109728   2   4119968  86 type
  Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
    10   1917   3   107352   2   4264012  88 function
    11   3647   5   102116   2   4366128  90 ROOT.MethodProxy
    12    148   0    80800   2   4446928  92 dict of class
    13   1109   2    39924   1   4486852  93 __builtin__.wrapper_descriptor
    14    239   0    23136   0   4509988  93 list
    15     87   0    22968   0   4532956  94 dict of guppy.etc.Glue.Interface
    16    644   1    20608   0   4553564  94 types.BuiltinFunctionType
    17    495   1    19800   0   4573364  94 __builtin__.weakref
    18     23   0    11960   0   4585324  95 dict of guppy.etc.Glue.Share
    19    367   1    11744   0   4597068  95 __builtin__.method_descriptor

Any idea why, or how to optimize the garbage collection?

Is there any more detailed check I can do?

And again...thanks a lot for all your so valuable help!! :-)

Ric.

+5  A: 

Yes, it could be garbage collection, but it could also be some synchronisation with the C++ code, or something completely different (hard to say without code).

Anyway, you should have a look at SIG for development of Python/C++ integration to find issues and how to speed up things.

kriss
Hi kriss, apparently it has a garbage collector problem. I updated my question with the output of cProfile, and it seems that gc.collect takes the most of the running time...I'll take a look at the SIG Python/C++ as you suggested, to see if it is a known problem in some contexts.
rmbianchi
raw bet: the gc is probably in a case where refcounting doesn't work well (has to use a more complex strategy to reclaim memory). Any loops in your data structure (cyclic references) ? If it's the source of the problem cutting them "by hand" could be a solution, it would make gc happier (hence faster).
kriss
+2  A: 

If your problem really is the garbage collection, try explicitly freeing your objects when you're done with them using del().

In general, this doesn't sound like a garbage collection problem, unless we're talking about terabytes of memory.

I agree with S.Lott... profile your app, then bring code snippets and the results of that back and we can be much more helpful.

Paul McMillan
`del` doesn't free anything. It just removes a variable from the current scope, i.e. removes one reference. But Python is refcounted (a more sophisticated GC does exist and is run, but only on cyclic references) - it doesn't really matter if a bunch of objects get gc'd at the end of the function or in little chunks when youthink you're done.
delnan
In general, yes. In the pathological case, freeing little chunks might help. Habitually using `del` all over the place tends to be a sign you're not programming in Python.
Paul McMillan
Thanks, Paul and Delnan. Actually I also tried to use del(), but in this context does not work. As I told in the update of my question, I'm using an open source framework called ROOT (http://root.cern.ch), with its own python bindings system (called Reflex), and even if del() deletes all the python references to a C++ object, the object itself stays in memory... But even though now I found a way to explicitely delete them as well, the gc.collect function seems to take the most of the running time... Any suggestions for further checks? Thanks a lot again for your help
rmbianchi