I'm having memory issues while using a Python script to issue a large Solr query. I'm using the solrpy library to interface with the Solr server. The query returns approximately 80,000 records, and immediately after it is issued the Python process's memory footprint, as viewed through top, balloons to ~190MB:
  PID USER      PR  NI  VIRT   RES   SHR  S %CPU %MEM   TIME+   COMMAND
 8225 root      16   0  193m  189m  3272  S  0.0 11.2  0:11.31  python
...
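For reference, the query is issued roughly like this (a minimal sketch; the URL, query string, and row count are placeholders rather than my exact values):

    import solr

    # Placeholder URL and query; rows is set high enough to pull back
    # all ~80,000 matching records in a single response.
    conn = solr.SolrConnection('http://localhost:8983/solr')
    response = conn.query('*:*', rows=80000)
    print(len(response.results))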
At this point, the heap profile as viewed through heapy looks like this:
Partition of a set of 163934 objects. Total size = 14157888 bytes.
 Index   Count    %      Size    %  Cumulative   %  Kind (class / dict of class)
     0   80472   49   7401384   52     7401384  52  unicode
     1   44923   27   3315928   23    10717312  76  str
 ...
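(For completeness, the profile is captured with guppy's hpy, something along these lines:)

    from guppy import hpy

    hp = hpy()
    # ... issue the Solr query here, keeping a reference to the results ...
    print(hp.heap())   # prints a partition table like the one above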
The unicode objects represent the unique identifiers of the records returned by the query. One thing to note is that the total heap size is only 14MB, while Python is occupying 190MB of physical memory. Once the variable storing the query results falls out of scope, the heap profile correctly reflects the garbage collection:
Partition of a set of 83586 objects. Total size = 6437744 bytes.
 Index   Count    %      Size    %  Cumulative   %  Kind (class / dict of class)
     0   44928   54   3316108   52     3316108  52  str
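Put together, the lifecycle looks roughly like this (same placeholder URL and query as above): once run_query() returns, nothing references the result set any longer, and heapy's totals drop accordingly:

    import gc
    import solr
    from guppy import hpy

    hp = hpy()

    def run_query():
        # The result set is only referenced inside this function's scope
        conn = solr.SolrConnection('http://localhost:8983/solr')  # placeholder
        response = conn.query('*:*', rows=80000)                  # placeholder
        print(hp.heap())   # the ~14MB partition, while 'response' is alive

    run_query()
    gc.collect()           # the results are unreachable after the call returns
    print(hp.heap())       # the smaller ~6.4MB partition shown above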
However, the memory footprint remains unchanged:
  PID USER      PR  NI  VIRT   RES   SHR  S %CPU %MEM   TIME+   COMMAND
 8225 root      16   0  195m  192m  3432  S  0.0 11.3  0:13.46  python
...
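To put the two numbers side by side from inside the process, here's a rough, Linux-specific sketch (VmRSS is the same figure top reports in the RES column):

    from guppy import hpy

    def rss_kb():
        # Current resident set size from the kernel (Linux-specific);
        # this is the number top displays as RES.
        with open('/proc/self/status') as f:
            for line in f:
                if line.startswith('VmRSS:'):
                    return int(line.split()[1])

    hp = hpy()
    print('RSS reported by the OS: %d kB' % rss_kb())
    print('heapy total size:       %d bytes' % hp.heap().size)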
Why is there such a large disparity between Python's physical memory footprint and the size of the Python heap?