I wrote a Python program that acts on a large input file to create a few million objects representing triangles. The algorithm is:

  1. read an input file
  2. process the file and create a list of triangles, represented by their vertices
  3. output the vertices in the OFF format: a list of vertices followed by a list of triangles. The triangles are represented by indices into the list of vertices

The requirement of OFF that I print out the complete list of vertices before I print out the triangles means that I have to hold the list of triangles in memory before I write the output to the file. In the meantime, I'm getting memory errors because of the size of the lists.
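For reference, the output step looks roughly like this (an illustrative sketch, not my actual code; write_off, vertices and triangles are placeholder names):

def write_off(path, vertices, triangles):
    # Both lists must be complete before anything is written: the header
    # needs the counts, and all vertices have to precede the triangles.
    with open(path, "w") as f:
        f.write("OFF\n")
        f.write("%d %d 0\n" % (len(vertices), len(triangles)))
        for x, y, z in vertices:
            f.write("%g %g %g\n" % (x, y, z))
        for a, b, c in triangles:
            f.write("3 %d %d %d\n" % (a, b, c))   # indices into the vertex list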

What is the best way to tell Python that I no longer need some of the data, and it can be freed?

+1  A: 

Python is garbage-collected, so if you reduce the size of your list, it will reclaim memory. You can also use the "del" statement to get rid of a variable completely:

biglist = list(range(10**7))   # stand-in for a large list of objects
# ... work with biglist ...
del biglist                    # drop the name; the objects can be collected once nothing else refers to them
Ned Batchelder
This is and isn't true. While decreasing the size of the list allows the memory to be reclaimed, there is no guarantee when this will happen.
No, but usually it will help. However, as I understand the question here, the problem is that he has to have so many objects that he runs out of memory before processing them all, if he reads them into a list. Deleting the list before he is done processing is unlikely to be a useful solution. ;)
Lennart Regebro
Also note that del doesn't guarantee that an object will be deleted. If there are other references to the object, it won't be freed.
Jason Baker
Wouldn't a low-memory/out-of-memory condition trigger an "emergency run" of the garbage collector?
Jeremy Friesner
+4  A: 

The del statement might be of use, but IIRC it isn't guaranteed to free the memory. The docs are here ... and an explanation of why the memory isn't released is here.

I have heard of people on Linux and Unix-type systems forking a Python process to do some work, getting the results and then killing it.
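A rough sketch of that pattern (Unix-only; the file name and the work done in the child are placeholders):

import os

pid = os.fork()                          # Unix/Linux only
if pid == 0:
    # Child process: do the memory-hungry work and write results to a file.
    with open("partial_results.txt", "w") as out:
        out.write("...results...\n")     # placeholder for the real output
    os._exit(0)                          # child exits; the OS reclaims all its memory
else:
    os.waitpid(pid, 0)                   # parent waits, then reads partial_results.txt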

This article has notes on the Python garbage collector, but I think the lack of direct memory control is the downside of managed memory.

Aiden Bell
Would IronPython and Jython be another option to avoid this problem?
voyager
@voyager: No, it wouldn't. And neither would any other language, really. The problem is that he reads large amounts of data into a list, and the data is too large to fit in memory.
Lennart Regebro
It would likely be *worse* under IronPython or Jython. In those environments, you're not even guaranteed the memory will be released if nothing else is holding a reference.
Jason Baker
+7  A: 

You can't explicitly free memory. What you need to do is to make sure you don't keep references to objects. They will then be garbage collected, freeing the memory.

In your case, when you need large lists, you typically need to reorganize the code, for example by using generators/iterators instead. That way you don't need to have the large lists in memory at all.

http://www.prasannatech.net/2009/07/introduction-python-generators.html
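A minimal sketch of the generator approach (the parsing logic and file names here are placeholders):

def read_triangles(path):
    # Yield one triangle at a time instead of building a list of millions.
    with open(path) as f:
        for line in f:
            a, b, c = line.split()[:3]        # placeholder parsing
            yield int(a), int(b), int(c)

with open("triangles.txt", "w") as out:       # made-up file names
    for tri in read_triangles("input.dat"):
        out.write("3 %d %d %d\n" % tri)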

Lennart Regebro
If this approach is feasible, then it's probably worth doing. But it should be noted that you can't do random access on iterators, which may cause problems.
Jason Baker
That's true, and if that is necessary, then accessing large datasets randomly is likely to require some sort of database.
Lennart Regebro
You can easily use an iterator to extract a random subset of another iterator.
S.Lott
True, but then you would have to iterate through everything to get the subset, which will be very slow.
Lennart Regebro
+9  A: 

According to http://docs.python.org/library/gc.html you can force the Garbage Collector to release unreferenced memory with gc.collect()
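For example (the list here is just a stand-in for the real data):

import gc

triangles = [(0, 1, 2)] * 10**6   # stand-in for a large list
del triangles                     # drop the only reference
gc.collect()                      # force a collection pass right away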

Havenard
nice [filling the lame 15 chars]
Aiden Bell
Things are garbage collected frequently anyway, except in some unusual cases, so I don't think that will help much.
Lennart Regebro
Maybe not frequently enough. He could be creating these "million objects" in a matter of milliseconds.
Havenard
Yeah, but the garbage collector is being run every 700th time you create an object, so it will have run thousands of times during those milliseconds.
Lennart Regebro
Aiden, thank deity for the 15 char limit. Without it, there would be lots of lame comments like "lol" and "nice".
avakar
lol
Lennart Regebro
In general, gc.collect() is to be avoided. The garbage collector knows how to do its job. That said, if the OP is in a situation where he is suddenly deallocating a *lot* of objects (like in the millions), gc.collect may prove useful.
Jason Baker
A better approach is to create smaller functions so each variable has a shorter lifetime between creation and being dereferenced when the namespace is removed at function exit.
S.Lott
@avakar ... settle down. Sometimes a 'nice' is appropriate.
Aiden Bell
+13  A: 

Unfortunately (depending on your version and release of Python) some types of objects use "free lists" which are a neat local optimization but may cause memory fragmentation, specifically by making more and more memory "earmarked" for only objects of a certain type and thereby unavailable to the "general fund".

The only really reliable way to ensure that a large but temporary use of memory DOES return all resources to the system when it's done, is to have that use happen in a subprocess, which does the memory-hungry work then terminates. Under such conditions, the operating system WILL do its job, and gladly recycle all the resources the subprocess may have gobbled up. Fortunately, the multiprocessing module makes this kind of operation (which used to be rather a pain) not too bad in modern versions of Python.

In your use case, it seems that the best way for the subprocesses to accumulate some results and yet ensure those results are available to the main process is to use semi-temporary files (by semi-temporary I mean, NOT the kind of files that automatically go away when closed, just ordinary files that you explicitly delete when you're all done with them).
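A sketch of that pattern (the worker body and file names are placeholders, not a drop-in solution):

import multiprocessing

def build_triangles(input_path, output_path):
    # Runs in a separate process; everything it allocates is returned
    # to the OS when that process exits.
    with open(input_path) as src, open(output_path, "w") as dst:
        for line in src:
            dst.write(line)              # placeholder for the real processing

if __name__ == "__main__":
    worker = multiprocessing.Process(target=build_triangles,
                                     args=("input.dat", "triangles.tmp"))
    worker.start()
    worker.join()                        # the semi-temporary file survives for the parent to read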

Alex Martelli
A: 

Gee - this is not really an answer - I am by no means a mathematician.

The only reason I write is that nobody else did ;-) Well in fact they just did while I was writing.

I think millions of objects is probably not so good - maybe better to have just a few and then put the hard cash (the numbers) into lists.

I believe NumPy should be good at number crunching, but I am by no means a mathematician!

that's a good idea, except that the output will also contain millions of these objects, just in a different format
Nathan Fellman
+1  A: 

If you don't care about vertex reuse, you could have two output files: one for vertices and one for triangles. Then append the triangle file to the vertex file when you are done.
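A sketch of that approach (the file names and the placeholder write calls are made up):

import shutil

# While processing: stream triangle lines to one file and vertex lines to
# another, so neither list has to stay in memory.
with open("vertices.tmp", "w") as vf, open("faces.tmp", "w") as ff:
    vf.write("0.0 0.0 0.0\n")            # placeholder vertex line
    ff.write("3 0 1 2\n")                # placeholder triangle line

# When done: concatenate them, vertices first, as OFF requires.
with open("model.off", "w") as out:
    with open("vertices.tmp") as vf:
        shutil.copyfileobj(vf, out)
    with open("faces.tmp") as ff:
        shutil.copyfileobj(ff, out)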

Nosredna
I figure I can keep only the vertices in memory and print the triangles out to a file, and then print out the vertices only at the end. However, the act of writing the triangles to a file is a huge performance drain. Is there any way to speed *that* up?
Nathan Fellman
+1  A: 

Others have posted some ways that you might be able to "coax" the Python interpreter into freeing the memory (or otherwise avoid having memory problems). Chances are you should try their ideas out first. However, I feel it important to give you a direct answer to your question.

There isn't really any way to directly tell Python to free memory. The fact of the matter is that if you want that low a level of control, you're going to have to write an extension in C or C++.

That said, there are some tools to help with this:

Jason Baker
+3  A: 

Lists of numbers are much less memory-efficient than the format used by the standard array module or the third-party Numpy module. You would save memory by putting your vertices in a Numpy 3xN array and your triangles in an N-element array.
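One way to lay that out (a sketch; the sample numbers are arbitrary):

import numpy as np

# N vertices as a 3xN float array; M triangles as rows of vertex indices.
vertices = np.array([[0.0, 1.0, 0.0, 0.0],
                     [0.0, 0.0, 1.0, 0.0],
                     [0.0, 0.0, 0.0, 1.0]])           # shape (3, N)
triangles = np.array([[0, 1, 2],
                      [0, 1, 3]], dtype=np.int32)     # indices into the vertex array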

EOL