views:

1200

answers:

5

I know that a language like Python does garbage collection for you, but for applications, optimizations, or even security, is it important to understand how memory management works? I am very familiar with it myself, but I am going to start teaching a student soon, and I'm curious to know whether that's something we should go over.

Note: this question is relevant to Python, but answers in the context of other languages are welcome.

+7  A: 

Short answer: no.

You don't need to know about the GC implementation for security reasons unless you are working with weird data sizes. Even then the VM will catch any weirdness ... this is why it exists, so you don't need to know.

However, there may be a case for optimization when working with large, iterative, or recursive systems, so that you conserve memory. But this is more about designing your program so that the GC can see things fall out of scope and keep the memory space nice and tidy. You will often find that the GC doesn't behave in a uniform manner, so you could argue that, unless you are maxing out your RAM, this is futile for some garbage collection implementations (Java).

Unless you are teaching GC implementation or writing C libraries for Python, I wouldn't worry too much about it until it starts eating your memory into oblivion.
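
As a rough illustration of "letting things fall out of scope", here is a minimal sketch. The Buffer class and the summarize function are hypothetical names invented for this example, and the weak reference is used only as a probe to observe whether the object is still alive. On CPython the buffer is typically released as soon as the function returns; the explicit gc.collect() call is there only so the sketch behaves similarly on other implementations.

```python
import gc
import weakref


class Buffer:
    """Hypothetical stand-in for a large working object."""

    def __init__(self, size):
        self.data = bytearray(size)


def summarize(size):
    # The buffer lives only inside this function. Returning just the
    # small result (and a weak probe) lets the buffer become garbage.
    buf = Buffer(size)
    probe = weakref.ref(buf)  # a weak reference does not keep buf alive
    return len(buf.data), probe


length, probe = summarize(1024)
gc.collect()  # not needed on CPython; helps on other implementations
buffer_freed = probe() is None
print(length, buffer_freed)
```

The design point is simply to return small results rather than keeping references to big intermediate objects around.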

Aiden Bell
I do agree. The only thing I know about garbage collection in Python is that it uses some kind of reference counting, and I have never needed to expand on that knowledge.
Skurmedel
Where it might matter is if you do C interop, then you probably need to know. Otherwise, no.
Skurmedel
Often, unless you are implementing a GC, it isn't worth finding out. Most GC implementations are fairly lame and *almost* behave on a whim.
Aiden Bell
@Skurmedel - Good point on the C stuff, added.
Aiden Bell
+3  A: 

In a word, "no".

There are some edge cases where understanding the behavior of the Python memory management regime is useful, maybe even important, but these are extremely advanced areas. You can write secure, high-performance Python applications without the slightest idea how its garbage collector works.

More importantly, if you are going to be teaching a student, it's much more important that you focus on a good understanding of the fundamentals than on deep interpreter internals. Also, different Python VMs have different garbage collectors, so learning about this too early could actually be a bad thing: if you learn about CPython garbage collection and then rely on tricks and fiddly details of its implementation, you'll find that your code won't work on PyPy, Jython, or IronPython. Whereas if you just write good Python code that assumes the garbage collector works as advertised but that the specifics of its behavior are implementation-defined, your code will often just work.
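
One common example of relying on an implementation detail is leaving a file to be closed by the garbage collector. The sketch below is only an illustration: write_fragile and write_portable are hypothetical names, and the fragile version is shown for contrast and never called. The with statement closes the file deterministically, so it does not depend on when any particular collector runs.

```python
import os
import tempfile


def write_fragile(path, text):
    # Shown for contrast only. The file is closed whenever the file
    # object happens to be reclaimed: promptly under CPython's reference
    # counting, but possibly much later on other implementations.
    open(path, "w").write(text)


def write_portable(path, text):
    # The with statement closes the file when the block exits,
    # regardless of how the garbage collector behaves.
    with open(path, "w") as handle:
        handle.write(text)


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "note.txt")
    write_portable(path, "hello")
    with open(path) as handle:
        content = handle.read()

print(content)
```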

Glyph
+1 Good thinking on the multi-VM front.
Aiden Bell
A: 

For certain speed optimizations, a basic understanding of how memory management works will definitely help. For instance, the following code creates the same list two different ways, but one is an order of magnitude faster:

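import time

# Python 2 syntax: print is a statement and range() returns a list.

# Build the list in a single call.
print 'single allocation'
start = time.time()
a = range(1000000)
end = time.time()

print end - start

# Build the same list one append at a time.
print 'sequential allocation'
start = time.time()
b = []
for i in range(1000000):
    b.append(i)

end = time.time()

print end - start

# Both approaches produce equal lists.
print b == a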

Granted, this is a bit of a contrived example, but you get the idea.

Dan Lorenc
That won't really be an order of magnitude (only around twice as slow, at least if you avoid doing the same work twice by using xrange() instead of range), and the reasons for slowness aren't really down to memory management, just the fact that you're invoking fast C code in the first, and slower python code in the second. Lists proportionally overallocate their memory, so you won't run into O(n^2) behaviour here - you may be thinking of strings, which don't overallocate so could result in a reallocation and copy for every append (though this has been improved recently too).
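
The overallocation point can be observed with a small sketch, written here in Python 3 and assuming CPython, where sys.getsizeof reflects the list's currently allocated capacity. The count_resizes helper is a hypothetical name for this example. It counts how often the reported size changes while appending, which should be far less often than once per append.

```python
import sys


def count_resizes(n):
    """Count how many times a list's allocated size changes over n appends."""
    items = []
    last_size = sys.getsizeof(items)
    resizes = 0
    for i in range(n):
        items.append(i)
        size = sys.getsizeof(items)
        if size != last_size:
            # The list grew its underlying storage on this append.
            resizes += 1
            last_size = size
    return resizes


resizes = count_resizes(100000)
print(resizes)
```

The exact number depends on the interpreter version, so it is deliberately not shown here; the point is only that it is tiny compared with the number of appends.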
Brian
+2  A: 

As most people have said, the answer in general is "no": you won't need it, at least not at tutorial level. But once a real-world app is written, it will tend to have some memory problems, such as cyclic references and objects that are not garbage collected because they are still being referred to somewhere.

In that case, you may find the gc module handy for debugging what refers to what and what is not being collected.
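
A minimal sketch of that kind of investigation, assuming CPython: two hypothetical Node objects refer to each other, so reference counting alone cannot free them, and gc.collect() reports how many unreachable objects it found. Automatic collection is disabled briefly so that the "before" observation is not disturbed.

```python
import gc
import weakref


class Node:
    """Hypothetical object that can point at a partner."""

    def __init__(self):
        self.partner = None


def make_cycle():
    a, b = Node(), Node()
    a.partner, b.partner = b, a  # a and b now keep each other alive
    return weakref.ref(a)        # weak probe; does not keep a alive


gc.collect()   # start from a clean slate
gc.disable()   # keep the automatic collector out of the way for the demo
probe = make_cycle()
alive_before = probe() is not None  # the cycle keeps the objects alive
found = gc.collect()                # number of unreachable objects found
alive_after = probe() is not None
gc.enable()

print(alive_before, found >= 2, alive_after)
```

For tracking down objects that are still referenced, gc.get_referrers() and gc.get_objects() are the usual starting points.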

Anurag Uniyal
Python's garbage collection also includes (I believe) a mark-and-sweep phase to remove cyclic reference structures.
Autoplectic
Yes, I think it can remove cyclic references, but a better way is still to use weakref. Even so, references can turn up in unsuspected places, e.g. a nested function using a variable from the enclosing scope.
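
A rough sketch of the weakref approach, assuming CPython: the hypothetical Parent and Child classes form a tree in which the child points back at its parent only weakly, so no reference cycle is created and the parent is freed by reference counting alone. The cyclic collector is disabled here just to show that it is not involved.

```python
import gc
import weakref


class Child:
    parent = None  # will hold a weak reference to the parent


class Parent:
    def __init__(self):
        self.children = []

    def add(self, child):
        # Strong reference downward, weak reference upward: no cycle.
        child.parent = weakref.ref(self)
        self.children.append(child)


gc.disable()  # show that the cyclic collector is not needed
parent = Parent()
child = Child()
parent.add(child)

parent_reachable = child.parent() is parent
del parent                               # last strong reference goes away
parent_freed = child.parent() is None    # the weak reference is now dead
gc.enable()

print(parent_reachable, parent_freed)
```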
Anurag Uniyal
+1  A: 

I think you should warn about the dangers of the __del__ special method. Basically, you should avoid it if you can. If __del__ appears in circular references, CPython's GC won't free the resources.
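
One alternative that is sometimes used instead of __del__ is weakref.finalize from the standard library (available since Python 3.4). The sketch below is illustrative only: the Connection class and the "closed" marker are hypothetical, and the callback deliberately avoids referring to the object itself.

```python
import gc
import weakref


class Connection:
    """Hypothetical resource holder; the name is illustrative only."""

    def __init__(self, log):
        # The callback must not reference self, otherwise the object
        # could never become unreachable.
        self._finalizer = weakref.finalize(self, log.append, "closed")

    def close(self):
        # Runs the cleanup callback now; a finalizer runs at most once.
        self._finalizer()


events = []

conn = Connection(events)
conn.close()            # explicit cleanup
first = list(events)
del conn
gc.collect()            # the callback is not run a second time

leaked = Connection(events)
del leaked              # cleanup still happens when the object dies
gc.collect()

print(first, events)
```

An explicit close() method or a context manager is usually clearer still; the finalizer acts as a safety net.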

Heikki Toivonen