views:

961

answers:

19

Do you prefer malloc and free / new and delete, etc., or do you prefer your language to have garbage collection? Why or why not? Do you have any proof, beyond your own reasoning, that your preferred method is better?

+17  A: 

Personally, I feel the benefits of GC are obvious and completely outweigh any benefit of doing it manually. Part of the reasoning is that a handful of incredibly intelligent people can get together and write a really smart GC, and it will probably perform (a lot) better than an average programmer trying to do memory management themselves.

Mark-and-sweep is obviously terrible, but some of the modern GCs are pretty fancy.

Claudiu
Seun Osewa
The problem is that in a naïve implementation you have to stop the world and then do a collection and compaction. I think that is what the OP is referring to; modern GCs may also be compacting, but they don't require the jerkiness of a simple mark-and-sweep collector.
Ukko
+7  A: 

Like so many things, it's a tradeoff between CPU efficiency and human programmer efficiency. GC makes things measurably slower in most cases (see reference below), but in many (most?) domains these days we let fast hardware absorb that cost and develop featureful apps quickly using GC'd languages.

http://lambda-the-ultimate.org/node/2552

Brian
I agree with that. Hardware is so cheap these days that we can afford to use GC, ORMs, and other stuff like that. Performance should not impact your design choices until it becomes a problem.
Frederic Morin
I think that's one habit we need to snap out of: assuming hardware will get us out of all binds. That is no excuse for inefficiencies and badly written code.
HeretoLearn
+1, and great article
Fire Crow
A: 

GC is, IMO, a great way to concentrate on what we actually need to implement, rather than spending time managing memory.

(Now that is not entirely true, since memory leaks can still happen even with GC, due to persistent references... and we spend time tracking them down ;-) )

Still, GC is better, and:

[Humor ON]

I have the final proof... in podcast 25.

[46:28]

Spolsky: [...] the ability to call functions recursively is something the average programmer uses about once a year. It's good to have, but it's not going to necessarily make a huge difference - that one time a year it does - but it's not as big a deal as garbage collection.

[...]

Spolsky: Once you assume garbage collection (or memory management as they call it) then you can do things that in COM you used to have to do manually, and they were a nightmare. COM had the right idea, it was just that it was impossible to ever keep your reference counts correct, and...

[Humor OFF]

VonC
+1  A: 

I'm not aware of a garbage collector which performs a lot better than manual memory management. I've heard that Lua does very well, at least for small programs, and obviously Java is well tested, and its performance relative to lower-level languages is debated ad infinitum. It's usually good enough for the purposes to which it is put, but relatively few video codecs are written in pure Java.

Furthermore, the issue is tied up with choice of language. If there are good reasons for writing your app in C++, then you will probably find that the same good reasons prevent you from garbage collecting everything. Of course, calling RAII and smart pointers "manual memory management" is overstating the case a little, but they aren't so automatic as GC.

In cases where there is very little dynamic memory to be managed (i.e. because the program runs for a long time with only a small number of allocations and frees), manual memory management makes it easier to prove real-time guarantees. Again, some garbage collectors may do that, but a lot don't, and hence would be useless for certain applications. In particular, I suspect that the real-time cost of an allocation shoots up if there's a chance that the GC may have to do work in order to make "available" memory actually available for reuse.

Finally, someone has to write the code on which a fiendishly powerful garbage collector depends: some part of any OS must have manual memory management, so it will never go away. Anyone who has worked on an OS implementation (me included) knows that there is no such thing as the mystical super-intelligent OS pixies who will solve all your problems for you ;-)

Steve Jessop
+6  A: 

Although in theory managing your own memory does offer advantages, since you can tune the algorithm so it's optimized for your code, in practice, apart from a few rare cases, this should not be a consideration. Handling memory is notoriously error prone, and even for the best programmers it will add significant overhead to development time to make sure the code is correct, efficient (otherwise why roll your own?) and error free.

The general trend in development is that we create more powerful programs faster by utilizing support APIs to take away the routine and the hard parts. Twenty years ago, writing an application for a windowing system required you to handle the main event loop yourself; nowadays that's abstracted away for you, and you'd only touch it under exceptional circumstances. Memory management is exactly the same.

Cruachan
+1  A: 

You know, for most everything I write these days I simply don't want to be bothered with memory management. I'm trying to solve a real business problem, not track down where I cast some memory adrift when I reassigned a pointer. I was skeptical when I first started doing .NET development, coming from a C/C++ world where I managed all my own memory, but I really think I'm able to spend more of my time writing quality solutions related to the problem at hand.

itsmatt
A: 

I'm used to manual memory management, so I don't have any problems with it. But I can imagine the time and frustration saved by using garbage collection, so I think it is an excellent improvement.

Gamecat
+1  A: 

I think GC is the way to go because, whatever the case, it can always be refined and improved by new algorithm discoveries or better overall implementations down the road. Manual memory management is always limited to the degree of quality that the individual programmer could be bothered to muster at the time. That could potentially be quite high, but it doesn't alter the fact that the general overhead in development time and effort to achieve that quality typically outweighs the circumstantial gains you usually get from doing so.

Daddy Warbox
A: 

Having garbage collection is generally a good thing because it lets you free your cognitive resources for implementing your intention, rather than concentrating on some technical detail that can be delegated to the machine. I still think it's a good thing only if you know how stuff works under the hood: not knowing how garbage collection and memory allocation work gets you into situations with unexplainable bugs.

A good example of not knowing enough about the technology is Princeton's recent failed DARPA contest entry.

JtR
A: 

I prefer reference counting systems, such as you get with Python or C++ smart pointers. It seems like a good compromise. You get most of the advantages of GC (you still have to watch out for circular references) and you get to keep the same level of performance without a large increase in your memory footprint.

Ferruccio
It's not the same level of performance -- keeping refcounts has a definite cost, as you have to keep touching memory that you otherwise wouldn't. In my opinion, the only advantage refcounting has over full GC is deterministic behavior, like manual memory management.
ephemient
Reference counting is incredibly slow and, even then, you can obtain determinism with a GC if it is necessary.
Jon Harrop
+5  A: 

For someone learning how to program, a language with manual MM is better. It teaches you to clean up after yourself (my mom always wanted me to learn that...).

Also, for runtime-critical apps, the manual way may be better. That way the GC cannot kick in at a crucial moment and steal CPU time.

Apart from those two, I cannot think of any advantage of manual MM. GC gives you higher application stability and faster development. That's two birds with one stone!

Treb
back to basics! How does malloc work? http://www.joelonsoftware.com/articles/fog0000000319.html
Will
+5  A: 

I'm used to C++, so I use RAII, smart pointers and Standard containers, and I honestly can't remember the last time I coded a memory leak. The nice thing about the C++ model is that it generalizes well to non-memory resources. I'm constantly surprised by how many complex coding problems reduce to performing a sequence of steps - safely cleaning up work done so far if any should fail - and undoing them in reverse order.

I can live with GC if the language gives me some way to automatically clean up non-memory resources without a lot of syntax, e.g. passing a lambda to a function which handles the open/close. Lisp's with-open-file and Ruby's File.open(name) { proc } come to mind. Worst for me is Java, where my code ends up cluttered with try {} finally {} blocks.

fizzer
Out of curiosity, does the .NET IDisposable pattern fit your definition of "some way to automatically clean up non-memory resources without a lot of syntax"? In case you don't know, it's (for this purpose) just "using (var file = new FileStream()) { /* do something with file */ }".
OregonGhost
I don't know .NET, but after a quick Google, yes, it seems like using plus IDisposable fits the bill.
fizzer
+2  A: 

I particularly appreciate garbage collection when I need to store the same piece of data in multiple data structures.

With manual memory management, I always need to decide what data structure and/or piece of code is responsible for freeing a piece of data. Typically, when a piece of data is stored into a data structure, that data structure becomes responsible for freeing that piece of data when it is deleted from the data structure or when the data structure is freed. This means that if I need to store a piece of data in 5 data structures, I typically need to make 5 copies of the data!

With garbage collection, I can go ahead and store the same copy of the data in all of the data structures. When none of them need it any more, it will be freed.

Glomek
A: 

I program primarily in D, which is fully garbage collected (conservatively in current implementations; hopefully it will get precise GC when it's more mature) but also allows manual deletion.

The strategy I generally adopt is to let the GC handle small objects and anything with a non-trivial lifetime, but to manage large objects with trivial lifetimes manually. I do lots of number-crunching work, so I create tons of huge temporary arrays that are bound to function scopes. I always delete them manually, because otherwise I'd lose too much performance to GC and too much space to false pointers. For small things like run-of-the-mill strings and classes, on the other hand, I just let the GC handle them.

That saves me tons of effort, and according to some studies done on the Hans Boehm GC, programs that allocate lots of small objects are faster with GC than with manual memory management, though programs that allocate a few large objects are slower. So I feel this strategy gives me the best of both worlds. Of course, my domain (bioinformatics) has no real-time constraints.

dsimcha
+1  A: 

It depends on a) the programming language you are using and b) the project you are working on.

First, the programming language you are using determines whether precise garbage collection is possible or whether you have to use a conservative garbage collector (precise vs. conservative, and internal pointers). Java is an example of a language for which a precise GC is possible, because the language has no raw pointers. D and C++ are examples of languages for which you would need a conservative GC, because it is impossible to determine whether a pointer/register is holding an object reference or not.

As the name suggests, conservative GCs are not precise in detecting which memory is currently in use and which isn't. Over time, and depending on your programming style and the programming constructs used, they tend to retain a lot of extra memory. So if you are developing a program which requires precise memory management, a conservative GC should definitely not be your first choice; a precise GC or manual memory management would be much more appropriate.

On the other hand, if you are developing a small tool/application which is relatively modest memory-wise and/or runs for only a relatively short amount of time, any GC helps a lot to free you from the hassle of manual memory management.

Sascha
+2  A: 

I prefer, in order of preference:

  1. Use the stack. There are ways to avoid dynamic allocation for most trivial bookkeeping tasks. Things are just way simpler if you can do this. Language choice helps here: in Ada you hardly ever need dynamic allocation, while in C it's tough to do much of anything without it.

  2. "Manual" dynamic allocation, but only during program initialization. Most systems will automaticly recover used heap space when the program terminates. So as long as you are not continually allocating new memory during runtime, it doesn't really matter.

  3. "Manual" dynamic allocation and deallocation, through constructors and destructors. This way you can at least debug any problems in only one place.

  4. "Manual" dynamic allocation and deallocation. YMMV on this one. However, I tend to do a lot of hard real-time work, so I need to know when everything is running. I can't have important things getting held up at semi-random times while a garbage collector runs.

  5. (last) Garbage Collection. It's OK for non-realtime work.
T.E.D.
+2  A: 

I think the generic question involves a false contrast, and many of the answers here provide more balance already. I wanted to come at this from a different angle to round things out a bit.

The first question is whether or not a given programming system even allows manual storage management and has efficient storage-structure access (structs and arrays and pointers that map to storage). And does it also allow automatically-managed storage, where nothing is deleted so long as it can still be referenced? Or not. If your car has a stick shift and a manual clutch, you can't drive it as if it were an automatic (and vice versa).

There may be compelling reasons for using tools such as .NET, Java, JavaScript, Scheme, Common Lisp, etc., that don't really provide a model for manually- (that is, your program-) controlled storage allocation and release. So the opportunity doesn't even arise and there are tons of programming efforts that get along just fine. Programmatic management of storage usually doesn't arise, apart from some simple practices that can help the garbage collector and, in particular, reduce the risk of exhausting available storage.

That leaves the usual suspects (notably C/C++) that do provide for programmatic control and, more or less, require it by default (ignoring the C++/CLI dialect). In that case, the question is really: what conditions make this "manual" control valuable?

For the most part, these platforms excel when most data can have its lifecycle tied to the lifecycle of the function that declares it. This makes for very efficient memory management, and a garbage collector isn't needed. If the exceptions are relatively few, with only occasional spasms of allocation and freeing of heap memory, avoiding garbage collection may be very appealing in combination with running on the native hardware at native speeds. But if that's more performance than you absolutely require, going with a more economical development and maintenance model is certainly preferable.

The most interesting case for me is when there is an allocation and freeing regime that is so intensely used that a custom memory-allocation approach is critical for performance reasons. Programs might need their own malloc-and-free equivalents (or even substitute the heap manager) that operate by grabbing large heap chunks and sub-allocating them privately because the efficiencies are so important. This can happen in compiler development and in specialized list-processing applications. An example that comes to mind for me has to do with maintenance of chess positions in custom data blocks in conjunction with a heuristic look-ahead search-for-move activity where rapid detection of duplicates is important. Tree-oriented, graph-oriented, queue-oriented, and stack-based processes may well gain from custom allocation methodologies for their specific data structures.

Finally, if one is working on tightly-constrained embedded systems, the use of custom storage management with at most reference counting may be the only supportable memory-management approach.

And, as we know, advances in hardware performance, software VMs, and other techniques will continue to shrink the roll-your-own cases into a diminishing but possibly important niche. It is important to be judicious in deciding when one can't wait for that, and when the investment in custom work provides a benefit well beyond the added cost and complexity of the effort.

None of this matters if your tools give you no choice in how dynamic storage management will work and changing tools is of questionable benefit.

orcmid
A: 

To me, any algorithm or part of a program that is highly customized for performance/efficiency/capability should be customized as much as possible to the problem it is trying to solve.

With the increase in demand for real-time web applications, high-load web services such as social websites, and mobile devices, performance issues have come back to the forefront of programming.

For this reason, I'm a big believer in custom C modules for the heavy-lifting parts of an app, such as shared objects for Python, or my most recent endeavor of writing C modules for server-side embedded JavaScript.

In these cases the basic scripting of the program is garbage collected, but the custom parts (parsing a file, running analysis across a dataset, lookups with custom criteria) all take place in a custom C module which is manually memory managed.

I think this is the ideal hybrid: GC has its place for the parts of the application that will be thrown together, and manual MM has its place where parts will be thoroughly engineered.

Fire Crow
A: 

I don't really see the point of garbage collection at all. Since you should not/cannot depend on finalizers being run in a garbage-collected environment, you have to manually close/deallocate all other resources that are not memory. If you do that, you might as well explicitly deallocate the memory too. And if you don't, you have merely replaced your memory leak with another resource leak; some resources, like file handles, may be more limited than memory, depending on the environment.

Sven
True, but you're forgetting all the cases where you have things that use memory without any other resources associated with them, like variable-length arrays, linked lists, etc.
Claudiu