I keep hearing people complaining that C++ doesn't have garbage collection. I also hear that the C++ Standards Committee is looking at adding it to the language. I'm afraid I just don't see the point to it... using RAII with smart pointers eliminates the need for it, right?

My only experience with garbage collection was on a couple of cheap eighties home computers, where it meant that the system would freeze up for a few seconds every so often. I'm sure it has improved since then, but as you can guess, that didn't leave me with a high opinion of it.

What advantages could garbage collection offer an experienced C++ developer?

+4  A: 

The committee isn't adding garbage collection; it is adding a couple of features that allow garbage collection to be implemented more safely. Only time will tell whether they have any effect whatsoever on future compilers. The specific implementations could vary widely, but will most likely involve reachability-based collection, which could involve a slight hang, depending on how it's done.

One thing is, though, no standards-conformant garbage collector will be able to call destructors - only to silently reuse lost memory.

coppro
You're right, they aren't "adding garbage collection." I misread the article.
Head Geek
A: 

I don't know much about the future of C++, but the problems garbage collection is meant to avoid include:

(Mostly from Wikipedia)

  • Pointers to garbage data
  • Data that doesn't get freed after pointer is destroyed (memory leak)
  • Freeing memory that has already been freed

http://en.wikipedia.org/wiki/Automatic_garbage_collection

srand
+8  A: 

The short answer is that garbage collection is very similar in principle to RAII with smart pointers. If every piece of memory you ever allocate lies within an object, and that object is only referred to by smart pointers, you have something close to garbage collection (potentially better). The advantage comes from not having to be so judicious about scoping and smart-pointering every object, and letting the runtime do the work for you.

This question seems analogous to "what does C++ have to offer the experienced assembly developer? instructions and subroutines eliminate the need for it, right?"

Matt J
<chuckle> Point taken.
Head Geek
Glad it was taken in the spirit in which it was intended. I tend towards more manual methods myself :-)
Matt J
If you are using reference-counted smart pointers, beware of reference loops. One of the advantages of garbage collection is that it isn't confused by reference loops.
CesarB
If you make proper use of boost::weak_ptr, reference loops aren't a problem.
Head Geek
except that smart pointers free resources the moment they go out of scope, a GC can have them hang around for ages. It can make a big difference.
gbjbaanb
@Head Geek - if you make proper use of assembler language, etc. (see last paragraph of Matt J's answer).
Daniel Earwicker
+5  A: 

With the advent of good memory checkers like valgrind, I don't see much use to garbage collection as a safety net "in case" we forgot to deallocate something - especially since it doesn't help much in managing the more generic case of resources other than memory (although these are much less common). Besides, explicitly allocating and deallocating memory (even with smart pointers) is fairly rare in the code I've seen, since containers are a much simpler and better way usually.

But garbage collection can potentially offer performance benefits, especially if a lot of short-lived objects are being heap allocated. GC also potentially offers better locality of reference for newly created objects (comparable to objects on the stack).

Greg Rogers
Greg, could you expand a little on your last paragraph? I thought this was the job of any memory allocator - even malloc - not just garbage collectors (which essentially figure out when to call free() for you). But I am no pro on this, and would love a more detailed explanation.
SquareCog
One big potential performance advantage of gc is that you can allocate/free in one pass instead of many. It really depends on the situation: in some environments, manual memory allocation or RAII with custom allocators may be easier to handle than gc.
David Cournapeau
+1  A: 

I, too, have doubts that the C++ committee is adding full-fledged garbage collection to the standard.

But I would say that the main reason for adding/having garbage collection in a modern language is that there are too few good reasons against garbage collection. Since the eighties there have been several huge advances in the field of memory management and garbage collection, and I believe there are even garbage collection strategies that could give you soft-real-time-like guarantees (like, "GC won't take more than .... in the worst case").

ADEpt
The real time argument is moot anyway, because malloc/free do not have worst case guarantee either.
David Cournapeau
+33  A: 

I keep hearing people complaining that C++ doesn't have garbage collection.

I am so sorry for them. Seriously.

C++ has RAII, and I always complain to find no RAII (or a castrated RAII) in Garbage Collected languages.

What advantages could garbage collection offer an experienced C++ developer?

Another tool.

Matt J wrote it quite right in his post (http://stackoverflow.com/questions/228620/garbage-collection-in-c-why#228640): We don't need C++ features as most of them could be coded in C, and we don't need C features as most of them could be coded in Assembly, etc. C++ must evolve.

As a developer: I don't care about GC. I tried both RAII and GC, and I find RAII vastly superior. As said by Greg Rogers in his post (http://stackoverflow.com/questions/228620/garbage-collection-in-c-why#228670), memory leaks are not so terrible (at least in C++, where they are rare if C++ is really used) as to justify GC instead of RAII. GC has non-deterministic deallocation/finalization and is just a way to write code that just doesn't care about specific memory choices.

This last sentence is important: It is important to write code that "just doesn't care". In the same way that in C++ RAII we don't care about resource freeing because RAII does it for us, or about object initialization because the constructor does it for us, it is sometimes important to just code without caring about who is the owner of what memory, and what kind of pointer (shared, weak, etc.) we need for this or that piece of code. There seems to be a need for GC in C++ (even if I personally fail to see it).

An example of good GC use in C++

Sometimes, in an app, you have "floating data". Imagine a tree-like structure of data, but no one is really "owner" of the data (and no one really cares about when exactly it will be destroyed). Multiple objects can use it, and then, discard it. You want it to be freed when no one is using it anymore.

The C++ approach is using a smart pointer. The boost::shared_ptr comes to mind. So each piece of data is owned by its own shared pointer. Cool. The problem arises when each piece of data can refer to another piece of data. You cannot simply use shared pointers everywhere, because they rely on a reference counter, which doesn't handle circular references (A points to B, and B points to A). So you must now think a lot about where to use weak pointers (boost::weak_ptr), and when to use shared pointers.
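A minimal sketch of the circular-reference problem described above, assuming C++11's std::shared_ptr and std::weak_ptr as stand-ins for the Boost classes. The Node type, the live_nodes counter and both function names are made up for illustration only:

```cpp
#include <memory>

// Counts live Node objects so that a leak becomes observable.
static int live_nodes = 0;

struct Node {
    std::shared_ptr<Node> strong_peer;  // owning link: keeps the peer alive
    std::weak_ptr<Node>   weak_peer;    // non-owning link: does not
    Node()  { ++live_nodes; }
    ~Node() { --live_nodes; }
};

// A owns B and B owns A: neither count reaches zero, so both nodes leak.
int nodes_leaked_by_strong_cycle() {
    const int before = live_nodes;
    {
        std::shared_ptr<Node> a = std::make_shared<Node>();
        std::shared_ptr<Node> b = std::make_shared<Node>();
        a->strong_peer = b;
        b->strong_peer = a;
    }   // a and b go out of scope, but each node still holds the other
    return live_nodes - before;
}

// Same shape, but the back link is weak, so releasing A also releases B.
int nodes_leaked_by_weak_back_link() {
    const int before = live_nodes;
    {
        std::shared_ptr<Node> a = std::make_shared<Node>();
        std::shared_ptr<Node> b = std::make_shared<Node>();
        a->strong_peer = b;
        b->weak_peer = a;
    }
    return live_nodes - before;
}
```

In this sketch the first function reports two nodes that are never destroyed, while the second reports none. Deciding which link is the weak one is exactly the ownership question a tracing collector would not ask.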

With a GC, you just use the tree structured data.

The downside being that you must not care when the "floating data" will really be destroyed. Only that it will be destroyed.

Conclusion

So in the end, if done properly, and compatible with the current idioms of C++, GC would be Yet Another Good Tool for C++.

C++ is a multiparadigm language: Adding a GC will perhaps make some C++ fanboys cry because of treason, but in the end, it could be a good idea, and I guess the C++ Standards Committee won't let this kind of major feature break the language, so we can trust them to do the necessary work to enable a correct C++ GC that won't interfere with C++: As always in C++, if you don't need a feature, don't use it and it will cost you nothing.

paercebal
The one thing I hope we don't get is (Java-like) Phoenix objects, where the finalizer can make the object live again. But the second time it is garbage collected, the finalizer is not run.
Martin York
As I understood, the C++09 would _facilitate_ garbage collection.
xtofl
http://www.artima.com/cppsource/cpp0x.html, an article by B. Stroustrup: "C++0x will most likely support optional garbage collection"
paercebal
Now, from Wikipedia: http://en.wikipedia.org/wiki/C%2B%2B0x#Transparent_garbage_collection : "Full garbage collection support has been remanded to a later version of the standard or a Technical Report." So I guess you're right. :-)
paercebal
excellent comment. two upvotes for that :)
Johannes Schaub - litb
C++'s RAII is limited and could be better.
Tim Matthews
xtofl: As I understood it, the newer standards of C++ would require C++ to facilitate GC /for_memory_alone/ - effectively nothing changes in C++, except that the memory for the object is not actually released.
Arafangion
@Ctrl Alt D-1337: Could you give us some examples of "C++'s RAII is limited and could be better"? Is there a language with a better RAII?
paercebal
@paercebal - try/finally is useful sometimes (but can now be simulated with lambdas), but aside from that C++'s RAII would be hard to improve on.
Daniel Earwicker
More to the point, you need both RAII and GC. One doesn't preclude the other. Most languages with GC baked into them from the start also have RAII-like idioms as well. And you may think you don't *need* GC, but who would honestly reject greater convenience and higher productivity, if available? Pervasive GC makes you design and code in a different way, and your productivity rises. Another often ignored advantage is that it often performs better as well!
Daniel Earwicker
@Earwicker: The major languages with GC I know (i.e. non-script non-niche languages) are Java and C#. Java has no RAII whatsoever, and C#'s RAII is far from satisfying when coming from C++. Still, we have a common viewpoint: If we can afford it, working in a language where memory allocation is handled in the background saves a lot of time.
paercebal
@Earwicker: You're right about the try/finally: Sometimes, we want code to execute no matter how we exit the scope, and writing a local struct just to have its destructor do the cleaning is painful. Another solution is to use Boost.ScopedExit, at http://www.boost.org/doc/libs/1_39_0/libs/scope_exit/doc/html/index.html ...
paercebal
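The try/finally emulation with lambdas mentioned in the comments above could look roughly like this. It is only a sketch, assuming a C++11 compiler; ScopeExit, make_scope_exit and run are hypothetical names and are not the Boost.ScopeExit interface:

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Runs the stored callable when the guard is destroyed, whether the
// scope is left normally or through an exception.
template <typename F>
class ScopeExit {
public:
    explicit ScopeExit(F f) : f_(std::move(f)), active_(true) {}
    ScopeExit(ScopeExit&& other)
        : f_(std::move(other.f_)), active_(other.active_) {
        other.active_ = false;  // the moved-from guard must not fire
    }
    ~ScopeExit() { if (active_) f_(); }

    ScopeExit(const ScopeExit&) = delete;
    ScopeExit& operator=(const ScopeExit&) = delete;

private:
    F f_;
    bool active_;
};

template <typename F>
ScopeExit<F> make_scope_exit(F f) {
    return ScopeExit<F>(std::move(f));
}

// Returns a log of what happened, with or without an exception.
std::string run(bool fail) {
    std::string log;
    try {
        auto cleanup = make_scope_exit([&log] { log += "cleanup;"; });
        log += "work;";
        if (fail) throw std::runtime_error("failure");
        log += "done;";
    } catch (const std::runtime_error&) {
        log += "caught;";
    }
    return log;
}
```

In both paths the cleanup lambda runs when the guard leaves scope, before the catch block in the failing case, which is the "finally" behaviour without a hand-written local struct.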
+2  A: 

Garbage collection is really the basis for automatic resource management. And having GC changes the way you tackle problems in a way that is hard to quantify. For example when you are doing manual resource management you need to:

  • Consider when an item can be freed (are all modules/classes finished with it?)
  • Consider whose responsibility it is to free a resource when it is ready to be freed (which class/module should free this item?)

In the trivial case there is no complexity. E.g. you open a file at the start of a method and close it at the end. Or the caller must free this returned block of memory.

Things start to get complicated quickly when you have multiple modules that interact with a resource and it is not as clear who needs to clean up. The end result is that the whole approach to tackling a problem includes certain programming and design patterns which are a compromise.

In languages that have garbage collection you can use a disposable pattern where you can free resources you know you've finished with but if you fail to free them the GC is there to save the day.


Smart pointers are actually a perfect example of the compromises I mentioned. Smart pointers can't save you from leaking cyclic data structures unless you have a backup mechanism. To avoid this problem you often compromise and avoid using a cyclic structure even though it may otherwise be the best fit.

Luke Quinane
The problem is that the disposable pattern won't save you in all cases. In C#, the disposable pattern is a pain to implement correctly (as the finalizer can be called multiple times by different threads, etc.), and in Java, the "disposable" pattern is a joke.
paercebal
And again, proper use of smart pointers eliminates both of the problems you mention.
Head Geek
Proper use of the Boost::weak_ptr can eliminate problems with cyclic data structures too. It requires a full understanding of how your code works, but you should really have that kind of understanding regardless.
Head Geek
@Head Geek: Sometimes, you just don't want to care about some part of your code, in the same way you just don't want to care how std::string allocates/frees its internal string. You want the data to be there as long as you use it, no matter how, and cleaned away when it is not used anymore.
paercebal
Finally, it should be the responsibility of the class designer to decide how it is to be freed, rather than having the user check the implementation/documentation of the class.
Arafangion
+6  A: 

Garbage collection allows you to postpone the decision about who owns an object. With RAII, indeed, objects are reclaimed when going out of scope. This is sometimes referred to as "immediate GC".
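To make the "reclaimed when going out of scope" point concrete, here is a small sketch that records the order in which automatic objects are created and destroyed. Tracer and trace_scopes are invented names used only for this illustration:

```cpp
#include <string>

// Appends its id to a shared log on construction and "~id" on destruction.
class Tracer {
public:
    Tracer(std::string& log, char id) : log_(log), id_(id) { log_ += id_; }
    ~Tracer() { log_ += '~'; log_ += id_; }

private:
    Tracer(const Tracer&);             // non-copyable, not implemented
    Tracer& operator=(const Tracer&);

    std::string& log_;
    char id_;
};

std::string trace_scopes() {
    std::string log;
    {
        Tracer a(log, 'a');
        {
            Tracer b(log, 'b');
        }               // b is destroyed exactly here
        log += '|';
    }                   // a is destroyed exactly here
    return log;
}
```

The destruction points are fixed by the scopes, not by a collector's schedule, which is the determinism the RAII side of this discussion relies on.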

The tricky thing about GC is deciding upon when an object is no longer needed.

xtofl
Smart pointers completely eliminate the need to decide who owns an object.
Head Geek
@Head Geek : Not exactly. If you have 2 objects, A and B, pointed through smart pointers, you're right. Now, if A points to B, too, and if B points to A, too, then you have a problem, and must decide who owns the object through the use of weak_ptr and/or shared_ptr.
paercebal
+4  A: 

I don't understand how one can argue that RAII replaces GC, or is vastly superior. There are many cases handled by a gc that RAII simply cannot deal with at all. They are different beasts.

First, RAII is not bullet proof: it works against some common failures which are pervasive in C++, but there are many cases where RAII does not help at all; it is fragile to asynchronous events (like signals under UNIX). Fundamentally, RAII relies on scoping: when a variable is out of scope, it is automatically freed (assuming the destructor is correctly implemented of course).

Here is a simple example where neither auto_ptr nor RAII can help you:

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#include <memory>

using namespace std;

volatile sig_atomic_t got_sigint = 0;

class A {
        public:
                A() { printf("ctor\n"); };
                ~A() { printf("dtor\n"); };
};

void catch_sigint (int sig)
{
        got_sigint = 1;
}

/* Emulate expensive computation */
void do_something()
{
        sleep(3);
}

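void handle_sigint()
{
        printf("Caught SIGINT\n");
        /* exit() does not unwind the stack, so the destructors of the
           automatic objects in main() are never run */
        exit(EXIT_FAILURE);
}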

int main (void)
{
        A a;
        auto_ptr<A> aa(new A);

        signal(SIGINT, catch_sigint);

        while (1) {
                if (got_sigint == 0) {
                        do_something();
                } else {
                        handle_sigint();
                        return -1;
                }
        }
}

The destructor of A will never be called. Of course, it is an artificial and somewhat contrived example, but a similar situation can actually happen; for example when your code is called by another code which handles SIGINT and which you have no control over at all (concrete example: mex extensions in matlab). It is the same reason why finally in python does not guarantee execution of something. Gc can help you in this case.

Other idioms do not play well with this: in any non-trivial program, you will need stateful objects (I am using the word object in a very broad sense here, it can be any construction allowed by the language); if you need to control the state outside one function, you can't easily do that with RAII (which is why RAII is not that helpful for asynchronous programming). OTOH, a gc has a view of the whole memory of your process, that is, it knows about all the objects it allocated, and can clean up asynchronously.

It can also be much faster to use gc, for the same reasons: if you need to allocate/deallocate many objects (in particular small objects), gc will vastly outperform RAII, unless you write a custom allocator, since the gc can allocate/clean many objects in one pass. Some well known C++ projects use gc, even where performance matters (see for example Tim Sweeney about the use of gc in Unreal Tournament: http://lambda-the-ultimate.org/node/1277). GC basically increases throughput at the cost of latency.
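As a rough illustration of "allocate/clean many objects in one pass", here is a sketch of the custom-allocator alternative mentioned above: a bump-pointer arena. It is not a garbage collector, it assumes C++11, and it is only suitable for trivially destructible objects because no destructors are run. Arena, Point and peak_bytes_for are hypothetical names:

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Hands out memory by advancing an offset; frees everything at once.
class Arena {
public:
    explicit Arena(std::size_t capacity) : buffer_(capacity), used_(0) {}

    // Rounds the current offset up to `align`, then bumps it by `size`.
    void* allocate(std::size_t size, std::size_t align) {
        const std::size_t start = (used_ + align - 1) / align * align;
        if (start + size > buffer_.size()) throw std::bad_alloc();
        used_ = start + size;
        return buffer_.data() + start;
    }

    // Releases every allocation in a single step; no destructors are called.
    void reset() { used_ = 0; }

    std::size_t used() const { return used_; }

private:
    Arena(const Arena&);             // non-copyable, not implemented
    Arena& operator=(const Arena&);

    std::vector<unsigned char> buffer_;
    std::size_t used_;
};

struct Point { int x; int y; };

// Allocates `count` Points, then frees them all in one pass.
// Returns the number of bytes that were in use just before the reset.
std::size_t peak_bytes_for(int count) {
    Arena arena(1024);
    for (int i = 0; i < count; ++i) {
        void* raw = arena.allocate(sizeof(Point), alignof(Point));
        Point* p = new (raw) Point();
        p->x = i;
        p->y = -i;
    }
    const std::size_t peak = arena.used();
    arena.reset();
    return peak;
}
```

Each allocation is a few arithmetic operations and the release is a single assignment, which is where the throughput advantage described above comes from. The price is that nothing can be freed individually.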

Of course, there are cases where RAII is better than gc; in particular, the gc concept is mostly concerned with memory, and that's not the only resource. Things like files, etc. can be handled well with RAII. Languages without manual memory handling like Python or Ruby do have something like RAII for those cases, BTW (the with statement in Python). RAII is very useful when you need to control precisely when the resource is freed, and that's quite often the case for files or locks, for example.

David Cournapeau
"gc has a view of your entire process" - 99.99 of which is NOT a pointer to resource X. That's why RAII is good; it statically limits the number of places where relevant pointers could hide.And RAII also lets you control memory mgmt directly - just assign NULL to a smart pointer.
MSalters
You misunderstand what I mean: the fact that the gc can view the whole process memory and is not scope limited means it can free memory asynchronously, and free several objects "at once" (in one pass). The fact that RAII statically limits the scope of the resources is as much a problem as a feature.
David Cournapeau
It's closer to the truth to say that there are many cases RAII can handle that garbage collection cannot. GC concentrates on memory; RAII handles any kind of resource. And as far as I can tell, smart pointers eliminate your "fragility" argument.
Head Geek
If you wrap your resources in objects that can be collected then GC can handle resource management too. If you back this up with a disposable pattern then you can free resources you know you are finished with.
Luke Quinane
@Head geek: how do smart pointers handle signals, for example? For memory, gc is much better than RAII in most cases. Almost every high level language uses gc, and I think that's telling us something.
David Cournapeau
@Quinane: depending on the resources, you want deterministic freeing. Typically, for files (or locks; although I think RAII does not work that well for threads either), you want to control exactly when you free the resource.
David Cournapeau
I added an example using a signal where neither a smart pointer nor RAII frees the resource.
David Cournapeau
@cournape: I'm sorry, but that example seems pretty bogus. Calling exit() in a signal handler wouldn't allow garbage collection to clean anything up either.
Head Geek
And RAII works quite well for thread locks. I use it for that reason regularly; in fact, that's the situation that introduced me to RAII as a concept.
Head Geek
@head geek: the example shows that RAII can fail, and that's its only intent. Of course, you would never use this in real code. But not returning to the callee after sigint happens in real code: think about your code being called by other code you can't control at all and which handles sigint itself
David Cournapeau
RAII can help for thread lock, I agree, but is no panacea. I guess I am really concerned with the idea that RAII is a miracle solution, which magically prevents deadlock, memory leak, etc... It is definitely useful, but it is not a magic stick.
David Cournapeau
RAII has its flaws, without a doubt. But I don't see GC as solving them, it just swaps one set of flaws for another.
Head Geek
Yes, they have different flaws, that's called a trade-off :) But gc solves problems that RAII cannot solve (out of scope persistence, as in my example, assuming it does something other than exiting right away), so it is a very useful tool. Now, I am not sure it would be that useful for C++.
David Cournapeau
hmm, your example is just as bad with GC, except with GC even if the object was cleaned up at exit, it still wouldn't get its finaliser called (as the finalisation thread runs the 2nd time it's collected).
gbjbaanb
+3  A: 

What advantages could garbage collection offer an experienced C++ developer?

Not having to chase down resource leaks in your less-experienced colleagues' code.

JohnMcG
Resource leaks simply can't happen if you insist that everyone use RAII and smart pointers.
Head Geek
But establishing and enforcing those as rules has a cost, and just having them as guidelines does not mean they will always be followed.
JohnMcG
And it is quite easy to leak memory when using RAII and smart pointers anyway. For example, a signal handler which changes the code path and never returns to the callee: neither RAII nor smart pointers will help you in that case.
David Cournapeau
Resource Acquisition Is Initialization: http://en.wikipedia.org/wiki/Resource_acquisition_is_initialization
David Cournapeau
For what it's worth, even a smart mark and sweep garbage collector like .NET's can't prevent all resource leaks.
FlySwat
+6  A: 

The motivating factor for GC support in C++ appears to be lambda programming, anonymous functions etc. It turns out that lambda libraries benefit from the ability to allocate memory without caring about cleanup. The benefit for ordinary developers would be simpler, more reliable and faster compiling lambda libraries.

GC also helps simulate infinite memory; the only reason you need to delete PODs is that you need to recycle memory. If you have either GC or infinite memory, there is no need to delete PODs anymore.

MSalters
In other words, it's merely a crutch for inexperienced programmers? :-)
Head Geek
Only if you consider functional programming as something for inexperienced programmers, yes. gc is an extremely powerful tool, with a cost: like all powerful abstractions, it enables focusing on the problem at hand, but sometimes, it breaks and you have to go below the abstraction.
David Cournapeau
+2  A: 

It's an all-too-common error to assume that because C++ does not have garbage collection baked into the language, you can't use garbage collection in C++, period. This is nonsense. I know of elite C++ programmers who use the Boehm collector as a matter of course in their work.

tragomaskhalos
Yes, I've seen several add-on garbage collection libraries. I just don't see why they're necessary or desirable in most cases. The answers to this question gave me a (very) few cases where having it might be desirable, which is why I asked.
Head Geek
+1  A: 

I like RAII and smart pointers.

1. I don't like to declare types whenever I use a smart pointer.
2. I want to be able to collect intermediate values so I can write compact code.

I designed the GCPtr class specifically for the case of intermediate values:

//  gcptr.h
//  Created by "tydok" on 18-03-2009

#ifndef GARBAGE_COLLECT_POINTER_H__
#define GARBAGE_COLLECT_POINTER_H__

/* Syntactic sugar for automatic objects.
   Instead of GCPtr()(pointer) -> GCPTR(pointer)  */
#define GCPTR   GCPtr()

class GCPtr {
  // Prevent copying and assignment; not implemented
  GCPtr( const GCPtr& );
  GCPtr& operator=( const GCPtr& );

  void* _ptr;
  void (*_pdel)(void*);   // deleter instantiated for the stored pointer type

  // Deletes p as a T (T is itself a pointer type, e.g. char*).
  // Note: this uses delete, which is strictly only valid for memory from new.
  template<typename T>
  static void destroy(void* p) { delete static_cast<T>(p); }

public:
  GCPtr() {
    _ptr = 0;
    _pdel = 0;
  }
  ~GCPtr() {
    if (_pdel && _ptr) _pdel(_ptr);
  }

  template<typename T>
  T operator()(T ptr) {
    if (ptr == 0) {
      // A null argument releases whatever is currently held
      if (_pdel && _ptr) _pdel(_ptr);
      _ptr = 0;
      return 0;
    }
    _pdel = &GCPtr::destroy<T>;
    _ptr = ptr; // <--- Any previous _ptr value will be lost
    return ptr;
  }
};

#endif

Usage example:

#include <malloc.h>
#include <string.h>
#include <stdio.h>
#include "gcptr.h"

void
print_reversed_string( char* s )
{
  // Version 1, the usual way
  char* rs = _strrev(_strdup(s));
  printf("%s\n", rs);
  free(rs);

  // Version 2, using GCPtr
  // The allocated string will be deleted right after
  // printf is done.
  printf("%s\n", GCPTR(_strrev(_strdup(s))));
  // or
  //printf("%s\n", _strrev(GCPTR(_strdup(s))));
}

GCPtr's mechanism is simple. It stores the passed-in pointer until the GCPtr object is destroyed, at which point the pointer is deallocated.

GCPtr is meant for objects allocated with malloc and new, although strictly speaking only memory obtained from new may be released with delete, which is what the class uses. You can, of course, use as many auto GCPtr objects in function calls as you need.

Nick D
+2  A: 

There is one property of GC which may be very important in some scenarios. Assignment of a pointer is naturally atomic on most platforms, while creating thread-safe reference-counted ("smart") pointers is quite hard and introduces significant synchronization overhead. As a result, smart pointers are often said "not to scale well" on multi-core architectures.

Suma
That's a valid point, though not one I'd normally be worried about. When I do multithreaded programming, the threads rarely share their data structures.
Head Geek