views:

6417

answers:

8

I'm not asking this question because of the merits of garbage collection first of all. My main reason for asking this is that I do know that Bjarne Stroustrup has said that C++ will have a garbage collector at some point in time.

With that said, why hasn't it been added? There are already some garbage collectors for C++. Is this just one of those "easier said than done" type things? Or are there other reasons it hasn't been added (and won't be added in C++0x)?

Cross links:

EDIT: Just to clarify, I understand the reasons why C++ didn't have a garbage collector when it was first created. I'm wondering why the collector can't be added in.

+24  A: 

What type? should it be optimised for embedded washing machine controllers, cell phones, workstations or supercomputers?
Should it prioritise gui responsiveness or server loading?
should it use lots of memory or lots of CPU?

C/c++ is used in just too many different circumstances. I suspect something like boost smart pointers will be enough for most users

Edit - Automatic garbage collectors aren't so much a problem of performance (you can always buy more server) it's a question of predicatable performance.
Not knowing when the GC is going to kick in is like employing a narcoleptic airline pilot, most of the time they are great - but when you really need responsiveness!

Martin Beckett
I definitely see your point, but I feel compelled to ask: isn't Java used in just about as many applications?
Jason Baker
No. Java is not suitable for high performance applications, for the simple reason that it doesn't have performance guarantees to the same extent as C++. So you'll find it in a cell phone, but you won't find it in a cell switch or supercomputer.
Zathrus
You can always buy more server, but you can't always buy more CPU for the cell phone already in the customer's pocket!
Crashworks
+28  A: 

Implicit garbage collection could have been added in, but it just didn't make the cut. Probably due to not just implementation complications, but also due to people not being able to come to a general consensus fast enough.

A quote from Bjarne Stroustrup himself (source):

I had hoped that a garbage collector which could be optionally enabled would be part of C++0x, but there were enough technical problems that I have to make do with just a detailed specification of how such a collector integrates with the rest of the language, if provided. As is the case with essentially all C++0x features, an experimental implementation exists.

Please see the above link for a more detailed discussion on why GC is hard.

There is also a good discussion of the topic here.

General overview:

C++ is very powerful and allows you to do almost anything. For this reason it doesn't automatically push many things onto you that might impact performance. Garbage collection can be easily implemented with smart pointers (objects that wrap pointers with a reference count, which auto delete themselves when the reference count reaches 0).

C++ was built with competitors in mind that did not have garbage collection. Efficiency was the main concern that C++ had to fend off criticism from in comparison to C and others.

There are 2 types of garbage collection...

Explicit garbage collection:

C++0x will have garbage collection via pointers created with shared_ptr

If you want it you can use it, if you don't want it you aren't forced into using it.

You can currently use boost:shared_ptr as well if you don't want to wait for C++0x.

Implicit garbage collection:

It does not have transparent garbage collection though. It will be a focus point for future C++ specs though.

Why Tr1 doesn't have implicit garbage collection?

There are a lot of things that tr1 of C++0x should have had, Bjarne Stroustrup in previous interviews stated that tr1 didn't have as much as he would have liked.

Brian R. Bondy
+4  A: 

The idea behind C++ was that you would not pay any performance impact for features that you don't use. So adding garbage collection would have meant having some programs run straight on the hardware the way C does and some within some sort of runtime virtual machine.

Nothing prevents you from using some form of smart pointers that are bound to some third-party garbage collection mechanism. I seem to recall Microsoft doing something like that with COM and it didn't go to well.

Uri
I don't think GC requires a VM. The compiler could add code to all pointer operations to update a global state, while a separate thread runs in the background deleting objects as needed.
I agree. You don't need a virtual machine, but the second you start having something manage your memory for you like that in the background, my feel is that you've left the actual "electric wires" and have sort of a VM situation.
Uri
+4  A: 

If you want automatic garbage collection, there are good commercial and public-domain garbage collectors for C++. For applications where garbage collection is suitable, C++ is an excellent garbage collected language with a performance that compares favorably with other garbage collected languages. See The C++ Programming Language (3rd Edition) for a discussion of automatic garbage collection in C++. See also, Hans-J. Boehm's site for C and C++ garbage collection. Also, C++ supports programming techniques that allows memory management to be safe and implicit without a garbage collector.

Source: http://www.research.att.com/~bs/bs_faq.html#garbage-collection

As for why it doesnt have it built in, If I remember correctly it was invented before GC was the thing, and I don't believe the language could of had GC for several reasons(I.E Backwards compatibilty with C)

Hope this helps.

Rayne
+3  A: 

To answer most "why" questions about C++, read Design and Evolution of C++

Nemanja Trifunovic
+13  A: 

One of the biggest reasons that C++ doesn't have built in garbage collection is that getting garbage collection to play nice with destructors is really, really hard. As far as I know, nobody really knows how to solve it completely yet. There are alot of issues to deal with:

  • deterministic lifetimes of objects (reference counting gives you this, but GC doesn't. Although it may not be that big of a deal).
  • what happens if a destructor throws when the object is being garbage collected? Most languages ignore this exception, since theres really no catch block to be able to transport it to, but this is probably not an acceptable solution for C++.
  • How to enable/disable it? Naturally it'd probably be a compile time decision but code that is written for GC vs code that is written for NOT GC is going to be very different and probably incompatible. How do you reconcile this?

These are just a few of the problems faced.

Greg Rogers
GC and destructors is a solved problem, by a nice sidestep from Bjarne. Destructors don't run during GC, because that's not the point of GC. GC in C++ exists to create the notion of infinite _memory_, not infinite other resources.
MSalters
If destructors don't run that completely changes the semantics of the language. I guess at the very least you'd need a new keyword "gcnew" or something so that you explicitly allow this object to be GC'ed (and therefore you shouldn't use it to wrap resources besides memory).
Greg Rogers
+12  A: 

To add to the debate here.

There are known issues with garbage collection, and understanding them helps understanding why there is none in C++.

1. Performance ?

The first complaint is often about performance, but most people don't really realize what they are talking about. As illustrated by Martin Beckett the problem may not be performance per se, but the predictability of performance.

There are currently 2 families of GC that are widely deployed:

  • Mark-And-Sweep kind
  • Reference-Counting kind

The Mark And Sweep is faster (less impact on overall performance) but it suffers from a "freeze the world" syndrom: ie when the GC kicks in, everything else is stopped until the GC has made its cleanup. If you wish to build a server that answers in a few milliseconds... some transactions will not live up to your expectations :)

The problem of Reference Counting is different: reference-counting adds overhead, especially in Multi-Threading environments because you need to have an atomic count. Furthermore there is the problem of reference cycles so you need a clever algorithm to detect those cycles and eliminate them (generally implement by a "freeze the world" too, though less frequent). In general, as of today, this kind (even though normally more responsive or rather, freezing less often) is slower than the Mark And Sweep.

I have a seen a paper by Eiffel implementers that were trying to implement a Reference Counting Garbage Collector that would have a similar global performance to Mark And Sweep without the "Freeze The World" aspect. It required a separate thread for the GC (typical). The algorithm was a bit frightening (at the end) but the paper made a good job of introducing the concepts one at a time and showing the evolution of the algorithm from the "simple" version to the full-fledged one. Recommended reading if only I could put my hands back on the PDF file...

2. Resources Acquisition Is Initialization

It's a common idiom in C++ that you will wrap the ownership of resources within an object to ensure that they are properly released. It's mostly used for memory since we don't have garbage collection, but it's also useful nonetheless for many other situations:

  • locks (multi-thread, file handle, ...)
  • connections (to a database, another server, ...)

The idea is to properly control the lifetime of the object:

  • it should be alive as long as you need it
  • it should be killed when you're done with it

The problem of GC is that if it helps with the former and ultimately guarantees that later... this "ultimate" may not be sufficient. If you release a lock, you'd really like that it be released now, so that it does not block any further calls!

Languages with GC have two work arounds:

  • don't use GC when stack allocation is sufficient: it's normally for performance issues, but in our case it really helps since the scope defines the lifetime
  • using construct... but it's explicit (weak) RAII while in C++ RAII is implicit so that the user CANNOT unwittingly make the error (by omitting the using keyword)

3. Smart Pointers

Smart pointers often appear as a silver bullet to handle memory in C++. Often times I have heard: we don't need GC after all, since we have smart pointers.

One could not be more wrong.

Smart pointers do help: auto_ptr and unique_ptr use RAII concepts, extremely useful indeed. They are so simple that you can write them by yourself quite easily.

When one need to share ownership however it gets more difficult: you might share among multiple threads and there are a few subtle issues with the handling of the count. Therefore, one naturally goes toward shared_ptr.

It's great, that's what Boost for after all, but it's not a silver bullet. In fact, the main issue with shared_ptr is that it emulates a GC implemented by Reference Counting but you need to implement the cycle detection all by yourself... Urg

Of course there is this weak_ptr thingy, but I have unfortunately already seen memory leaks despite the use of shared_ptr because of those cycles... and when you are in a Multi Threaded environment, it's extremely difficult to detect!

4. What's the solution ?

There is no silver bullet, but as always, it's definitely feasible. In the absence of GC one need to be clear on ownership:

  • prefer having a single owner at one given time, if possible
  • if not, make sure that your class diagram does not have any cycle pertaining to ownership and break them with subtle application of weak_ptr

So indeed, it would be great to have a GC... however it's no trivial issue. And in the mean time, we just need to roll up our sleeves.

Matthieu M.
I wish I could accept two answers! This is just great. One thing to point out, in regards to performance, the GC that runs in a separate thread is actually pretty common (it's used in Java and .Net). Granted, that might not be acceptable in embedded systems.
Jason Baker
thanks, very useful explanation.
bialix
Only two types? How 'bout copying collectors? Generational collectors? Assorted concurrent collectors (including Baker's hard real-time treadmill)? Various hybrid collectors? Man, the sheer ignorance in the industry of this field astonishes me sometimes.
JUST MY correct OPINION
Did I say there were only 2 types ? I said that there were 2 that were widely deployed. As far as I know Python, Java and C# all use Mark and Sweep algorithms now (Java used to have a reference counting algorithm). To be even more precise, it seems to me than C# uses Generational GC for minor cycles, Mark And Sweep for major cycles and Copying to fight off memory fragmentation; though I would argue that the heart of the algorithm is Mark And Sweep. Do you know any mainstream language that uses another technology ? I'm always happy to learn.
Matthieu M.
You just named one mainstream language that used three.
JUST MY correct OPINION
+6  A: 

I'll probably get voted down for this. It's a joke about C++ :D

Q: Why doesn't C++ have Garbage Collection?

A: Because then there would be nothing left.

Brock Woolf
yep, it's a wicked joke :-D
bialix