So if I understand correctly, garbage collection automatically deallocates objects that the program no longer uses, like the garbage collector in Java.

I hear that in languages like C, which don't support garbage collection, programs can have memory leaks and eventually exhaust the available memory.

So what are the errors that programmers make in languages like C that don't support garbage collection? I would guess not deallocating objects once they're no longer used. But are those the only errors we can make because of the lack of a garbage collector?

+7  A: 

Well, the errors you can make are:

  • Not deallocating things you don't need
  • Deallocating things you do need

There are other errors you can make, but those are the ones that relate specifically to garbage collection.

Noon Silk
+2  A: 

In C, you have to manually call free on memory allocated with malloc. While this doesn't sound so bad, it can get very messy when dealing with separate data structures (like linked lists) that point to the same data. You could end up accessing freed memory or double-freeing memory, both of which cause errors and can introduce security vulnerabilities.

Additionally, in C++, you need to be careful of mixing new[]/delete and new/delete[].

For example, memory management is something that requires the programmer to know exactly why

const char *getstr() { return "Hello, world!"; }

is just fine but

const char *getstr() {
    char x[BUF_SIZE];
    fgets(x, BUF_SIZE, stdin);
    return x;
}

is a very bad thing.
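
The difference is storage duration: the string literal lives for the whole program, while `x` is a local array that stops existing when the function returns. One possible fix, sketched below, is to read into heap memory and hand ownership to the caller. The names `getstr_heap` and `demo_getstr_heap`, the `FILE *` parameter, and the value of `BUF_SIZE` are assumptions made for this illustration, not part of the original snippet.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BUF_SIZE 128  /* assumed value; the snippet above leaves BUF_SIZE undefined */

/* Reads one line from `in` into heap memory that outlives this call.
 * Returns NULL if allocation fails or nothing could be read.
 * The caller owns the result and must free() it. */
char *getstr_heap(FILE *in) {
    char *x = malloc(BUF_SIZE);
    if (x == NULL) {
        return NULL;
    }
    if (fgets(x, BUF_SIZE, in) == NULL) {
        free(x);  /* don't leak the buffer on the failure path */
        return NULL;
    }
    return x;
}

/* Hypothetical round trip through a temporary file standing in for stdin.
 * Returns 1 on success, 0 on mismatch, -1 if no temporary file is available. */
int demo_getstr_heap(void) {
    FILE *f = tmpfile();
    if (f == NULL) {
        return -1;
    }
    fputs("Hello, world!\n", f);
    rewind(f);
    char *s = getstr_heap(f);
    int ok = (s != NULL && strcmp(s, "Hello, world!\n") == 0);
    free(s);  /* free(NULL) is fine if the read failed */
    fclose(f);
    return ok;
}
```

The cost of this design is that every caller now has to remember the matching `free`, which is exactly the bookkeeping burden this thread is about. Having the caller pass in its own buffer is another common option.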

Andrew Keeton
Keep in mind any mature C++ programmer shudders at the use of raw memory. Wrap it up, do away with it.
GMan
+3  A: 

In addition to what silky says, you can also deallocate something twice.

Alex Gaynor
+17  A: 
  • Deallocating things you need

  • Not deallocating things you no longer need (because you're not tracking allocations/use/frees well)

  • Re-allocating new instances of things that already exist (a side-effect of not tracking properly)

  • De-allocating something you've already freed

  • De-allocating something that doesn't exist (a null pointer)

There are probably more. The point is: managing memory is tricky, and is best dealt with using some sort of tracking mechanism and allocating/freeing abstraction. As such, you might as well have that built into your language, so it can make it nice and easy for you. Manual memory management isn't the end of the world -- it's certainly doable -- but these days, unless you're writing real-time code, hardware drivers, or (maybe, possibly) the ultra-optimised core code of the latest game, then manual effort isn't worth it, except as an academic exercise.
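
As a rough illustration of the "already freed" case, here is one defensive idiom some C code uses: free the pointer and reset it to NULL in one step. The macro name `FREE_AND_NULL` and the demo function are made up for this sketch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: frees p and resets it to NULL, so a repeated call
 * turns into free(NULL), which the C standard defines as a no-op. */
#define FREE_AND_NULL(p) do { free(p); (p) = NULL; } while (0)

/* Returns 1 if the pointer ends up NULL, -1 if allocation failed. */
int demo_free_and_null(void) {
    int *p = malloc(sizeof *p);
    if (p == NULL) {
        return -1;
    }
    *p = 42;
    FREE_AND_NULL(p);
    FREE_AND_NULL(p);  /* harmless: p is already NULL here */
    return p == NULL;
}
```

This only protects the one variable you pass in. Any other pointer that still refers to the same block is left dangling, so it reduces the problem rather than removing it.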

Lee B
Oh, one more issue: manually managing array length (resizing the array as you load more items into it, etc.) is tedious at best. Increasing by one item each time is inefficient, so you tend to have to start tracking slots used, actual slots allocated, slots needed, etc. And you really don't want to have to start de-allocating stuff from the middle of that array and then shrinking it. Compared to the effortless arrays and dicts of a modern language like Python, it's really low-level stuff.
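
For a sense of what that tracking looks like, here is a minimal sketch of a growable int array that keeps "slots used" and "slots allocated" and doubles the allocation when it fills up. The type and function names (`IntVec`, `intvec_push`, `intvec_free`) and the starting capacity of 4 are assumptions for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    int   *items;
    size_t used;      /* slots in use */
    size_t capacity;  /* slots allocated */
} IntVec;

/* Appends value, doubling the allocation when the array is full.
 * Returns 0 on success, -1 if memory could not be obtained. */
int intvec_push(IntVec *v, int value) {
    if (v->used == v->capacity) {
        size_t new_cap = v->capacity ? v->capacity * 2 : 4;
        int *grown = realloc(v->items, new_cap * sizeof *v->items);
        if (grown == NULL) {
            return -1;  /* v->items is untouched and still valid */
        }
        v->items = grown;
        v->capacity = new_cap;
    }
    v->items[v->used++] = value;
    return 0;
}

/* Releases the storage and resets the struct to an empty state. */
void intvec_free(IntVec *v) {
    free(v->items);
    v->items = NULL;
    v->used = 0;
    v->capacity = 0;
}
```

Start from `IntVec v = {0};`, call `intvec_push(&v, x)` as items arrive, and finish with `intvec_free(&v)`. Removing from the middle and shrinking are left out, and they are where it gets more tedious.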
Lee B
In reference to your last point. De-allocating NULL is completely fine. `free(NULL)` is guaranteed to be completely safe by the standard.
Evan Teran
Low-level languages like C are for a LOT more than just ultra-optimised code or drivers. Some things are just more efficiently done in C, and there's a reason it's still the most popular language (at least for OSS projects).
kmm
@kmm: Of course, but then all of those things are more efficiently done in hand-crafted assembly that bypasses the OS, by experts who know the processor, ram, chipset, hard drive geometry, etc. inside-out. The question is not whether it's more efficient, but whether it's reasonable to spend the extra time tracking all that, for relatively small gains in efficiency, given the option of more rapid development in a higher-level tool.
Lee B
+1  A: 

In addition to other comments, manual memory management makes certain high performance concurrent algorithms more difficult.

Tom Hawtin - tackline
Garbage collection makes certain other high performance concurrent algorithms more difficult, though the performance issues tend to get blamed on the garbage collector which is having to search for garbage that would have been trivially handled using memory management. Religious wars aside, it really is swings and roundabouts.
Steve314
oops - that's "... trivially handled using *manual* memory management" of course.
Steve314
A: 

Another common error is reading or writing memory after you've deallocated it (memory which has since been reallocated and is now being used for something else, or memory which hasn't been reallocated yet and which is therefore still owned by the heap manager and not by your application).

ChrisW
+2  A: 

Some non-GC languages offer constructs called reference-counting smart pointers. These try to get around some problems, such as forgetting to deallocate memory or trying to access invalid memory, by automating some of the management functions.

As some have said, you have to be "smart" about "smart pointers". Smart pointers help to avoid a whole class of problems, but introduce their own class of problems.

Many smart pointers can create memory leaks by:

  • cycles or circular reference (A points to B, B points to A).
  • bugs in the smart pointer implementation (rare on mature libraries like Boost)
  • mixing raw pointers with smart pointers
  • thread safety
  • improperly attaching to or detaching from a raw pointer

These problems shouldn't be encountered in fully GC'ed environments.
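
To make the cycle case concrete, here is a hand-rolled reference count in plain C (not a real smart pointer library) showing why "A points to B, B points to A" never gets freed. Everything here (`Node`, `node_new`, `node_release`, `node_set_other`, the `live_nodes` counter, and the two demo functions) is hypothetical and exists only for this sketch.

```c
#include <assert.h>
#include <stdlib.h>

static int live_nodes = 0;  /* nodes allocated and not yet freed */

typedef struct Node {
    int refcount;
    struct Node *other;     /* counted (owning) reference */
} Node;

Node *node_new(void) {
    Node *n = malloc(sizeof *n);
    if (n == NULL) return NULL;
    n->refcount = 1;        /* the creator holds the first reference */
    n->other = NULL;
    live_nodes++;
    return n;
}

void node_release(Node *n) {
    if (n == NULL) return;
    if (--n->refcount == 0) {
        node_release(n->other);  /* drop the reference this node held */
        free(n);
        live_nodes--;
    }
}

void node_set_other(Node *n, Node *target) {
    if (target != NULL) target->refcount++;
    node_release(n->other);
    n->other = target;
}

/* a -> b, no cycle: dropping our references frees both. Returns nodes left alive. */
int demo_chain(void) {
    int before = live_nodes;
    Node *a = node_new(), *b = node_new();
    if (!a || !b) { node_release(a); node_release(b); return -1; }
    node_set_other(a, b);
    node_release(b);        /* b survives: a still references it */
    node_release(a);        /* a hits zero, which releases b too */
    return live_nodes - before;
}

/* a <-> b: each keeps the other's count at 1. Returns nodes left alive. */
int demo_cycle(void) {
    int before = live_nodes;
    Node *a = node_new(), *b = node_new();
    if (!a || !b) { node_release(a); node_release(b); return -1; }
    node_set_other(a, b);
    node_set_other(b, a);
    node_release(a);        /* 2 -> 1, not freed */
    node_release(b);        /* 2 -> 1, not freed */
    int leaked = live_nodes - before;
    /* A real program has lost these pointers by now. Raw copies are kept
     * here only so the demo can tidy up after measuring the leak. */
    free(a);
    free(b);
    live_nodes -= 2;
    return leaked;
}
```

`demo_chain()` leaves nothing alive, while `demo_cycle()` reports two nodes that reference counting alone never reclaims. The usual way out is to make one direction of the link non-owning (a "weak" reference) or to break the cycle explicitly before dropping the last outside reference.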

James Schek
so do some GC languages: http://blogs.msdn.com/b/bclteam/archive/2005/03/16/396900.aspx
gbjbaanb
A: 

Please don't compare OO languages (Java, C#) with non-OO languages (C) when talking about garbage collection. OO languages (mostly) allow you to implement GC (see the comment about smart pointers). Yes, they are not easy, but they help a lot, and they are deterministic.

Also, how do GC languages compare to non-GC languages when considering resources other than memory, e.g. files, network connections, DB connections, etc.?

I think answering that question, left to the reader, will shed some light on things too.

Richard
There's GC for C as well (check out Boehm GC). OO vs. non-OO is not particularly important. Pretty much all functional languages have GC, whether or not it's a functional-OO hybrid. Also, smart pointers are kind of poor man's GC.
Chuck
Specifically, smart pointer schemes don't deal with garbage cycles.
Stephen C
I did forget about functional languages. Thanks for the tip.
Richard
SP != GC. GC generally involves an active process of finding and reclaiming unused memory. SP is a passive mechanism that extends RAII semantics to the heap. They are both forms of automatic memory management, but garbage collection is distinctly different from smart pointers. The regular "pointers" (i.e. references) in Java are not "smart" in any way like C++ smart pointers.
James Schek
A: 

Usually, languages with Garbage Collection restrict the programmer's access to memory, and rely on a memory model where objects contain:

  • reference counters - the GC uses this to know when an object is unused, and
  • type and size information - to eliminate buffer overruns (and help reduce other bugs).

In comparison with a non-GC language, there are two classes of errors that are reduced/eliminated by the model and the restricted access:

  1. Memory Model errors, such as:

    • memory leaks (failure to deallocate when done),
    • freeing memory more than once,
    • freeing memory that was not allocated (like global or stack variables),
  2. Pointer errors, such as:

    • Uninitialized pointer, with "left over" bits from previous use,
    • Accessing, especially writing to, memory after freeing (nasty!)
    • Buffer overrun errors,
    • Use of memory as wrong type (by casting).

There are more, but those are the big ones.

NVRAM
+10  A: 

IMO, garbage collected languages have complementary problems to those in non-garbage-collected languages. For every issue, there is a non-GC-characteristic bug and a GC-characteristic bug - a non-GC programmer responsibility and a GC programmer responsibility.

GC programmers may believe that they are relieved of responsibility for freeing objects, but objects hold resources other than memory - resources that often need to be released in a timely way so that they can be acquired elsewhere - e.g. file handles, record locks, mutexes...

Where a non-GC programmer would have a dangling reference (and very often one that isn't a bug, since some flag or other state would mark it as not to be used), a GC programmer has a memory leak. Thus where the non-GC programmer is responsible for ensuring that free/delete is called appropriately, a GC programmer is responsible for ensuring that unwanted references are nulled or otherwise disposed of appropriately.

There is a claim in here that smart pointers don't deal with garbage cycles. This need not be true - there are reference counting schemes that can break cycles and which also ensure timely disposal of garbage memory, and at least one Java implementation used (and may still do) a reference counting scheme that could just as easily be implemented as a smart pointer scheme in C++.

Concurrent Cycle Collection in Reference Counted Systems

Of course this isn't normally done - partly because you may as well just use a GC language, but also partly IMO because it would break key conventions in C++. You see, lots of C++ code - including the standard library - relies heavily on the Resource Acquisition Is Initialisation (RAII) convention, and that relies on reliable and timely destructor calls. In any GC that copes with cycles, you simply cannot have that. When breaking a garbage cycle, you cannot know which destructor to call first without any dependency issues - it may not even be possible, since there may be more cyclic dependencies than just memory references. The solution - in Java etc., there is no guarantee that finalizers will be called. Garbage collection only collects one very specific kind of garbage - memory. All other resources must be cleaned up manually, as they would have been in Pascal or C, and without the advantage of reliable C++-style destructors.

End result - a lot of cleanup that gets "automated" in C++ has to be done manually in Java, C# etc. Of course "automated" needs the quotes because the programmer is responsible for ensuring that delete is called appropriately for any heap-allocated objects - but then in GC languages, there are different but complementary programmer responsibilities. Either way, if the programmer fails to handle those responsibilities correctly, you get bugs.

Frankly, switching from non-GC to GC (or vice versa) is no magic wand. It may make the usual suspect problems go away, but that just means you need new skillsets to prevent (and debug) a whole new set of suspects.

A good programmer should get past the whose-side-are-you-on BS and learn to handle both.

Steve314