I apologize if the answer to this question is trivial, but I still cannot figure this out by myself.

How does the garbage collector in .NET identify what objects on the heap are garbage and what objects are not?

Let's say a .NET application is running and at a certain point garbage collection occurs (let's leave out generations and the finalization queue for simplicity's sake).

Now the application may have:

  1. stack variables pointing to objects on the heap.
  2. registers containing addresses of objects on the heap.
  3. static variables pointing to objects on the heap.

This is how I ASSUME the GC works.

  1. It de-references each such address and ends up at the object on the heap.
  2. It marks the object as not garbage (by using the sync block index) since some variable is still pointing to it.
  3. It does this for all the addresses (referred to as roots for some reason in most articles).
  4. Now since the .NET runtime has information about the TYPE of each object, it can calculate the size of each object and hence the block of heap memory it occupies. For all the marked objects, it leaves the block of memory occupied as it is.
  5. The rest of the memory is freed and compacted, and if necessary the other objects are relocated (and their addresses updated).
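
The steps above can be sketched as a toy model. This is only an illustration under stated assumptions, not how the CLR is actually implemented: the heap is a plain dictionary keyed by made-up addresses, `HeapObject`, `mark` and `sweep` are hypothetical names, and a boolean field stands in for whatever mark bit the real runtime uses. One thing the sketch adds to the list: marking also has to follow the references stored inside each marked object, otherwise objects reachable only through other objects would be lost.

```python
from dataclasses import dataclass, field


@dataclass
class HeapObject:
    size: int                                  # bytes, derived from the object's type
    refs: list = field(default_factory=list)   # addresses stored in this object's fields
    marked: bool = False                       # stand-in for the real mark bit


def mark(heap, roots):
    """Mark every object reachable from the roots."""
    pending = list(roots)
    while pending:
        addr = pending.pop()
        obj = heap[addr]
        if obj.marked:       # already visited, so cycles terminate
            continue
        obj.marked = True
        pending.extend(obj.refs)


def sweep(heap):
    """Drop unmarked objects and return the number of bytes freed."""
    freed = 0
    for addr in [a for a, o in heap.items() if not o.marked]:
        freed += heap.pop(addr).size
    for obj in heap.values():
        obj.marked = False   # reset for the next collection
    return freed


# Hypothetical heap: addresses and sizes are invented for the example.
heap = {
    0x10: HeapObject(size=24, refs=[0x28]),
    0x28: HeapObject(size=16, refs=[0x10]),  # points back at 0x10 (a cycle)
    0x38: HeapObject(size=32),               # nothing points here
    0x58: HeapObject(size=8),
}
stack_roots, register_roots, static_roots = [0x10], [], [0x58]

mark(heap, stack_roots + register_roots + static_roots)
freed = sweep(heap)
```

Here the object at `0x38` is the only one no root can reach, so it is the only one removed; relocation of the survivors is left out of this sketch.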

Am I correct in my understanding?

+2  A: 

Read this article

C# Basic Concepts—Automatic Memory Management

rahul
A: 

To tell the truth, I know nothing about the specific garbage collector .NET uses, or about how the modern efficient garbage collectors work. However, I do know about one type of garbage collector, that the JVM used at one point.

In the JVM, each object has a header, a little bit of information attached at its beginning. Included in that information is a "reference count". When you store the object in a field or otherwise take a reference to it, that counter is incremented. Every time you step out of a frame holding X references to it, the counter is decreased by X. If you do this work whenever references are added and removed, it is easy to check whether an object is still needed. The garbage collector keeps a list of all the objects that were ever allocated. If any object has a reference count of 0, BAM, you know that it is not being used anywhere.
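
As a rough illustration of the counting scheme just described, here is a minimal sketch. It is a toy, not any real runtime's code: `Counted`, `ToyRuntime`, `add_ref` and `release` are hypothetical names, and the counts are adjusted by hand where a real runtime would do it automatically.

```python
class Counted:
    """Toy object with a reference count in its header."""

    def __init__(self, name):
        self.name = name
        self.ref_count = 0


class ToyRuntime:
    def __init__(self):
        self.allocated = []   # every object allocated so far

    def new(self, name):
        obj = Counted(name)
        self.allocated.append(obj)
        return obj

    def add_ref(self, obj):
        obj.ref_count += 1    # a variable or field now points at obj

    def release(self, obj):
        obj.ref_count -= 1    # a reference went out of scope

    def collect(self):
        """Free every object whose count is zero; return their names."""
        dead = [o.name for o in self.allocated if o.ref_count == 0]
        self.allocated = [o for o in self.allocated if o.ref_count > 0]
        return dead


rt = ToyRuntime()
a, b = rt.new("a"), rt.new("b")
rt.add_ref(a)     # a local variable points at a
rt.add_ref(b)     # another local points at b
rt.release(b)     # that local goes out of scope
dead = rt.collect()
```

One known weakness of plain counting: two objects that refer to each other keep each other's count above zero, so this scheme alone never frees them.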

Now, that sounds pretty inefficient, and I bet it is, but there are more efficient variations on this and other methods in use now. Truth be told, modern high-performance garbage collectors are extremely complex.

Jeremybub
I want to point out that the .NET GC doesn't work with reference counting.
Stormenet
-1 That may be how the JVM GC works, but the OP is asking about the .net GC. How is this helpful?
Simon P Stevens
The question wasn't originally tagged .net. It seemed he was just using .NET as an example of a garbage collector.
Jeremybub
Fredrik
+2  A: 

Here's a couple of useful articles:

LukeH
A: 

You are right in some cases. The GC looks through the heap pessimistically - i.e. it sets off assuming everything (in Generation 0) will be GCed.

It literally goes through everything on the heap in a first sweep called "marking", in which it checks whether anything is referencing each object. Since they are all reference types and some reference others, it navigates the references recursively. Don't worry - there is logic to avoid getting into an infinite loop!

If it finds that an object is referenced, it marks it by setting a flag within the object's sync block index. Whatever is left unmarked at the end is treated as garbage.

After going through every object on the heap, it will then begin a process called "compacting" which is when it shifts all of the remaining objects into the same area of memory, leaving the memory above clear. It will keep the objects of the same generation together as they are statistically more likely to be de-referenced at the same time.

This therefore will reduce the memory needed.
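
The compacting step can be pictured with a small sketch. It is a simplified model, not the CLR's algorithm: `compact` and `relocate` are hypothetical helpers, the heap is assumed to start at address 0, and objects are reduced to (address, size) pairs.

```python
def compact(live):
    """Slide live objects to the low end of the heap.

    live: list of (old_address, size) pairs for surviving objects.
    Returns (forwarding, next_free), where forwarding maps each old
    address to its new one and next_free is the first unused address.
    """
    forwarding = {}
    next_free = 0                       # assumed heap base for this sketch
    for old_addr, size in sorted(live):
        forwarding[old_addr] = next_free
        next_free += size
    return forwarding, next_free


def relocate(refs, forwarding):
    """Rewrite references so they point at the moved objects."""
    return [forwarding[r] for r in refs]


# Three survivors with gaps between them (made-up addresses and sizes).
live = [(0, 24), (64, 16), (128, 8)]
forwarding, next_free = compact(live)

# Roots that pointed at the old addresses must be updated too.
roots = relocate([64, 128], forwarding)
```

After the call, the survivors sit back to back and everything from `next_free` upward is one contiguous free region, which is why the next allocation can be a simple pointer bump.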

Garbage Collection doesn't necessarily speed up your program, but does allow it to re-use the space occupied by unused objects.

There are many articles on the subject. I personally like "CLR via C#" by Jeffrey Richter, which has an excellent chapter on how it works.

Dominic Zukiewicz
A: 

I'm currently reading this book to help with an independent study project in Garbage Collection at my university. If you really want to understand the ins-and-outs of Garbage Collection, I suggest reading this book because it seems to be the best one around. This most likely contains more information than what you're looking for, but it may be helpful if you want to write a Garbage Collector in the future.

Anthony Cuozzo