views:

227

answers:

5
+2  Q: 

Garbage Collection

Hi All,

I am not able to understand few things on the Garbage collection.

Firstly, how is data allocated space ? i.e. on stack or heap( As per my knowledge, all static or global variables are assigned space on stack and local variables are assigned space on heap).

Second, GC runs on data on stacks or heaps ? i.e a GC algorithm like Mark/Sweep would refer to data on stack as root set right? And then map all the reachable variables on heap by checking which variables on heap refer to the root set.

What if a program does not have a global variable? How does the algorithm work then?

Regards, darkie

+6  A: 

It might help to clarify what platform's GC you are asking about - JVM, CLR, Lisp, etc. That said:

First to take a step back, certain local variables of are generally allocated on the stack. The specifics can vary by language, however. To take C# as an example, only local Value Types and method parameters are stored on the stack. So, in C#, foo would be allocated on the stack:

public function bar() { 
    int foo = 2;
    ...
}

Alternatively, dynamically-allocated variables use memory from the heap. This should intuitively make sense, as otherwise the stack would have to grow dynamically each time a new is called. Also, it would mean that such variables could only be used as locals within the local function that allocated them, which is of course not true because we can have (for example) class member variables. So to take another example from C#, in the following case result is allocated on the heap:

public class MyInt
{         
    public int MyValue;
}

...
MyInt result = new MyInt();
result.MyValue = foo + 40;
...

Now with that background in mind, memory on the heap is garbage-collected. Memory on the stack has no need for GC as the memory will be reclaimed when the current function returns. At a high level, a GC algorithm works by keeping track of all objects that are dynamically allocated on the heap. Once allocated via new, the object will be tracked by GC, and collected when it is no longer in scope and there are no more references to it.

Justin Ethier
Thank you very much. This makes things a lot easier to understand
darkie15
"To take C#, only Value Types are stored on the stack." - this is not strictly correct. Quoting "C# In Depth", page 52: "...a variable's value lives where it's declared - so if you have a a class with an instance variable of type int, this variable's value for any given object will always be where the rest of the data for the object is - on the heap...Only local variables and method parameters live on the stack..."
duffymo
Right, I meant that only local variables that are Value Types will be stored on the stack. I'll update the answer to make that more clear...
Justin Ethier
@Justin:When you said: "This should intuitively make sense, as otherwise the stack would have to grow dynamically each time a new was called. Also, it would mean that such variables could only be used within the local function that allocated them, which is of course not true"... did you mean this only for C#?Here is the piece of code written in Java :public class StringCheck{ public static void main(String[] args) { callingOne(); callingTwo(); } public static void callingOne() { Integer temp = new Integer("1"); }
darkie15
public static void callingTwo() { System.out.println("Accessing temp in callingTwo: "+ temp); }}Doesnt seem to work. Getting an error : StringCheck.java:18: cannot find symbolsymbol : variable templocation: class StringCheck System.out.println("Accessing temp in callingTwo: "+ temp);
darkie15
No, but it depends on the context. In your example, temp is a local variable so it is only accessible from callingOne. To make it accessible in both places you would either need to make it a class member variable (and use non-static methods) or pass it as a parameter to callingTwo.
Justin Ethier
+2  A: 

Check out the book Garbage Collection: algorithms for automatic dynamic memory management.

lhf
Well I cannot access the contents of the book. Thanks anyways
darkie15
+2  A: 

Firstly, how is data allocated space ? i.e. on stack or heap( As per my knowledge, all static or global variables are assigned space on stack and local variables are assigned space on heap).

No, stack variables are method calls and local variables. A stack frame is created when the method is called and popped off when it returns.

Memory in Java and C# is allocated on the heap by calling "new".

Second, GC runs on data on stacks or heaps ? i.e a GC algorithm like Mark/Sweep would refer to data on stack as root set right? And then map all the reachable variables on heap by checking which variables on heap refer to the root set.

GC is used on the heap.

Mark and sweep would not be considered a cutting edge GC algorithm. Both Java and .NET GC use generational models now.

What if a program does not have a global variable? How does the algorithm work then?

What does "global variable" mean in languages like Java and C# where everything belongs to a class?

The root of the object graph is arbitrary. I'll admit that I don't know how it's chosen.

duffymo
Thank you so much .. that explains a lot to me.
darkie15
+1  A: 

Read this article. It is a very good survey on uniprocessor garbage collection techniques. It will give you the basic understanding and terminology on GC. Then, follow up with the Jones and Lins book "Garbage Collection: Algorithms for Automatic Dynamic Memory Management". Contrary to the survey article I point to above, the book is not available for free on the Web; you have to buy it; but it is worth it.

Thomas Pornin
A: 

Richard and Carl have a very nice show on the Windows Memory Model, including the .NET model and GC, in their .NET Rocks! archives:

Jeffrey Richter on the Windows Memory Model

Rune