Note: This answer is way too long. I'll pare it down sometime. Meanwhile, comment if you can think of useful edits.
To answer your questions, we first need to define two areas of memory called the stack and the heap.
The stack
Imagine the stack as a stack of boxes. Each box represents the execution of a function. At the beginning, when main is called, there is one box sitting on the floor. Any local variables you define are in that box.
A simple example
int main(int argc, char * argv[])
{
    int a = 3;
    int b = 4;
    return a + b;
}
In this case, you have one box on the floor with the variables argc (an integer), argv (a pointer to an array of char pointers), a (an integer), and b (an integer).
More than one box
int do_stuff(int a, int b);  // declared before main so the call compiles

int main(int argc, char * argv[])
{
    int a = 3;
    int b = 4;
    return do_stuff(a, b);
}

int do_stuff(int a, int b)
{
    int c = a + b;
    c++;
    return c;
}
Now, you have a box on the floor (for main) with argc, argv, a, and b. On top of that box, you have another box (for do_stuff) with a, b, and c.
This example illustrates two interesting effects.
- As you probably know, a and b were passed by value. That's why there is a copy of those variables in the box for do_stuff.
- Notice that you don't have to free or delete or anything for these variables. When your function returns, the box for that function is destroyed.
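Both effects can be checked directly. The following is only a minimal sketch: add_one and demo_pass_by_value are hypothetical names, not part of the example above.

```cpp
#include <cassert>

// Hypothetical helper: n is a COPY of the caller's argument (pass by value).
int add_one(int n)
{
    n++;          // changes only the copy in add_one's box
    return n;
}                 // add_one's box is destroyed here; nothing to free

// Calls add_one and checks that the caller's variable is untouched.
int demo_pass_by_value()
{
    int a = 3;
    int result = add_one(a);
    assert(a == 3);       // the caller's a is unchanged
    assert(result == 4);  // the modified copy came back as the return value
    return a;
}
```

Calling demo_pass_by_value() returns 3, because the increment only ever touched the copy.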
Box overflow
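int do_stuff(int a, int b);  // declared before main so the call compiles

int main(int argc, char * argv[])
{
    int a = 3;
    int b = 4;
    return do_stuff(a, b);
}

int do_stuff(int a, int b)
{
    return do_stuff(a, b);  // calls itself forever: there is no base case
}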
Here, you have a box on the floor (for main, as before). Then, you have a box (for do_stuff) with a and b. Then, you have another box (for do_stuff calling itself), again with a and b. And then another. And soon, you have a stack overflow.
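The recursion above never ends because nothing changes between calls. A minimal sketch of the usual remedy, a base case that stops the pile of boxes from growing, might look like this; count_frames is a hypothetical name chosen for illustration.

```cpp
#include <cassert>

// Hypothetical bounded recursion: every call adds one box to the stack,
// but the base case guarantees that the boxes eventually come back off.
int count_frames(int remaining)
{
    assert(remaining >= 0);  // a negative count would never reach the base case
    if (remaining == 0) {
        return 0;            // base case: no further boxes are stacked
    }
    return 1 + count_frames(remaining - 1);  // one more box on top
}
```

Here count_frames(5) stacks five extra boxes and returns 5 once they have all been removed again.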
Summary of the stack
Think of the stack as a stack of boxes. Each box represents a function executing, and that box contains the local variables defined in that function. When the function returns, that box is destroyed.
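One way to watch boxes being created and destroyed is to put an object with a destructor in each function. This is only a sketch: Tracer, frame_events, inner, outer, and run_demo are hypothetical names. C++ guarantees the order of the destructor calls; whether the compiler really pushes a separate frame for each call is up to it.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical event log, used only to make the lifetime of each box visible.
std::vector<std::string> frame_events;

// A local object that records when it is created and when it is destroyed.
struct Tracer
{
    std::string name;

    explicit Tracer(const std::string & n) : name(n)
    {
        frame_events.push_back("enter " + name);
    }

    ~Tracer()
    {
        frame_events.push_back("leave " + name);
    }
};

void inner()
{
    Tracer t("inner");  // lives in inner's box
}                       // inner returns: its box, and t with it, is destroyed

void outer()
{
    Tracer t("outer");  // lives in outer's box
    inner();            // a second box goes on top, then comes off again
}

// Runs the demo and returns the recorded order of events.
std::vector<std::string> run_demo()
{
    frame_events.clear();
    outer();
    assert(frame_events.size() == 4);  // two boxes, each entered and left once
    return frame_events;
}
```

The recorded order is "enter outer", "enter inner", "leave inner", "leave outer": the box on top is always the first one to go.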
More technical stuff
- Each "box" is officially called a stack frame.
- Ever notice how your uninitialized local variables have "random" values? When an old stack frame is "destroyed", it just stops being relevant. It doesn't get zeroed out or anything like that. The next time a stack frame uses that section of memory, you see bits of the old stack frame in your local variables. (Reading an uninitialized variable is undefined behavior in C and C++, so never rely on what you find there.)
The heap
This is where dynamic memory allocation comes into play.
Imagine the heap as an endless green meadow of memory. When you call malloc or new, a block of memory is allocated in the heap. You are given a pointer to access this block of memory.
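int main(int argc, char * argv[])
{
    int * a = new int(3);  // allocate one int on the heap and initialize it
    int value = *a;        // read it through the pointer
    delete a;              // give the memory back to the heap
    return value;
}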
Here, a new integer's worth of memory is allocated on the heap. You get a pointer named a that points to that memory. The pointer a itself is a local variable, so it lives in main's "box"; only the integer it points to lives on the heap.
Rationale for dynamic memory allocation
Sure, using dynamically allocated memory seems to waste a few bytes here and there for pointers. However, there are things that you just can't (easily) do without dynamic memory allocation.
Returning an array
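int * create_array();  // declared before main so the call compiles

int main(int argc, char * argv[])
{
    int * intarray = create_array();
    return intarray[0];  // BUG: reads memory from a destroyed stack frame
}

int * create_array()
{
    int intarray[5];     // lives in create_array's stack frame
    intarray[0] = 0;
    return intarray;     // BUG: returns a pointer to a local array
}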
What happens here? You "return an array" in create_array. In actuality, you return a pointer, which just points to the part of the create_array "box" that contains the array. What happens when create_array returns? Its box is destroyed, and you can expect your array to become corrupt at any moment.
Instead, use dynamically allocated memory.
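int * create_array();  // declared before main so the call compiles

int main(int argc, char * argv[])
{
    int * intarray = create_array();
    int return_value = intarray[0];
    delete[] intarray;   // release the heap memory once we are done with it
    return return_value;
}

int * create_array()
{
    int * intarray = new int[5];  // lives on the heap, not in this stack frame
    intarray[0] = 0;
    return intarray;
}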
Because returning from a function does not touch the heap, your precious intarray escapes unscathed. Remember to delete[] it after you're done, though.
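If forgetting the delete[] worries you, one common alternative (not what the example above uses) is to let a standard container own the heap memory. This is a minimal sketch using std::vector; create_vector and first_element are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Hypothetical counterpart to create_array: the elements live on the heap,
// while the small vector object that owns them is returned by value.
std::vector<int> create_vector()
{
    std::vector<int> values(5);  // five ints, each initialized to 0
    values[0] = 42;
    return values;               // ownership of the heap block goes to the caller
}

int first_element()
{
    std::vector<int> values = create_vector();
    assert(values.size() == 5);
    return values[0];
}                                // values' destructor frees the heap memory here
```

The vector object sits in the caller's box, so when that box is destroyed the heap memory is released automatically and no delete[] is needed.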