What exactly happens, in terms of memory, when I declare something like: char arr[4];

How many bytes are reserved for arr?

How is the null terminator accommodated when I 'strcpy' a string of length 4 into arr?

I was writing a socket program, and when I tried to suffix NULL at arr[4] (i.e. the 5th memory location), I ended up replacing the values of some other variables of the program (overflow) and got into a big time mess.

Any descriptions of how compilers (gcc is what I used) manage memory?

+2  A: 

sizeof(arr) bytes are saved* (plus any padding the compiler wants to put around it, though that isn't for the array per se). On an implementation with a stack, this just means moving the stack pointer sizeof(arr) bytes down. (That's where the storage comes from. This is also why automatic allocation is fast.)

'\0' isn't accommodated. If you copy "abcd" into it, you get a buffer overrun, because that takes up 5 bytes total, but you only have 4. You enter undefined behavior land, and anything could happen.

In practice you'll corrupt the stack and crash sooner or later, or experience what you did and overwrite nearby variables (because they too are allocated just like the array was.) But nobody can say for certain what happens, because it's undefined.

* Which is sizeof(char) * 4. sizeof(char) is always 1, so 4 bytes.

GMan
+1 for explicitly pointing out that sizeof(char) always equals 1.
Jerry Coffin
+1  A: 

In C, what you ask for is--usually--exactly what you get. char arr[4] is exactly 4 bytes.

But any string literal in double quotes has a 'hidden' null terminator added at the end, so char arr[] = "oops"; reserves 5 bytes.

Thus, if you do this:

char arr[4];
strcpy(arr, "oops");

...you will copy 5 bytes (o o p s \0) when you've only reserved space for 4. Whatever happens next is unpredictable and often catastrophic.

egrunin
Note that while `char arr[] = "oops";` reserves the extra byte for the trailing NUL, it's possible to suppress this by using `char arr[4] = "oops";` instead. But of course it won't work as a C string without the NUL… =)
Arkku
Minor nit: you should use `NUL` or `'\0'` to refer to the terminator, not `NULL` (the null pointer constant).
jamesdlin
@Arkku: I don't know what the compiler will do with that (and I suspect the behavior is undefined). @jamesdlin: fixed.
egrunin
@egrunin: `char arr[4] = "oops";` is defined in standard C; it's equivalent to `char arr[4] = {'o','o','p','s'};`. See [the C FAQ](http://c-faq.com/ansi/nonstrings.html).
Arkku
@Arkku: thanks.
egrunin
A: 

When you define a variable like char arr[4], it reserves exactly 4 bytes for that variable. As you've found, writing beyond that point causes what the standard calls "undefined behavior" -- a euphemism for "you screwed up -- don't do that."

The memory management of something like this is pretty simple: if it's a global, it gets allocated in a global (static) memory area that exists for the whole run of the program. If it's a local, it gets allocated on the stack by subtracting an appropriate amount from the stack pointer. When the function returns, the stack pointer is restored, so the locals cease to exist (and when you call another function, their memory will normally get overwritten by parameters and locals for that function).

Jerry Coffin
+1  A: 

What exactly happens, in terms of memory, when I declare something like: char arr[4];

4 * sizeof(char) bytes of stack memory are reserved for the array (assuming it is a local variable).

How is null string accommodated when I 'strcpy' a string of length 4 in arr?

It cannot be. The array can hold at most 3 characters of text; the 4th element (i.e. arr[3]) has to be the '\0' character for it to be a proper string.

when I tried to suffix NULL at arr[4]

The behavior is undefined because you are accessing an invalid memory location. In the best case, your program will crash immediately, but it might also corrupt the stack and crash at some later point.

Naveen
A: 

When you make a declaration like char arr[4];, the compiler allocates as many bytes as you asked for, namely four. The compiler might allocate extra in order to accommodate efficient memory accesses, but as a rule you get exactly what you asked for.

If you then declare another variable in the same function, it will often be placed next to arr in memory, although the compiler is free to reorder or pad locals. For that reason, if you write more characters to arr than were allocated for it, you can overwrite other variables on the stack.

This is not really specific to gcc. All C compilers work essentially the same way.

JSBangs