I've been experimenting to see how stack memory is organized behind the scenes, and what I observed matches the little I've picked up from books. I just want to check that my understanding is correct.

I have a basic program with two functions: foo and main (the entry point).

void foo(){
    // do something here or dont
}

int main(){

    int i = 0;

    printf("%p %p %p\n",foo, &i, main);

    system("PAUSE");
    return EXIT_SUCCESS;
};

The output of the program is shown below. main's local variable i is located at a completely unrelated address. An integer is a value type, so I checked again with a char * pointer local to main and obtained similar results.

00401390 0022FF44 00401396
Press any key to continue . . .

My understanding is that code and variables are allocated in different segments of memory (code segment/data segment). So is it right to say that the call stack collects the basic information about executing functions (their local variables, parameters, return addresses) and keeps it in the data segment?

+1  A: 

Yes, that's exactly right. Code and data live in different parts of memory, with different permissions. The stack holds parameters, return addresses and local ("automatic") variables, and lives with the data.

RichieHindle
So the data segment we are always talking about is only interested in the runtime stack. It's been a while since I've messed with the unmanaged world; thanks for clarifying.
Burcu Dogan
A: 

Yes.

Imagine that your code memory is ROM and your data memory is RAM (a common small-chip architecture). Then you can see that the stack must be in data memory.

Paul Nathan
This seems somewhat weird: going back and forth between different segments to obtain the return address for a procedure. Is this the most performance-efficient architecture?
Burcu Dogan
Going back and forth? The return address is on the stack, as well as all code execution. The text segment is never read directly for execution except at the very initial instruction which points to main.
Zombies
+3  A: 

A little caveat at the start: all of these answers are somewhat affected by the operating system and hardware architecture. Windows does things fairly radically differently from UNIX-like systems, real-time operating systems, and old small-system UNIX.

But the basic answer as @Richie and @Paul have said, is "yes." When your compiler and linker get through with the code, it's broken up into what are known as "text" and "data" segments in UNIX. A text segment contains instructions and some kinds of static data; a data segment contains, well, data.

A big chunk of the data segment is then allocated for stack and heap space. Other chunks can be allocated to things like static or extern data structures.

So yes, when the program runs, the program counter is busily fetching instructions from a different segment than the data. Now we get into some architecture dependencies, but in general, if you have segmented memory, the instructions are designed so that fetching a byte from a segment is as efficient as possible. The old 360 architecture had base registers; x86 has a bunch of hair that grew as the address space expanded from the old 8080 to modern processors. In every case these operations are very carefully optimized because, as you can imagine, fetching instructions and their operands is done very intensively.

Now we get to more modern architectures with virtual memory and memory management units. Now the machine has specific hardware that lets the program treat the address space as a big flat range of addresses; the various segments simply get placed in that big virtual address space. The MMU's job is to take a virtual address and translate it to a physical address, including what to do if that virtual address doesn't happen to be in physical memory at all at the moment. Again, the MMU hardware is very heavily optimized, but that doesn't mean there is no performance cost associated. But as processors have gotten faster and programs have gotten bigger, that cost has become less and less important.

Charlie Martin
+1  A: 

Your program exhibits undefined behavior, specifically because:

  • you fail to include <stdio.h> or <cstdio>, depending on the language you are compiling your code as
  • printf and all variable-argument functions do not have the capacity to type-check their arguments. Hence it is obligatory on your part to pass correctly typed arguments: %p expects a void *, so cast each pointer explicitly.
  • system() has no declaration in scope. Include <stdlib.h> or <cstdlib>, as the case may be.

Write your code as:

   #include <stdio.h>
   #include <stdlib.h>

   int main() {
      /* ... */
      printf("%p %p %p\n", (void *)foo, (void *)&i, (void *)main);
      /* ... */
   }

Also note that:

  • The definition void foo() does not provide a prototype in C, although it does in C++. However, if you were to write void foo(void) you'd get a prototype in both languages.
  • system() is implementation dependent -- your code may not behave as expected across platforms.

The language proper (C or C++) does not put any restrictions on how to organize memory. It does not even have the concept of a stack or a heap. These are defined by implementations as they deem fit. You should ideally consult the documentation provided by your implementation to get a fair idea of what it does.

dirkgently
They are added on the real implementation, your answer is totally off-topic.
Burcu Dogan
Which part? Do you realize that it makes no sense to discuss code that is broken?
dirkgently
@Dirk: What prototype of `foo`? `foo` is not only declared in the code above, it's *defined*. No need for a prototype. As for casting the pointers to `void*` explicitly – that's completely unnecessary here (although arguably more readable).
Konrad Rudolph
@Konrad: I think I've explained my points rather clearly. The definition of foo does not provide a prototype, i.e. a complete declaration in scope. Note the empty parameter list. Also, the casts are required since varargs functions do not do (except for gcc, in some cases) type checking of parameters and you are expected to pass in correctly typed arguments. The point of my post is: there is no one correct answer to what the OP is asking as far as the language is concerned. If the OP provided more details about his/her implementation, a proper answer could follow.
dirkgently
A: 

Well, I can speak for SPARC:

Yes. When you run the program, the program is read twice (at least in SPARC). The program gets loaded into memory, and any array/stack allocations get loaded afterwards. In the second pass through the program, the stacks get allocated into separate memory.

I am not sure about CISC-based processors, but I suspect it doesn't vary too much.

Andrew Sledge