In C++, local variables are always allocated on the stack. The stack is part of the memory your application is allowed to occupy, and that memory lives in RAM (unless it gets swapped out to disk). Now, does a C++ compiler always generate assembly code that stores local variables on the stack?

Take, for example, the following simple code:

int foo( int n ) {
   return ++n;
}

In MIPS assembly, this could look like this:

foo:
addi $v0, $a0, 1
jr $ra

As you can see, I didn't need to use the stack at all for n. Would the C++ compiler recognize that, and directly use the CPU's registers?

Edit: Wow, thanks a lot for your almost immediate and extensive answers! The function body of foo should of course be return ++n;, not return n++;. :)

+9  A: 

Yes. There is no rule that "variables are always allocated on the stack". The C++ standard says nothing about a stack. It doesn't assume that a stack exists, or that registers exist. It just says how the code should behave, not how it should be implemented.

The compiler only stores variables on the stack when it has to: for example, when they have to live across a function call, or when you take their address.
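
To illustrate (a minimal sketch of my own, assuming a typical x86-64 compiler at -O2; print_ptr is a hypothetical helper, not from the answer):

#include <cstdio>

// Hypothetical helper used only to make the address escape.
static void print_ptr(const int* p) { std::printf("%d\n", *p); }

int sum_to(int n) {
    int total = 0;              // no address taken, short lifetime: typically kept in a register
    for (int i = 0; i < n; ++i)
        total += i;
    return total;
}

void needs_memory() {
    int x = 42;                 // its address escapes into print_ptr, so x usually gets a stack slot
    print_ptr(&x);              // (unless the call is inlined and the store optimized away)
}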

The compiler isn't stupid. ;)

jalf
+4  A: 

Yes, a good optimizing C/C++ compiler will optimize that, and much more. See here: Felix von Leitner's Compiler Survey.

A normal C/C++ compiler will not put every variable on the stack anyway. The catch with your foo() function is that the argument may be passed to the function via the stack; the ABI of your system (hardware/OS) defines that.

With C's register keyword you can give the compiler a hint that it would probably be good to store a variable in a register. Sample:

register int x = 10;

But remember: The compiler is free not to store x in a register if it wants to!
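
As an illustrative sketch of my own (not from the answer; note that later C++ standards deprecated and then removed register, so treat this as C or older C++), the keyword is only a hint and an optimizing compiler will usually make the same choice with or without it:

int dot(const int* a, const int* b, int n) {
    register int acc = 0;          // hint only; the compiler decides
    for (int i = 0; i < n; ++i)    // i is usually kept in a register regardless
        acc += a[i] * b[i];
    return acc;
}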

Johannes Weiß
+1 for the great link
fnieto
+6  A: 

Disclaimer: I don't know MIPS, but I do know some x86, and I think the principle should be the same.

With the usual x86 calling convention, the compiler will push the value of n onto the stack to pass it to foo. However, there is the fastcall convention, which you can use to tell gcc to pass the value in registers instead. (MSVC also has this option, but I'm not sure what its syntax is.)

test.cpp:

int foo1 (int n) { return ++n; }
int foo2 (int n) __attribute__((fastcall));
int foo2 (int n) {
    return ++n;
}

Compiling the above with g++ -O3 -fomit-frame-pointer -c test.cpp, I get for foo1:

mov eax,DWORD PTR [esp+0x4]
add eax,0x1
ret

As you can see, it reads in the value from the stack.

And here's foo2:

lea eax,[ecx+0x1]
ret

Now it takes the value directly from the register.

Of course, if you inline the function the compiler will do a simple addition in the body of your larger function, regardless of the calling convention you specify. But when you can't get it inlined, this is going to happen.
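
As a sketch of my own for that inlining case (g++ -O2 assumed; bar is a hypothetical caller):

// With the definition visible, the call to foo is typically inlined, so no
// argument is passed on the stack at all; bar usually compiles down to
// roughly 2*x + 2 with no call instruction.
static int foo(int n) { return ++n; }

int bar(int x) {
    return foo(x) + foo(x);
}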

Disclaimer 2: I am not saying that you should continually second-guess the compiler. It probably isn't practical or necessary in most cases. But don't assume it produces perfect code.

Edit 1: If you are talking about plain local variables (not function arguments), then yes, the compiler will allocate them in the registers or on the stack as it sees fit.

Edit 2: It appears that the calling convention is architecture-specific, and MIPS passes the first four arguments in registers, as Richard Pennington has stated in his answer. So in your case you don't have to specify the extra attribute (which is in fact an x86-specific attribute).

int3
-O disables stack frame setup on machines where it doesn't interfere with debugging. x86 isn't one of them; you need a separate -fomit-frame-pointer to eliminate the 'redundant' stack frame setup (which is actually useful for debugging, i.e. in stack frame unwinding)
matja
yeah, I totally forgot about that. I'll fix it. But the difference remains.
int3
A compiler that does link time optimizations can also recognize that a call can be turned into a fast call all on its own because it can see and fix all the call sites.
Richard Pennington
Which "difference remains"? The one matja pointed out was the *only* ineffiicency, wasn't it? And that was caused by a missing optimization flag. -1 for saying you can't assume the compiler will store intermediates in registers. You're right with more complex optimizations, of course, but for that one?
jalf
The difference would be that the compiler would do the equivalent of a fast call without the programmer having to use the non-standard __attribute__.
Richard Pennington
@jalf: did you read the updated post? I added the optimization flag quite a while before you posted that comment. There's obviously still one more instruction in the non-fastcall version. @Richard: I linked it together and it still produced a non-fast-call.
int3
@Richard: admittedly my compiler doesn't have the new gcc link-time optimizer; I wonder if that will make a difference. But under more 'common' compilation options, there's definitely a difference.
int3
Yes, there is one instruction more. That is because it has to use the same calling convention as the caller; otherwise actually *calling* the function is impossible. That extra instruction is obviously removed if the function is inlined (which it typically will be). But honestly, I think you've gotten way sidetracked. Read the actual question. It was not "does my compiler ignore calling conventions in order to generate optimal code"; it was "is my compiler able to store local variables in registers". And a function parameter isn't exactly a local variable, which is why it's not in a register.
jalf
Non-local variables, variables that have to be accessible to other functions, have to follow additional constraints, such as "following the right calling convention". That is not "inefficiency", and it is not a lack of optimization. It is creating a function that *works*.
jalf
@jalf: I'm not sure what you mean by 'creating a function that works'; surely a `fastcall` function works as well, so long as we tell the caller what convention it uses? But yes, now that you've pointed it out, I realize that the OP's question in his examples differed from the question he stated in words, and I should have addressed both issues.
int3
@jalf: I think we have been arguing over a different idea of the question at hand. Sorry for the confusion.
int3
+5  A: 

The answer is yes, maybe. It depends on the compiler, the optimization level, and the target processor.

In the case of MIPS, the first four parameters, if small, are passed in registers, and the return value is returned in a register. So your example has no need to allocate anything on the stack.

Actually, truth is stranger than fiction. In your case (with the original return n++;) the parameter is returned unchanged: the value returned is that of n before the ++ operator:

foo:
    .frame  $sp,0,$ra
    .mask   0x00000000,0
    .fmask  0x00000000,0

    addu    $2, $zero, $4
    jr      $ra
    nop
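
For clarity, a small sketch of my own showing the difference the question's edit is about: return n++; hands back the old value (matching the assembly above), while return ++n; hands back the incremented one.

#include <cassert>

int foo_post(int n) { return n++; }  // post-increment: returns the old n; the increment of the local copy is dead
int foo_pre (int n) { return ++n; }  // pre-increment: returns n + 1

int main() {
    assert(foo_post(5) == 5);
    assert(foo_pre(5) == 6);
    return 0;
}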
Richard Pennington
+1  A: 

Since your example foo function was effectively an identity function (with return n++; it just returns its argument), my C++ compiler (VS 2008) completely removes the function call. If I change it to:

int foo( int n ) {
   return ++n;
}

the compiler inlines this with

lea edx, [eax+1]
Andreas Brinck
Yes, again with a MIPS example:

static int foo( int n ) { return n++; }
int fee() { return foo(5); }

gives:

    .text
    .align  2
    .globl  fee
    .ent    fee
fee:
    .frame  $sp,0,$ra
    .mask   0x00000000,0
    .fmask  0x00000000,0
    addiu   $2, $zero, 5
    jr      $ra
    nop
    .set    macro
    .set    reorder
    .end    fee
    .size   fee, .-fee
Richard Pennington