Is there any substantial optimization when omitting the frame pointer? If I have understood correctly by reading this page, -fomit-frame-pointer is used when we want to avoid saving, setting up and restoring frame pointers.

Is this done for each function call, and if so, is it really worth avoiding a few instructions per function? Isn't that trivial as an optimization? What are the actual implications of using this option, apart from the debugging limitations?

I compiled the following C code with and without this option:

int myf(int a, int b);  /* declare before use */

int main(void)
{
        int i;

        i = myf(1, 2);
}

int myf(int a, int b)
{
        return a + b;
}

The commands were:

# gcc -S -fomit-frame-pointer code.c -o withoutfp.s
# gcc -S code.c -o withfp.s


Running diff -u on the two files revealed the following differences in the assembly:


--- withfp.s    2009-12-22 00:03:59.000000000 +0000
+++ withoutfp.s 2009-12-22 00:04:17.000000000 +0000
@@ -7,17 +7,14 @@
        leal    4(%esp), %ecx
        andl    $-16, %esp
        pushl   -4(%ecx)
-       pushl   %ebp
-       movl    %esp, %ebp
        pushl   %ecx
-       subl    $36, %esp
+       subl    $24, %esp
        movl    $2, 4(%esp)
        movl    $1, (%esp)
        call    myf
-       movl    %eax, -8(%ebp)
-       addl    $36, %esp
+       movl    %eax, 20(%esp)
+       addl    $24, %esp
        popl    %ecx
-       popl    %ebp
        leal    -4(%ecx), %esp
        ret
        .size   main, .-main
@@ -25,11 +22,8 @@
 .globl myf
        .type   myf, @function
 myf:
-       pushl   %ebp
-       movl    %esp, %ebp
-       movl    12(%ebp), %eax
-       addl    8(%ebp), %eax
-       popl    %ebp
+       movl    8(%esp), %eax
+       addl    4(%esp), %eax
        ret
        .size   myf, .-myf
        .ident  "GCC: (GNU) 4.2.1 20070719 

Could someone please shed light on the key points in the above code where -fomit-frame-pointer actually made a difference?

Edit: objdump's output replaced with gcc -S's

+7  A: 

-fomit-frame-pointer allows one extra register to be available for general-purpose use. I would assume this is really only a big deal on 32-bit x86, which is a bit starved for registers.*

One would expect to see EBP no longer saved and adjusted on every function call, and probably some additional use of EBP in normal code, and fewer stack operations on occasions where EBP gets used as a general-purpose register.

Your code is far too simple to show any benefit from this sort of optimization: you're not using enough registers. Also, you haven't turned on the optimizer, which might be necessary to see some of these effects.

* ISA registers, not micro-architecture registers.
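
To make the register-pressure point concrete, here is a minimal sketch of a function that is more likely to show a difference than the one in the question. The function name and the workload are made up for illustration. The loop keeps nine values live at once (three pointers, the count, the index and four accumulators), while 32-bit x86 has six general-purpose registers besides %esp when %ebp is reserved as the frame pointer and seven when it is not. Whether a given GCC version actually spills less here is something to check in the generated assembly, not something this sketch guarantees.

```c
/* Hypothetical example: many values stay live across the loop, so a
   register-starved target may have to keep some of them on the stack. */
long mix(const int *a, const int *b, const int *c, int n)
{
    long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    int i;

    for (i = 0; i < n; i++) {
        s0 += a[i] * b[i];
        s1 += b[i] * c[i];
        s2 += a[i] * c[i];
        s3 += a[i] + b[i] + c[i];
    }
    return s0 + s1 + s2 + s3;
}
```

One way to compare, assuming a GCC that supports -m32 on your machine, is to compile to assembly twice with the optimizer on, once with -fno-omit-frame-pointer and once with -fomit-frame-pointer, and diff the two .s files. Passing both flags explicitly matters because newer GCC versions may already omit the frame pointer when optimizing.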

Eric Seppanen
If I have to set explicitly other optimization options, what's the meaning of this option being separate? Your point that my code is simple seems valid though!
Petros
This option is separate because it has significant downsides for debugging.
Anon.
It's separate because it has functional implications for other things, like running your code in a debugger, or linking with other code. I assume you'd see a reduction in register spills even with the optimizer turned off, but since I don't know for sure I'm hedging my bets.
Eric Seppanen
+2  A: 

The only downside of omitting it is that debugging is much more difficult.

The major upside is that there is one extra general-purpose register, which can make a big difference in performance. Obviously this extra register is used only when needed (probably not in your very simple function); in some functions it makes more difference than in others.
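
As for why debugging gets harder: when every function saves the caller's frame pointer, the saved values form a linked list that a debugger or backtrace routine can follow from frame to frame. The sketch below is only a toy model of that chain, written with ordinary C structs. It does not touch the real stack, and the struct and function names are invented for illustration.

```c
#include <stddef.h>

/* Toy model of a stack frame when frame pointers are kept: each frame
   stores the caller's frame pointer next to the return address. */
struct frame {
    const struct frame *saved_fp;  /* caller's frame, NULL at the outermost frame */
    const char *return_site;       /* stand-in for the return address */
};

/* Count frames by following the saved_fp links, the way a simple
   frame-pointer-based backtrace would. */
size_t backtrace_depth(const struct frame *fp)
{
    size_t depth = 0;

    while (fp != NULL) {
        depth++;
        fp = fp->saved_fp;
    }
    return depth;
}
```

With the frame pointer omitted there is no such chain on the stack, so a debugger has to rely on separate unwind information (for example DWARF call frame information) to find where each frame begins.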

Andreas Bonini
It doesn't just make debugging much more difficult. The GNU docs online say that it makes debugging impossible.
Petros
They are wrong. `printf()` debugging (which **IS** still debugging) is very possible, for example.
Andreas Bonini
You can still debug at the instruction (assembly language) level regardless of any compiler options used. Not as easy as source level debugging to be sure, but "impossible" is definitely the wrong word.
Ben Voigt
+1  A: 

Profile your program to see if there is a significant difference.

Next, profile your development process. Is debugging easier or more difficult? Do you spend more time developing or less?

Optimizations without profiling are a waste of time and money.
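
A real profiler will tell you more, but as a rough first check you can time the hot code yourself. This is a minimal sketch using the standard clock() function; workload() is a placeholder invented here, so substitute the code you actually care about.

```c
#include <time.h>

/* Hypothetical workload; replace with the code being measured. */
unsigned long workload(unsigned long n)
{
    unsigned long acc = 0, i;

    for (i = 0; i < n; i++)
        acc += i * i % 7;
    return acc;
}

/* CPU seconds spent in one call of workload(n), measured with clock().
   The result is stored through *result so the call cannot be dropped. */
double time_workload(unsigned long n, unsigned long *result)
{
    clock_t start = clock();

    *result = workload(n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Build it twice, once with -fomit-frame-pointer and once with -fno-omit-frame-pointer, keep every other flag the same, and compare the times over several runs with a large n.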

Thomas Matthews
+1  A: 

You can often get more meaningful assembly code from GCC by using the -S argument to output the assembly:

$ gcc code.c -S -o withfp.s
$ gcc code.c -S -o withoutfp.s -fomit-frame-pointer
$ diff -u withfp.s withoutfp.s

The -S output contains no addresses, so we can compare the generated instructions directly. For your leaf function, this gives:

 myf:
-       pushl   %ebp
-       movl    %esp, %ebp
-       movl    12(%ebp), %eax
-       addl    8(%ebp), %eax
-       popl    %ebp
+       movl    8(%esp), %eax
+       addl    4(%esp), %eax
        ret

With -fomit-frame-pointer, GCC doesn't generate the code to push the frame pointer onto the stack. The arguments are then addressed relative to %esp instead of %ebp, and each offset is 4 bytes smaller because the saved %ebp slot is gone.
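
The offset arithmetic can be written out as a small sketch. It assumes 32-bit x86 with the usual C calling convention shown in the diff (4-byte stack slots, return address pushed by call), and the helper names are made up for illustration.

```c
/* Assumes 32-bit x86: every stack slot is 4 bytes. */
enum { SLOT = 4 };

/* Offset of argument `index` (0-based) from %ebp after the usual
   "pushl %ebp; movl %esp, %ebp" prologue. */
int arg_offset_from_ebp(int index)
{
    /* skip the saved %ebp and the return address */
    return 2 * SLOT + index * SLOT;
}

/* Offset of argument `index` from %esp at function entry when no
   frame pointer is pushed. */
int arg_offset_from_esp(int index)
{
    /* skip only the return address */
    return SLOT + index * SLOT;
}
```

For myf(a, b) this gives 8 and 12 relative to %ebp, and 4 and 8 relative to %esp, which are the offsets in the two versions of the listing.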

Commodore Jaeger