 $ gcc -O2 -S test.c -----------------------(1)
      .file "test.c"
    .globl accum
       .bss
       .align 4
       .type accum, @object
       .size accum, 4
    accum:
       .zero 4
       .text
       .p2align 2,,3
    .globl sum
       .type sum, @function
    sum:
       pushl %ebp
       movl  %esp, %ebp
       movl  12(%ebp), %eax
       addl  8(%ebp), %eax
       addl  %eax, accum
       leave
       ret
       .size sum, .-sum
       .p2align 2,,3
    .globl main
       .type main, @function
    main:
       pushl %ebp
       movl  %esp, %ebp
       subl  $8, %esp
       andl  $-16, %esp
       subl  $16, %esp
       pushl $11
       pushl $10
       call  sum
       xorl  %eax, %eax
       leave
       ret
       .size main, .-main
       .section .note.GNU-stack,"",@progbits
       .ident   "GCC: (GNU) 3.4.6 20060404 (Red Hat 3.4.6-9)"

This is an assembly code generated from this C program:

#include <stdio.h>
int accum = 0;

int sum(int x,int y)
{
   int t = x+y;
   accum +=t;
   return t;
}

int main(int argc,char *argv[])
{
   int i = 0,x=10,y=11;
   i = sum(x,y);
   return 0;
}

Also, this is the object code generated from the above program:

$objdump -d test.o -------------------------(2) 

test.o:     file format elf32-i386

Disassembly of section .text:

00000000 <sum>:
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   8b 45 0c                mov    0xc(%ebp),%eax
   6:   03 45 08                add    0x8(%ebp),%eax
   9:   01 05 00 00 00 00       add    %eax,0x0
   f:   c9                      leave
  10:   c3                      ret
  11:   8d 76 00                lea    0x0(%esi),%esi

00000014 <main>:
  14:   55                      push   %ebp
  15:   89 e5                   mov    %esp,%ebp
  17:   83 ec 08                sub    $0x8,%esp
  1a:   83 e4 f0                and    $0xfffffff0,%esp
  1d:   83 ec 10                sub    $0x10,%esp
  20:   6a 0b                   push   $0xb
  22:   6a 0a                   push   $0xa
  24:   e8 fc ff ff ff          call   25 <main+0x11>
  29:   31 c0                   xor    %eax,%eax
  2b:   c9                      leave
  2c:   c3                      ret

Ideally, listings (1) and (2) should be the same. But I see movl, pushl, etc. in listing (1), whereas listing (2) has mov and push. My questions are:

  1. Which is the correct assembly instruction actually executed on the processor?
  2. In listing (1), I see this in the beginning:

.file "test.c"
    .globl accum
       .bss
       .align 4
       .type accum, @object
       .size accum, 4
    accum:
       .zero 4
       .text
       .p2align 2,,3
    .globl sum
       .type sum, @function 

and this at end:

.size main, .-main
           .section .note.GNU-stack,"",@progbits
           .ident   "GCC: (GNU) 3.4.6 20060404 (Red Hat 3.4.6-9)"

What does this mean?

Thanks.

+4  A: 

The instruction is called MOV whatever variant is being used. The l suffix is just a gcc / AT&T assembly convention to specify the size of operands desired, in this case 4 byte operands.

In Intel syntax, where there is any ambiguity, it is usual to tag the memory operand with an indicator of the size required (e.g. BYTE, WORD, DWORD) instead of suffixing the instruction. It's just another way of achieving the same thing.
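The correspondence between the two conventions can be sketched as a small lookup table. This is only an illustration: the struct and function names are made up for this example, and only the common integer sizes are listed.

```c
#include <stddef.h>
#include <string.h>

/* One row per operand size: the AT&T mnemonic suffix, the Intel-syntax
 * size keyword, and the operand width in bytes.
 * (Hypothetical table for illustration; integer sizes only.) */
struct operand_size {
    char att_suffix;
    const char *intel_keyword;
    unsigned bytes;
};

static const struct operand_size OPERAND_SIZES[] = {
    { 'b', "BYTE",  1 },
    { 'w', "WORD",  2 },
    { 'l', "DWORD", 4 },   /* movl, pushl, addl in listing (1) */
    { 'q', "QWORD", 8 },
};

#define OPERAND_SIZE_COUNT (sizeof OPERAND_SIZES / sizeof OPERAND_SIZES[0])

/* Operand width in bytes for an AT&T suffix, or 0 if the suffix is unknown. */
unsigned att_suffix_size(char suffix)
{
    for (size_t i = 0; i < OPERAND_SIZE_COUNT; i++) {
        if (OPERAND_SIZES[i].att_suffix == suffix)
            return OPERAND_SIZES[i].bytes;
    }
    return 0;
}

/* Operand width in bytes for an Intel size keyword, or 0 if unknown. */
unsigned intel_keyword_size(const char *keyword)
{
    for (size_t i = 0; i < OPERAND_SIZE_COUNT; i++) {
        if (strcmp(OPERAND_SIZES[i].intel_keyword, keyword) == 0)
            return OPERAND_SIZES[i].bytes;
    }
    return 0;
}
```

Both att_suffix_size('l') and intel_keyword_size("DWORD") give 4, which is why movl in listing (1) and a plain mov between two 32-bit registers in listing (2) describe the same operation.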

89 e5 is the correct sequence of bytes for MOV from the 32-bit register ESP to the 32-bit register EBP (in AT&T syntax the source operand comes first). There is nothing wrong in either listing.
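To see why the bytes 89 e5 in listing (2) mean "copy ESP into EBP", here is a rough sketch that splits the second byte (the ModRM byte) into its fields. The function and type names are invented for this example, and it only covers the register-to-register form of opcode 0x89, nothing like a full decoder.

```c
#include <stdint.h>

/* 32-bit register names, indexed by the 3-bit register number
 * that appears in a ModRM byte. */
static const char *const REG32[8] = {
    "eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"
};

/* Fields of a ModRM byte: mod (bits 7-6), reg (bits 5-3), rm (bits 2-0). */
struct modrm {
    unsigned mod;
    unsigned reg;
    unsigned rm;
};

struct modrm decode_modrm(uint8_t byte)
{
    struct modrm m;
    m.mod = (byte >> 6) & 0x3;
    m.reg = (byte >> 3) & 0x7;
    m.rm  = byte & 0x7;
    return m;
}

/* For opcode 0x89 (MOV r/m32, r32) with mod == 3 (register direct),
 * the "reg" field names the source and the "rm" field the destination. */
const char *mov89_source(uint8_t modrm_byte)
{
    return REG32[decode_modrm(modrm_byte).reg];
}

const char *mov89_dest(uint8_t modrm_byte)
{
    return REG32[decode_modrm(modrm_byte).rm];
}
```

For the byte 0xe5 this gives mod = 3, reg = 4 (esp) and rm = 5 (ebp), which matches mov %esp,%ebp in the disassembly and movl %esp, %ebp in the compiler output.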


Specifies the file that this assembly code was generated from:

.file "test.c"

Says that accum is a global symbol (C variable with external linkage):

    .globl accum

The following bytes should be placed in a bss section. This is a section that takes no space in the object file but is allocated and zeroed at runtime.

       .bss

Aligned on a 4 byte boundary:

       .align 4

It's an object (a variable, not some code):

       .type accum, @object

It's four bytes:

       .size accum, 4

Here is where accum is defined, four zero bytes.

    accum:
       .zero 4

Now switch from the bss section to the text section which is where functions are usually stored.

       .text

Add up to three bytes of padding to make sure we are on a 4 byte (2^2) boundary:

       .p2align 2,,3
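The arithmetic behind that directive can be sketched in a few lines. The helper name below is made up for this example; it models the documented behaviour of the third argument, where the alignment is skipped entirely if it would need more padding than the stated maximum.

```c
#include <assert.h>
#include <stdint.h>

/* Number of padding bytes that `.p2align power,,max_skip` would insert
 * at `offset` (hypothetical helper for illustration).
 * Returns 0 when the offset is already aligned, and also when the
 * padding needed exceeds max_skip, in which case no alignment is done. */
uint32_t p2align_padding(uint32_t offset, unsigned power, uint32_t max_skip)
{
    assert(power < 32);
    uint32_t alignment = (uint32_t)1 << power;
    uint32_t mask = alignment - 1;
    /* Distance from offset up to the next multiple of alignment. */
    uint32_t needed = (alignment - (offset & mask)) & mask;
    return needed <= max_skip ? needed : 0;
}
```

In listing (2), sum ends at offset 0x11, so p2align_padding(0x11, 2, 3) is 3. Those are the three bytes 8d 76 00 that appear before main starts at 0x14.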

sum is a global symbol and it's a function.

    .globl sum
       .type sum, @function 

The size of main is "here" - "where main started":

.size main, .-main

This is where gcc-specific stack options are specified. Usually, this is where you choose to have an executable stack (not very safe) or not (usually preferred).

       .section .note.GNU-stack,"",@progbits

Identify which version of the compiler generated this assembly:

       .ident   "GCC: (GNU) 3.4.6 20060404 (Red Hat 3.4.6-9)"
Charles Bailey
A: 

The assembler listing and the disassembler listing show the same code, but use a different syntax. The appended l suffix is the syntax variant used by gcc. That the tools (C compiler output and disassembler) show a different syntax is a weakness of your toolchain.

The disassembly at offset 11 in sum shows just some filler bytes, not real code. The entry point of the next function, main, is 4-byte aligned, which leaves this gap. Here it is filled with lea 0x0(%esi),%esi, an instruction that has no effect.

The lines beginning with a dot are assembler directives, which are defined in the assembler's documentation. Usually they don't generate any executable code.

harper