I know I can get the assembler source code generated by the compiler by using:

gcc -S ...

even though that annoyingly doesn't give me an object file as part of the process.

But how can I get everything about the compiled code? I mean addresses, the bytes generated and so forth.

The instructions output by gcc -S do not tell me anything about instruction lengths or encodings, which is what I want to see.

A: 

It sounds to me like you want a disassembler. objdump is pretty much the standard (otool on Mac OS X); in concert with whatever map file information your linker gives you, the disassembly of your object file should give you everything you want.

Carl Norum
A: 

gcc -S will produce an assembly language source file. You can then run as -a yourfile.s to produce a listing that includes offsets and encoded bytes for each instruction. -a also has sub-options that control what shows up in the listing (as --help lists them along with the other available options).

Jerry Coffin
A: 
nasm -f elf xx.asm -l x.lst

gcc xx.c xx.o -o xx

The first command generates a listing file, x.lst, but it covers only xx.asm.

To inspect xx.c together with xx.asm, build them both as above and then examine the result in gdb, the GNU debugger.

john
+1  A: 

I like objdump for this, but the most useful options are non-obvious - especially if you're using it on an object file which contains relocations, rather than a final binary.

objdump -d some_binary does a reasonable job.

objdump -d some_object.o is less useful because calls to external functions don't get disassembled helpfully:

...
00000005 <foo>:
   5:   55                      push   %ebp
   6:   89 e5                   mov    %esp,%ebp
   8:   53                      push   %ebx
...
  29:   c7 04 24 00 00 00 00    movl   $0x0,(%esp)
  30:   e8 fc ff ff ff          call   31 <foo+0x2c>
  35:   89 d8                   mov    %ebx,%eax
...

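The call is actually to printf(), but the displacement bytes (fc ff ff ff) are only a placeholder that the linker fills in later, so the target objdump prints is meaningless. Adding the -r flag helps with that; it marks relocations. objdump -dr some_object.o gives: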

...
  29:   c7 04 24 00 00 00 00    movl   $0x0,(%esp)
                        2c: R_386_32    .rodata.str1.1
  30:   e8 fc ff ff ff          call   31 <foo+0x2c>
                        31: R_386_PC32  printf
...

Then, I find it useful to see each line annotated as <symbol+offset>. objdump has a handy option for that, but it has the annoying side effect of turning off the dump of the actual bytes - objdump --prefix-addresses -dr some_object.o gives:

...
00000005 <foo> push   %ebp
00000006 <foo+0x1> mov    %esp,%ebp
00000008 <foo+0x3> push   %ebx
...

But it turns out that you can undo that by providing another obscure option, finally arriving at my favourite objdump incantation:

objdump --prefix-addresses --show-raw-insn -dr file.o

which gives output like this:

...
00000005 <foo> 55                       push   %ebp
00000006 <foo+0x1> 89 e5                        mov    %esp,%ebp
00000008 <foo+0x3> 53                           push   %ebx
...
00000029 <foo+0x24> c7 04 24 00 00 00 00        movl   $0x0,(%esp)
                        2c: R_386_32    .rodata.str1.1
00000030 <foo+0x2b> e8 fc ff ff ff              call   00000031 <foo+0x2c>
                        31: R_386_PC32  printf
00000035 <foo+0x30> 89 d8                       mov    %ebx,%eax
...

And if you've built with debugging symbols (i.e. compiled with -g) and you replace the -dr with -Srl, objdump will attempt to annotate the output with the corresponding source lines (-S interleaves the source text and implies -d; -l adds file names and line numbers).

Matthew Slattery
+1  A: 

The easiest way to get a quick listing is to use the -a option to the assembler, which you can do by putting -Wa,-a on the gcc command line. You can use various modifiers to the a option to affect exactly what comes out -- see the as(1) man page.

Chris Dodd