I'm trying to do a relative jump in x86 assembly, but I cannot get it to work. For some reason my jump keeps getting rewritten as an absolute jump or something similar.

A simple example program for what I'm trying to do is this:

.global main

main:
    jmp 0x4
    ret

Since the jmp instruction is 4 bytes long and a relative jump is offset from the address of the jump + 1, this should be a fancy no-op. However, compiling and running this code will cause a segmentation fault.

The real puzzler for me is that assembling it to an object file and then disassembling that object file shows the assembler apparently doing a relative jump correctly, but once the file is linked, the linker has changed it into another type of jump.

For example if the above code was in a file called asmtest.s:

$gcc -c asmtest.s
$objdump -D asmtest.o

... Some info from objdump
00000000 <main>:
   0:    e9 00 00 00 00           jmp    5 <main+0x5>
   5:    c3                       ret   

This looks like the assembler correctly made a relative jump, although it's suspicious that the jmp instruction is filled with 0s.

I then used gcc to link it then disassembled it and got this:

$gcc -o asmtest asmtest.o
$objdump -d asmtest

...Extra info and other disassembled functions
08048394 <main>:
 8048394:        e9 6b 7c fb f7      jmp   4 <_init-0x8048274>
 8048399:        c3                  ret

This to me looks like the linker rewrote the jmp statement, or substituted the 5 in for another address.

So my question comes down to, what am I doing wrong?

Am I specifying the offset incorrectly? Am I misunderstanding how relative jumps work? Is gcc trying to make sure I don't do dangerous things in my code?

+2  A: 

Actually, the assembler thought that you were trying to do an absolute jump. However, the jmp opcode is, at the metal level, relative. Hence, the assembler could not know what to write after the 0xe9 byte, because the assembler does not know at which address your code will end up.

The assembler does not know, but the linker does. So the assembler wrote a request for the linker (a relocation entry) into asmtest.o, which goes something like this: "when you know at which address the code will get loaded, adjust the four bytes just after the 0xe9 so that they are appropriate for a jump from that point (with relative addressing) to the absolute address '4'". The linker did just that. It saw that the 0xe9 was at address 0x08048394 and the next opcode at 0x08048399, and it computed: to go from 0x08048399 to 0x00000004, one has to subtract 0x08048395, which is equivalent (on 32-bit machines) to adding 0xf7fb7c6b. Hence the little-endian "6b 7c fb f7" sequence in the resulting binary.
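To make that arithmetic concrete, here is a minimal C sketch of the computation. The function name rel32_for is made up for illustration and is not part of any linker API; the addresses are the ones from the question's disassembly.

```c
#include <stdint.h>

/* Value stored after the 0xe9 opcode: the target minus the address of the
 * NEXT instruction. The subtraction wraps modulo 2^32, which is how a
 * negative distance turns into a large unsigned value. */
uint32_t rel32_for(uint32_t jmp_addr, uint32_t target)
{
    /* 1 opcode byte + 4 displacement bytes */
    const uint32_t next_insn = jmp_addr + 5u;
    return target - next_insn;
}
```

With the question's numbers, rel32_for(0x08048394, 0x4) gives 0xf7fb7c6b, which is stored least significant byte first as 6b 7c fb f7.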

You can encode a relative jump "manually" like this:

.global main
main:
    .byte 0xe9
    .long 0x4
    ret

Thus, the assembler will not notice that your 0xe9 is really a jmp, and it will not try to outsmart you. In the binary, you will get the 'e9 04 00 00 00' sequence that you wish, and no linker interaction.

Note that the code above may crash, because the relative offset is counted from the address immediately after the offset field (i.e. the address of the next opcode, here ret). The jump will therefore land in the no-man's-land 4 bytes past the start of the ret, and a segfault or something strange seems likely.
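As a rough check of where a hand-coded jump lands, the following C sketch decodes the displacement and adds it to the address of the next instruction. The name jmp_rel32_target is hypothetical, and the load address used in the note below is only an assumption borrowed from the question's linked binary.

```c
#include <stdint.h>

/* `code` points at a 0xe9 byte followed by a little-endian rel32;
 * `addr` is the (assumed) address that 0xe9 byte is loaded at.
 * Returns the address execution continues at after the jump. */
uint32_t jmp_rel32_target(const uint8_t *code, uint32_t addr)
{
    uint32_t disp = (uint32_t)code[1]
                  | ((uint32_t)code[2] << 8)
                  | ((uint32_t)code[3] << 16)
                  | ((uint32_t)code[4] << 24);
    /* the offset is counted from the next instruction, 5 bytes on */
    return addr + 5u + disp;
}
```

If main were again loaded at 0x08048394, the bytes e9 04 00 00 00 would give 0x0804839d, while the ret sits at 0x08048399. A displacement of 0 is what would make the jump fall through to the ret.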

Thomas Pornin
Jumping to a named label is how jumps are normally done in assembly. This is very basic. I assume that the OP already knows that and really wishes to hand-code a relative jump for some unspecified reason.
Thomas Pornin
@Thomas Pornin: I suspect that comment was in reply to mine. I deleted it as I moved my comment to an answer instead.
Fred Larson
Thanks for the help, this works exactly as I need it to. Also thank you for pointing out the problem with it jumping too far; I was a little unsure of exactly where the offset is calculated from.
Ian Kelly
@Ian: Welcome to machine code programming (Assembly language is for pansies)! Please wait your turn to enter the program using the front panel switches. In the meantime, you can work on your parameterized delay function that uses the drum memory to time the delay: http://www.pbm.com/~lindahl/mel.html
Michael Burr
+3  A: 

I think the assembler is taking an absolute address and calculating the address offset for you. The zeros in the first case are probably there because it's part of a fixup table and the offset gets calculated in the link phase.

My assembly language skills are a bit rusty, but I think you could just do this:

.global main

main:
    jmp getouttahere
getouttahere:
    ret

Or if you really want it to look relative:

.global main

main:
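    jmp .+5    # "." is the address of this jmp, so the target is main+5
    ret        # caution: if the jmp assembles to the 2-byte short form, ret is at main+2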

Please be gentle if I'm wrong; it's been a long time.

Fred Larson
It makes sense that it is trying to calculate an absolute address. Unfortunately, for various reasons I can't use a label to do the jump, and the $+5 isn't valid syntax. Thank you for the help.
Ian Kelly
@Ian Kelly: Assembler syntax can vary. I think ".+5" might work.
Fred Larson
+3  A: 

If you're using GAS (the assembler GCC invokes), which uses AT&T syntax by default, relative addressing uses the dot ('.') to represent the current address being assembled (much like the $ pseudo-symbol in Intel/MASM assembly syntax). You should be able to get your relative jump using something like:

jmp . + 5

For example the following function:

void foo(void)
{
    __asm__ (
        "jmp .  + 5\n\t"
        "nop\n\t"
        "nop\n\t"
        "nop\n\t"
        "nop\n\t"
        "nop\n\t"

    );
}

Gets assembled to:

  71 0000 55            pushl   %ebp
  72 0001 89E5          movl    %esp, %ebp
  74                LM2:
  75                /APP
  76 0003 EB03          jmp .  + 5
  77 0005 90            nop
  78 0006 90            nop
  79 0007 90            nop
  80 0008 90            nop
  81 0009 90            nop
  82                    
  84                LM3:
  85                /NO_APP
  86 000a 5D            popl    %ebp
  87 000b C3            ret
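The EB03 in that listing follows from the same rule: "." is the address of the jmp itself, while the CPU counts from the next instruction, so the encoded displacement is the requested offset minus the instruction length. A small C sketch of that relationship, using a made-up helper name (dot_plus_disp):

```c
/* Displacement encoded for "jmp . + n", given the length in bytes of the
 * jmp encoding that gets chosen (2 for the short 0xeb form, 5 for the
 * near 0xe9 form). Hypothetical helper, for illustration only. */
int dot_plus_disp(int n, int insn_len)
{
    return n - insn_len;
}
```

For the short form, dot_plus_disp(5, 2) is 3, matching EB 03, so the jump in the listing lands at offset 0x0003 + 5 = 0x0008. If the 5-byte near form were used instead, dot_plus_disp(5, 5) would be 0, i.e. a jump straight to the following instruction.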
Michael Burr