views:

39

answers:

1

So, we're studying MIPS architecture at school and we're implementing a MIPS32 architecture. I thought I'd use GNU cross-binutils as assembler but I'm getting weird output when dealing with instructions jal, j and jr. The assembler seems to insert the instructions at the wrong places. I have no idea why this happens, and I doubt the MIPS assembler would be that broken, so I assume this is supposed to happen.

Here is my dummy assembly file:

.section .text
.globl __start

__start:
   addi $a0, $0, 100
   addi $a1, $0, 200 
   jal test

test:
   add $v0, $a0, $a1
   jr $ra

However, when I disassemble I get this output:

Disassembly of section .text:

00000000 <__start>:
   0:   20040064    addi    a0,zero,100
   4:   0c000003    jal c <test>    <--- Why is jal coming before addi?
   8:   200500c8    addi    a1,zero,200

0000000c <test>:
   c:   03e00008    jr  ra  <--- Why is jr coming before add?
  10:   00851020    add v0,a0,a1
    ...

Is this some architectural quirk? If so, what is the rationale behind this?

EDIT: Tested adding some nop's just for the heck ...

.section .text
.globl __start

__start:
   addi $a0, $0, 100
   addi $a1, $0, 200 
   nop
   jal test

test:
   add $v0, $a0, $a1
   nop
   jr $ra

and it gives me something that seems somewhat correct.

Disassembly of section .text:

00000000 <__start>:
   0:   20040064    addi    a0,zero,100
   4:   200500c8    addi    a1,zero,200
   8:   0c000004    jal 10 <test>
   c:   00000000    nop

00000010 <test>:
  10:   00851020    add v0,a0,a1
  14:   03e00008    jr  ra
  18:   00000000    nop
  1c:   00000000    nop

Why are jal and j swapping places with the last instruction?

+4  A: 

MIPS has explicit pipeline hazards; the instruction immediately following a branch or jump instruction will always be executed (this instruction is sometimes referred to as the "branch delay slot"). If your code was really assembled exactly as you wrote it:

__start:
   addi $a0, $0, 100
   addi $a1, $0, 200 
   jal test

test:
   add $v0, $a0, $a1
   jr $ra

then the add instruction would be executed twice around the time that the jal happens: once in the delay slot, and once on the following cycle when the program counter change has actually taken effect.

By default, the GNU assembler reorders instructions for you: it is clear that the second addi must always be executed, so it can be swapped with the jal instruction, so that the addi moves into the delay slot. (In cases where the assembler can't deduce that it is safe to do this, it will insert a nop into the delay slot instead.)

If you don't want it to do this reordering for you, add the directive

.set noreorder

at the top of your source file. You must deal with the hazards yourself in this case. If you do this, I recommend annotating the delay slots so that they stand out - e.g. by adding an extra space (or two) of indentation. For example:

.set noreorder

__start:
   addi $a0, $0, 100
   jal test
     addi $a1, $0, 200 

test:
   add $v0, $a0, $a1
   jr $ra
     nop
Matthew Slattery
I see. Thanks. Our implementation is single cycle, so I'll use .set noreorder :).
Maister
As a historical note, compilers traditionally just filled the branch delay slot with a nop. I'm not sure when things have changed to "just let the assembler worry about it".
tc.