views:

97

answers:

3

I've got some assembly of a sorting algorithm and I want to figure out how exactly it functions.

I'm a little confused on some of the instructions, particularly the cmp and jle instructions, so I'm looking for help. This assembly sorts an array of three elements.

0.00 :        4009f8:       48 8b 07                mov    (%rdi),%rax
0.00 :        4009fb:       48 8b 57 08             mov    0x8(%rdi),%rdx
0.00 :        4009ff:       48 8b 4f 10             mov    0x10(%rdi),%rcx
0.00 :        400a03:       48 39 d0                cmp    %rdx,%rax
0.00 :        400a06:       7e 2b                   jle    400a33 <b+0x3b>
0.00 :        400a08:       48 39 c8                cmp    %rcx,%rax
0.00 :        400a0b:       7e 1a                   jle    400a27 <b+0x2f>
0.00 :        400a0d:       48 39 ca                cmp    %rcx,%rdx
0.00 :        400a10:       7e 0c                   jle    400a1e <b+0x26>
0.00 :        400a12:       48 89 0f                mov    %rcx,(%rdi)
0.00 :        400a15:       48 89 57 08             mov    %rdx,0x8(%rdi)
0.00 :        400a19:       48 89 47 10             mov    %rax,0x10(%rdi)
0.00 :        400a1d:       c3                      retq
0.00 :        400a1e:       48 89 17                mov    %rdx,(%rdi)
0.00 :        400a21:       48 89 4f 08             mov    %rcx,0x8(%rdi)
0.00 :        400a25:       eb f2                   jmp    400a19 <b+0x21>
0.00 :        400a27:       48 89 17                mov    %rdx,(%rdi)
0.00 :        400a2a:       48 89 47 08             mov    %rax,0x8(%rdi)
0.00 :        400a2e:       48 89 4f 10             mov    %rcx,0x10(%rdi)
0.00 :        400a32:       c3                      retq
0.00 :        400a33:       48 39 ca                cmp    %rcx,%rdx
0.00 :        400a36:       7e 1d                   jle    400a55 <b+0x5d>
0.00 :        400a38:       48 39 c8                cmp    %rcx,%rax
0.00 :        400a3b:       7e 0c                   jle    400a49 <b+0x51>
0.00 :        400a3d:       48 89 0f                mov    %rcx,(%rdi)
0.00 :        400a40:       48 89 47 08             mov    %rax,0x8(%rdi)
0.00 :        400a44:       48 89 57 10             mov    %rdx,0x10(%rdi)
0.00 :        400a48:       c3                      retq
0.00 :        400a49:       48 89 07                mov    %rax,(%rdi)
0.00 :        400a4c:       48 89 4f 08             mov    %rcx,0x8(%rdi)
0.00 :        400a50:       48 89 57 10             mov    %rdx,0x10(%rdi)
0.00 :        400a54:       c3                      retq
0.00 :        400a55:       48 89 07                mov    %rax,(%rdi)
0.00 :        400a58:       48 89 57 08             mov    %rdx,0x8(%rdi)
0.00 :        400a5c:       48 89 4f 10             mov    %rcx,0x10(%rdi)
0.00 :        400a60:       c3                      retq
0.00 :        400a61:       90                      nop

If someone can walk me through it, it'd be very helpful. I kind of get confused around the operands like 0x8(%rdi) and the cmp and jle instructions. Thanks.

+3  A: 

Here are what the instructions mean:

mov : move
cmp : compare
jle : jump if less or equal (branch)
ret : return from procedure
nop : no-op

%r** are registers. They are usually %e** (eg: %eax, %edx, ...), but these are 64-bit registers.

As far as de-compiling the whole thing, that will take some more work.

See this: http://www.x86-64.org/documentation/assembly

NullUserException
I got into the habit very early (on the 6502) of thinking of cmp as a subtraction that discards its result. IIRC (this is going back a long way) jle on x86 is a bit ambiguous - there's two instructions that do what sounds like the same thing, but one is correct for signed values and one correct for unsigned. I don't recall which jle is. Of course I'm massively out of date - this is just my way of saying that IMO expanding the acronyms isn't really enough to explain much. The link does a fair bit to fix that, of course.
Steve314
Usually, LT/GT/LE/GE are signed and HI/LO/HS/LS are unsigned. There's also overflow set/clear (I forget if it's O, V, or X) and carry set/clear (CC/CS). I've specified them for ARM, but I suspect x86 is similar; the only significantly different (modern two's complement) processor I know of is PPC, where there are signed/unsigned/floating-point compares, because IEEE floating-point compares don't easily map to the CVNZ (carry, overflow, negative, zero) status bits.
tc.
+4  A: 

It helps to replace the register names with proper names to trace the flow of data, and add branch labels for the control flow.

0.00 :        4009f8:       48 8b 07                mov    (%argptr),%var1
0.00 :        4009fb:       48 8b 57 08             mov    0x8(%argptr),%var2
0.00 :        4009ff:       48 8b 4f 10             mov    0x10(%argptr),%var3
0.00 :        400a03:       48 39 d0                cmp    %var2,%var1
0.00 :        400a06:       7e 2b                   jle    @v2le1
0.00 :        400a08:       48 39 c8                cmp    %var3,%var1
0.00 :        400a0b:       7e 1a                   jle    @v3le1
0.00 :        400a0d:       48 39 ca                cmp    %var3,%var2
0.00 :        400a10:       7e 0c                   jle    @v3le2

   # Now we know that 2 > 1 and 3 > 1 and 3 > 2. Write them to memory in order.

      etc
Potatoswatter
Oh wow that makes it a lot clearer!
CCSab
Some disassemblers can do these things for you, at least semi-automatically. There use to be a free version of IDA Pro, which was pretty near the best you can get (on Windows) at one point. Sadly, it'll only support 32-bit even if you can still get it, I expect - and the full-price version probably isn't cheap. It might be worth trying the demo for a bit, though, or looking for an alternative disassembler with some similar features.
Steve314
+1  A: 
0.00 :        4009f8:       48 8b 07                mov    (%rdi),%rax 

The register RDI contains the address to a location in memory of your array. This line above copy the contents of the RAX register into the first element of your array. Since pointers in x64 are 0x8 bytes, the following two lines:

0.00 :        4009fb:       48 8b 57 08             mov    0x8(%rdi),%rdx 
0.00 :        4009ff:       48 8b 4f 10             mov    0x10(%rdi),%rcx 

will copy the contents of the RDX and RCX registers into the second and third elements into your array respectively. Now we need to start comparing the values to see where we need to swap.

0.00 :        400a03:       48 39 d0                cmp    %rdx,%rax 
0.00 :        400a06:       7e 2b                   jle    400a33 <b+0x3b> 

The cmp will compare the value of RDX with the value of RAX (essentially array[1] against array[0]). If RDX is less than or equal to RAX then this will jump program execution directly to line 400a33. You can think of this as if (array[1] > array[0]). And so it continues, comparing the values.

It appears that the code is trying to sort in descending order. Very roughly, the C code will probably look something like this:

array[0] = rax;
array[1] = rdx;
array[2] = rcx;

if (rdx > rax)
{
    if (rcx > rax)
    {
        if (rcx > rdx)
        {
            rcx = array[0];
            rdx = array[1];
LABEL:
            rax = array[2];
        }
        else
        {
            rdx = array[0];
            rcx = array[1];
            GOTO LABEL;
        }
    }
    else
    {
        rdx = array[0];
        rax = array[1];
        rcx = array[2];

    }
}
else
{
    if (rcx > rdx)
    {
        if (rcx > rax)
        {
            rcx = array[0];
            rax = array[1];
            rdx = array[2];
        }
        else
        {
            rax = array[0];
            rdx = array[1];
            rcx = array[2];
        }
    }
    else
    {
        rax = array[0];
        rdx = array[1];
        rcx = array[2];
    }
}
nithins