tags:

views:

305

answers:

1

Hi,
I recently needed to debug a program at assembly level. I don't have a lot of assembler experience, so I figured I'd write some simple C programs and single-step through them in order to get a feeling for the language before I'd start debugging other peoples code. However, I really don't get what gcc made of these two lines (compiled with -ggdb -O0):

items[tail] = i;
tail = (tail+1) % MAX_SIZE;

where MAX_SIZE is #defined to be 5 and i is a local variable (stored in 0x8(%ebp), i guess). According to gdb, this becomes:

0x08048394 <queue+17>:  mov    0x8049634,%edx
0x0804839a <queue+23>:  mov    0x8(%ebp),%eax
0x0804839d <queue+26>:  mov    %eax,0x804963c(,%edx,4)
0x080483a4 <queue+33>:  mov    0x8049634,%eax
0x080483a9 <queue+38>:  lea    0x1(%eax),%ecx
0x080483ac <queue+41>:  movl   $0x66666667,-0xc(%ebp)
0x080483b3 <queue+48>:  mov    -0xc(%ebp),%eax
0x080483b6 <queue+51>:  imul   %ecx
0x080483b8 <queue+53>:  sar    %edx
0x080483ba <queue+55>:  mov    %ecx,%eax
0x080483bc <queue+57>:  sar    $0x1f,%eax
0x080483bf <queue+60>:  mov    %edx,%ebx
0x080483c1 <queue+62>:  sub    %eax,%ebx
0x080483c3 <queue+64>:  mov    %ebx,-0x8(%ebp)
0x080483c6 <queue+67>:  mov    -0x8(%ebp),%eax
0x080483c9 <queue+70>:  shl    $0x2,%eax
0x080483cc <queue+73>:  add    -0x8(%ebp),%eax
0x080483cf <queue+76>:  mov    %ecx,%edx
0x080483d1 <queue+78>:  sub    %eax,%edx
0x080483d3 <queue+80>:  mov    %edx,-0x8(%ebp)
0x080483d6 <queue+83>:  mov    -0x8(%ebp),%ebx
0x080483d9 <queue+86>:  mov    %ebx,0x804963

Since 0x804963c is the address of items, i can see how the first line of C code works. Also, 0x8049634 is the address of tail, so I guess queue+33 and queue+38 are equivalent to %ecx = tail+1...but I have no idea what is happening afterwards. Who would have thought a simple modulo could be this complicated?

+13  A: 

It's a way to avoid having to do a more expensive division instruction. I was also quite stumped the first time I encountered this. The fun thing is that searching for the magic numbers that are used for this trick (in this case 0x66666667) often gives results explaining this trick. (I believe at the time it was the only concrete thing I had to go on because I didn't have the sources.)

A quick search gave me this blog post: http://blog.dkbza.org/2007/09/reverse-engineering-compiler-produced.html It has some useful links at the bottom (including an indirect link to a paper on this trick).

mweerden