I want to convert the for loop in the following code into assembly, but I am not sure how to start. An explanation of how to do it and why it works would be appreciated.

I am using VS2010, C++, writing for the x86. The code is as follows:

for (n = 0; norm2 < 4.0 && n < N; ++n) 
{
    __asm{
    ///a*a - b*b + x
        fld a // a
        fmul st(0), st(0) // aa
        fld b // b aa
        fmul st(0), st(0) // bb aa
        fsub // (aa-bb): st(1) - st(0), then pop
        fld x // x (aa-bb)
        fadd // (aa-bb+x)

    /// 2.0*a*b + y;
        fld d // d (aa-bb+x)
        fld a // a d (aa-bb+x)
        fmul // ad (aa-bb+x)
        fld b // b ad (aa-bb+x)
        fmul // abd (aa-bb+x)
        fld y // y adb (aa-bb+x)
        fadd // b:(adb+y) a:(aa-bb+x)

        fld st(0) //b b:(adb+y) a:(aa-bb+x)
        fmul st(0), st(0) // bb b:(adb+y) a:(aa-bb+x)
        fld st(2) // a bb b:(adb+y) a:(aa-bb+x)
        fmul st(0), st(0) // aa bb b:(adb+y) a:(aa-bb+x)
        fadd // aa+bb b:(adb+y) a:(aa-bb+x)
        fstp norm2 // store aa+bb to norm2, st(0) is popped.
        fstp b
        fstp a
    }
}
+3  A: 

The quickest and easiest way to get a running start on this kind of problem is to first write the code in C or C++ in as simple a form as possible, then use your C/C++ compiler to generate asm. You can then use this generated asm as a template for your own asm code. With a proper compiler like gcc you would use gcc -S to do this. I'm pretty sure Visual Studio has a similar option buried somewhere in its GUI (apparently the command line switch is /Fa).

Paul R
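As a starting point for that approach, here is a minimal sketch of the loop in plain C++, using the update formulas from the question's inline asm. The function name `escape_iterations` and the zero starting values for `a`, `b` and `norm2` are assumptions, since the question does not show how they are initialised.

```cpp
// Hypothetical helper: the question's loop written in plain C++.
// Assumption: a, b and norm2 start at zero (not shown in the question).
int escape_iterations(double x, double y, int N)
{
    double a = 0.0, b = 0.0, norm2 = 0.0;
    int n;
    for (n = 0; norm2 < 4.0 && n < N; ++n)
    {
        double c = a * a - b * b + x;  // new a, kept aside for now
        b = 2.0 * a * b + y;           // new b, still computed from the old a
        a = c;
        norm2 = a * a + b * b;
    }
    return n;
}
```

Compiling a file containing this function with `gcc -S` or the `/Fa` switch should produce a listing in which the two loop conditions appear as compare-and-branch pairs, which can then serve as the template for the hand-written version.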
A: 

The for loop is roughly the same as

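n = 0;
if norm2>=4.0 then  // note: condition inverted
  goto end;
if n>=N then        // inverted as well (n is 0 here)
  goto end;
beginloop:

  <asm block>

  n = n+1;
  if norm2>=4.0 then  // note: condition inverted
    goto end;
  if n<N then         // not inverted: jump back while the loop continues
    goto beginloop
end: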
Marco van de Voort
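This kind of lowering can be checked in compilable C++ before moving to asm. The sketch below writes the loop with explicit `goto`s: the tests that leave the loop use the inverted condition, and the final test jumps back while the loop should continue. The function name `escape_iterations_goto` and the zero starting values are assumptions; the arithmetic stands in for the asm block.

```cpp
// Hypothetical helper: the for loop lowered to compare-and-branch form.
// Assumption: a, b and norm2 start at zero (not shown in the question).
int escape_iterations_goto(double x, double y, int N)
{
    double a = 0.0, b = 0.0, norm2 = 0.0;
    int n = 0;                        // loop initialisation

    if (norm2 >= 4.0) goto end;       // entry test, condition inverted
    if (n >= N) goto end;             // entry test, condition inverted
beginloop:
    {
        // Loop body; the inline asm block would go here.
        double c = a * a - b * b + x;
        b = 2.0 * a * b + y;
        a = c;
        norm2 = a * a + b * b;
    }
    ++n;                              // loop increment
    if (norm2 >= 4.0) goto end;       // exit test, condition inverted
    if (n < N) goto beginloop;        // back edge, condition as written
end:
    return n;
}
```

Each `if (...) goto` maps onto one compare followed by one conditional jump, so translating this form to asm is mostly mechanical.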
+1  A: 

I won't write asm here, but three things you should investigate:

  • keep everything in registers

  • don't recompute a^2 and b^2 for a^2-b^2 when you have already computed them for a^2 + b^2

  • try to find a condition which allows setting n to N without iterating

AProgrammer
a and b change on every iteration, so it is necessary to recompute the squares: double c = a*a - b*b + x; b = 2.0*a*b + y; a = c; norm2 = a*a + b*b;
aCuria
@aCuria, unroll the loop and you'll see that the a*a and b*b used to compute norm2 are those used to compute the next c.
AProgrammer
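The reuse of the squares can be sketched in C++ as follows. The loop keeps `aa` and `bb` for the current `a` and `b`, so the values that decide the exit test are the same ones used for the next update, and each iteration squares `a` and `b` once instead of twice. The function name `escape_iterations_reuse` is hypothetical, and the sketch assumes `a` and `b` start at zero, so that the initial `norm2` equals `aa + bb`.

```cpp
// Hypothetical helper: same iteration, but a*a and b*b are computed once
// per step and shared between the exit test and the next update.
// Assumption: a and b start at zero (not shown in the question).
int escape_iterations_reuse(double x, double y, int N)
{
    double a = 0.0, b = 0.0;
    double aa = a * a, bb = b * b;    // squares of the current a and b
    int n = 0;
    while (aa + bb < 4.0 && n < N)    // aa + bb plays the role of norm2
    {
        b = 2.0 * a * b + y;          // uses the old a, so update b first
        a = aa - bb + x;              // reuses the squares from the test
        aa = a * a;                   // squares of the new values,
        bb = b * b;                   // needed by the next test and update
        ++n;
    }
    return n;
}
```

All working values are locals here, which gives the compiler (or a hand-written x87 version) the chance to keep them in registers rather than storing `a`, `b` and `norm2` to memory on every pass.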