views:

1163

answers:

12

I know this sounds like a homework assignment, but it isn't. Lately I've been interested in algorithms used to perform certain mathematical operations, such as sine, square root, etc. At the moment, I'm trying to write the Babylonian method of computing square roots in C#.

So far, I have this:

public static double SquareRoot(double x) {
    if (x == 0) return 0;

    double r = x / 2; // this is inefficient, but I can't find a better way
                      // to get a close estimate for the starting value of r
    double last = 0;
    int maxIters = 100;

    for (int i = 0; i < maxIters; i++) {
        r = (r + x / r) / 2;
        if (r == last)
            break;
        last = r;
    }

    return r;
}

It works just fine and produces the exact same answer as the .NET Framework's Math.Sqrt() method every time. As you can probably guess, though, it's slower than the native method (by around 800 ticks). I know this particular method will never be faster than the native method, but I'm just wondering if there are any optimizations I can make.

The only optimization I saw immediately was the fact that the calculation would run 100 times, even after the answer had already been determined (at which point, r would always be the same value). So, I added a quick check to see if the newly calculated value is the same as the previously calculated value and break out of the loop. Unfortunately, it didn't make much of a difference in speed, but just seemed like the right thing to do.

And before you say "Why not just use Math.Sqrt() instead?"... I'm doing this as a learning exercise and do not intend to actually use this method in any production code.

+2  A: 

Instead of breaking the loop and then returning r, you could just return r. May not provide any noticable increase in performance.

AlbertoPL
Bit shifting works fine for int (etc) - does it work for double, though? It doesn't even seem to be defined...
Marc Gravell
The "break/return" vs "return" is *so* minimal; I don't think that you'll ever spot the difference here
Marc Gravell
He's trying to save ticks, so I'm suggesting even the most trivial things.
AlbertoPL
+1! I wish I could mark two answers. Using "return r" instead of "break" definitely made a speed difference (albeit very small, but as you said, I'm working in ticks here).
David Brown
A: 

You can try r = x >> 1;

instead of / 2 (also in the other place you device by 2). It might give you a slight edge. I would also move the 100 into the loop. Probably nothing, but we are talking about ticks in here.

just checking it now.

EDIT: Fixed the > into >>, but it doesn't work for doubles, so nevermind. the inlining of the 100 gave me no speed increase.

Noam Gal
I think that won't work, because x>1 will be "true" or "false" it should be >> instead.
Peter
+4  A: 

What you are doing here is you execute Newton's method of finding a root. So you could just use some more efficient root-finding algorithm. You can start searching for it here .

Best wishes,

Jasiu

Jasiu
+1, algorithmic improvements generally trump micro-optimizations such as replace division with bit-shift.
Richard
I don't see how "use a different algorithm" is a good answer for "how do I perform this algorithm faster?" He already explained that he's just doing this because he wants to, so telling him to do something else entirely is not a useful suggestion.
Chad Birch
Newton's method converges fast and isn't the problem at all. The real problem is the first approximation. C# appears not to allow the bit fiddling that is possible in C/C++.
Accipitridae
+2  A: 

Replacing the division by 2 with a bit shift is unlikely to make that big a difference; given that the division is by a constant I'd hope the compiler is smart enough to do that for you, but you may as well try it to see.

You're much more likely to get an improvement by exiting from the loop early, so either store new r in a variable and compare with old r, or store x/r in a variable and compare that against r before doing the addition and division.

Richard
+1  A: 

Define a tolerance and return early when subsequent iterations fall within that tolerance.

Sinan Ünür
+6  A: 

First, instead of checking for equality (r == last), you should be checking for convergence, wherein r is close to last, where close is defined by an arbitrary epsilon:

eps = 1e-10  // pick any small number
if (Math.Abs(r-last) < eps) break;

As the wikipedia article you linked to mentions - you don't efficiently calculate square roots with Newton's method - instead, you use logarithms.

Ankur Goel
Comparing doubles is never a good idea.
Carra
typo: s/"Newton's method"/"Babylonian method" -- Newton's method works fine for speed of convergence (with some caveats on whether it converges)
Jason S
If the root is larger than 2^52*eps then it is well possible that r oscillates around the root and that Math.Abs(r-last) is never smaller than eps. Hence, while this proposal is a little better than the original program it may still lead to many more iterations than necessary.
Accipitridae
This actually shaves off about 100 ticks, which seems odd because it's executing an additional method along with a comparison. But, I won't complain. ;)
David Brown
@David: I'm sure Math.Abs is inlined by the JIT
BlueRaja - Danny Pflughoeft
A: 

Well, the native Sqrt() function probably isn't implemented in C#, it'll most likely be done in a low-level language, and it'll certainly be using a more efficient algorithm. So trying to match its speed is probably futile.

However, in regard to just trying to optimize your function for the heckuvit, the Wikipedia page you linked recommends the "starting guess" to be 2^floor(D/2), where D represents the number of binary digits in the number. You could give that an attempt, I don't see much else that could be optimized significantly in your code.

Chad Birch
A: 

Since you said the code below was not fast enough, try this:

    static double guess(double n)
    {
        return Math.Pow(10, Math.Log10(n) / 2);
    }

It should be very accurate and hopefully fast.

Here is code for the initial estimate described here. It appears to be pretty good. Use this code, and then you should also iterate until the values converge within an epsilon of difference.

    public static double digits(double x)
    {
        double n = Math.Floor(x);
        double d;

        if (d >= 1.0)
        {
            for (d = 1; n >= 1.0; ++d)
            {
                n = n / 10;
            }
        }
        else
        {
            for (d = 1; n < 1.0; ++d)
            {
                n = n * 10;
            }
        }


        return d;
    }

    public static double guess(double x)
    {
        double output;
        double d = Program.digits(x);

        if (d % 2 == 0)
        {
            output = 6*Math.Pow(10, (d - 2) / 2);
        }
        else
        {
            output = 2*Math.Pow(10, (d - 1) / 2);
        }

        return output;
    }
Unknown
It works, but takes 3 times longer to calculate than simply using x / 2.
David Brown
Do you mean it takes 3 times longer than x/2, or the whole program? Because this should give a far better estimate than x/2.
Unknown
+1  A: 

With your method, each iteration doubles the number of correct bits.

Using a table to obtain the initial 4 bits (for example), you will have 8 bits after the 1st iteration, then 16 bits after the second, and all the bits you need after the fourth iteration (since a double stores 52+1 bits of mantissa).

For a table lookup, you can extract the mantissa in [0.5,1[ and exponent from the input (using a function like frexp), then normalize the mantissa in [64,256[ using multiplication by a suitable power of 2.

mantissa *= 2^K
exponent -= K

After this, your input number is still mantissa*2^exponent. K must be 7 or 8, to obtain an even exponent. You can obtain the initial value for the iterations from a table containing all the square roots of the integral part of mantissa. Perform 4 iterations to get the square root r of mantissa. The result is r*2^(exponent/2), constructed using a function like ldexp.

EDIT. I put some C++ code below to illustrate this. The OP's function sr1 with improved test takes 2.78s to compute 2^24 square roots; my function sr2 takes 1.42s, and the hardware sqrt takes 0.12s.

#include <math.h>
#include <stdio.h>

double sr1(double x)
{
  double last = 0;
  double r = x * 0.5;
  int maxIters = 100;
  for (int i = 0; i < maxIters; i++) {
    r = (r + x / r) / 2;
    if ( fabs(r - last) < 1.0e-10 )
      break;
    last = r;
  }
  return r;
}

double sr2(double x)
{
  // Square roots of values in 0..256 (rounded to nearest integer)
  static const int ROOTS256[] = {
    0,1,1,2,2,2,2,3,3,3,3,3,3,4,4,4,4,4,4,4,4,5,5,5,5,5,5,5,5,5,5,6,6,6,6,6,6,6,6,6,6,6,6,
    7,7,7,7,7,7,7,7,7,7,7,7,7,7,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,9,9,9,9,9,9,9,9,9,9,9,9,9,
    9,9,9,9,9,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,11,11,11,11,11,
    11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,12,12,12,12,12,12,12,12,12,12,12,12,
    12,12,12,12,12,12,12,12,12,12,12,12,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,
    13,13,13,13,13,13,13,13,13,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,
    14,14,14,14,14,14,14,14,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,
    15,15,15,15,15,15,15,15,15,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16 };

  // Normalize input
  int exponent;
  double mantissa = frexp(x,&exponent); // MANTISSA in [0.5,1[ unless X is 0
  if (mantissa == 0) return 0; // X is 0
  if (exponent & 1) { mantissa *= 128; exponent -= 7; } // odd exponent
  else { mantissa *= 256; exponent -= 8; } // even exponent
  // Here MANTISSA is in [64,256[

  // Initial value on 4 bits
  double root = ROOTS256[(int)floor(mantissa)];

  // Iterate
  for (int it=0;it<4;it++)
    {
      root = 0.5 * (root + mantissa / root);
    }

  // Restore exponent in result
  return ldexp(root,exponent>>1);
}

int main()
{
  // Used to generate the table
  // for (int i=0;i<=256;i++) printf(",%.0f",sqrt(i));

  double s = 0;
  int mx = 1<<24;
  // for (int i=0;i<mx;i++) s += sqrt(i); // 0.120s
  // for (int i=0;i<mx;i++) s += sr1(i);  // 2.780s
  for (int i=0;i<mx;i++) s += sr2(i);  // 1.420s
}
Eric Bainville
Does frexp and ldexp exist in C#? I'm not aware of these function or anything that could replace them. The problem with the solution of the OP is that in C# it is difficult to find an initial approximation.
Accipitridae
Use Jon Carmack's magic number for approximation: http://www.codemaestro.com/reviews/9
Ankur Goel
I found C# versions of frexp and ldexp on Google Code Search, but this example is actually much slower than my original code. Of course, this could also be a problem with the implementation of frexp and ldexp. I find this code really interesting, however. Thanks for posting it!
David Brown
A: 

Hi David,

I have been looking at this as well for learning purposes. You may be interested in two modifications I tried.

The first was to use a first order taylor series approximation in x0:

    Func<double, double> fNewton = (b) =>
    {
        // Use first order taylor expansion for initial guess
        // http://www27.wolframalpha.com/input/?i=series+expansion+x^.5
        double x0 = 1 + (b - 1) / 2;
        double xn = x0;
        do
        {
            x0 = xn;
            xn = (x0 + b / x0) / 2;
        } while (Math.Abs(xn - x0) > Double.Epsilon);
        return xn;
    };

The second was to try a third order (more expensive), iterate

    Func<double, double> fNewtonThird = (b) =>
    {
        double x0 = b/2;
        double xn = x0;
        do
        {
            x0 = xn;
            xn = (x0*(x0*x0+3*b))/(3*x0*x0+b);
        } while (Math.Abs(xn - x0) > Double.Epsilon);
        return xn;
    };

I created a helper method to time the functions

public static class Helper
{
    public static long Time(
        this Func<double, double> f,
        double testValue)
    {
        int imax = 120000;
        double avg = 0.0;
        Stopwatch st = new Stopwatch();
        for (int i = 0; i < imax; i++)
        {
            // note the timing is strictly on the function
            st.Start();
            var t = f(testValue);
            st.Stop();
            avg = (avg * i + t) / (i + 1);
        }
        Console.WriteLine("Average Val: {0}",avg);
        return st.ElapsedTicks/imax;
    }
}

The original method was faster, but again, might be interesting :)

ccook
+1  A: 
float InvSqrt (float x){
  float xhalf = 0.5f*x;
  int i = *(int*)&x;
  i = 0x5f3759df - (i>>1);
  x = *(float*)&i;
  x = x*(1.5f - xhalf*x*x);
  return x;}

This is my favorite fast square root. Actually it's the inverse of the square root, but you can invert it after if you want....I can't say if it's faster if you want the square root and not the inverse square root, but it's freaken cool just the same.
http://www.beyond3d.com/content/articles/8/

Chris H
Frickin' crazy, though I don't think this is even possible in C#
BlueRaja - Danny Pflughoeft
It works fine in C# if you place it in an unsafe block.
David Brown
A: 

Replacing "/ 2" by "* 0.5" makes this ~1.5 times faster on my machine, but of course not nearly as fast as the native implementation.

Kristof Verbiest