views:

192

answers:

8

Hello- I am writing an iPhone app that needs to calculate the square root of a number about 2000 times every 1/30th of a second. sqrt() works fine on a computer, but the frame rate drops to around 10 FPS on an iPhone or iPad, and I have already optimized the rest of the code. I have heard that this can be sped up dramatically by estimating the square root, but I can not find any code to do this. I only need one or two decimal places of precision. Any suggestions on how to do this, or other ways to speed things up would be appreciated.

Thanks!

+4  A: 

Unless you actually need the square root, compare the squared values rather than the raw values and the square root.

Squaring is much faster (and more accurate) than taking a square root, if you only need comparisons. This is the way most games do it.

Stefan Kendall
+1  A: 

Maybe this is for you:
Fast inverse square root
If this method doesn't provide the accuracy you need there are also alot of other iterative methods where you can choose more or less precise between speed and accuracy:
Methods of computing square roots

Daniel
Also note-worthy is that sqrt(x) = x*(1/sqrt(x)) and also 1/x = (1/sqrt(x))**2 so having a fast reciprocal square root is actually more useful than it first appears. In fact if I had to dedicate limited hardware to math I'd choose it over division :-)
phkahler
+2  A: 

Do you know the range of values that you are trying to find the square root of? Say you have values ranging from 0 to 10. You can then precalculate an array:

sqrt_val[0] = 0;
sqrt_val[1] = 1;
sqrt_val[2] = // the sqrt of 2
...
sqrt_val[10] = // the sqrt of 10

Then during runtime you take the number that you want the sqrt of, convert that to an integer (so for example 3.123 becomes 3) and use that as an index (3) to look up the precalculated value.

Of course if you want finer resolution you can just increase the number of items in your array.

No one in particular
Floating point to integer conversions are not particularly cheap either though.
dreamlax
If that is an issue then try using integers for the coordinates and do all calculations as ints.
No one in particular
+2  A: 

How accurate do you want your estimate to be? If you know how close you want your estimate to be to the real sqrt the Newton's Method is your friend.

Do you know the range of values that are passed to sqrt? If so you can make up a look up table that is precomputed at startup (or even read from disk at startup depending on what turns out to be faster). Find the closest in the table to your input and you get your estimate.

hhafez
Or especially Newton's Method + precomputation, most likely.
Stefan Kendall
+1  A: 

The easiest change you can make on an iPhone is to use sqrtf() instead of sqrt(). Single precision float math is much faster than double precision, especially on devices of 3GS vintage and newer.

hotpaw2
+1  A: 

First off, are you certain that square root is actually the bottleneck? Have you profiled? 2000 square roots every 1/30th of a second actually isn't all that many, even on a cell phone. The ARM documentation quotes 33 cycles for a single-precision square root and 60 cycles for double-precision; a 600mHz processor can do 10 million square roots per second (more if the instruction is pipelined at all).

If you have profiled, and square root really is the bottleneck, you will want to use the NEON vrsqrte.f32 instruction. This instruction is quite fast and gives you the approximate reciprocal square roots of four floating-point numbers simultaneously. You can then use the vmul.f32 instruction to get approximate square roots (though for many uses the reciprocal is more useful than the square root itself).

Stephen Canon
A: 

If you have a "normal" positive float or double, not an int, and want to use a table look-up method, you can do two separate table look ups, one for the exponent (re-biased), and one for a few bits of the mantissa (shift and mask bitfield extraction), and then multiply the two table look up results together.

hotpaw2