views:

39

answers:

1

I've been running som eprofile tests of a slow area of code. This is with Visual Studio 2008 and .NET 2 (fully patched). About 32% of my computation is used by the Haversine formula. This requires two sines, two cosines, a square root, and an arc sine - all using the standard .NET Math library (ie. Math.Sin, Math.Asin, Math.Sqrt). I've been able to easily cache the cosines - resulting in a roughly 25-30% speedup of the Haversine function.

In the profile I'm seeing __CIasin_pentium4 and __CIasin neither of which find much on Google except for things like stack dumps that people have posted. The pentium4 variant grabs about twice as many samples (both inclusive and exclusive). I'm assuming this is an arc sine, but is it really so much more expensive than a sine? There is no sign of a sine in the profile even though twice as many will be computed.

Are both of these functions arcsines, or is one a sine? If not, what do they represent?

Yes I've seen various articles and posts on the Internet and here about fast sines. I really do need the accuracy of a computed sine rather than a look up table or truncated Taylor series. I'm using the Haversine to compute and/or compare distances on the Earth's surface. 10m accuracy (the minimum IMHO for my app) equates to about 1/640000 radians.

One thought for speed is to multiple out the trigonometric identities. Although this would result in more trig functions, they would become dependent on individual end points only and hence become cacheable. Another is to unwrap the arcsine and the square root for my comparisons. I think the latter has a lot of scope for improvement, however at the moment I am trying to understand what is taking the processing time and exactly what the __CIasin functions represent.

A: 

Looks like the Pentium FPU has native instructions for sine and cosine (fsin and fcos), but not for arcsine. Hence the __CIasin functions that I am seeing are probably the .NET implementation of arcsine, which I understand uses a Taylor series. This explains the big difference in speed, so that asin shows up but sin does not. (or cos or sqrt for that matter - these are native functions too).

I have coded x86 FPUs directly long ago. So long ago, I think it must have been an 8087 - anyway the only trig present in those days was a partial tangent!

So the next job in the optimization is to unwrap the arcsine and square root from the Haversine where possible. The results are used for simple greater than/smaller than comparisons (sorting, etc); and to compare against "fixed" values. In both cases, it should be possible to unwrap these. Eg. the fixed value becomes square( sin( fixed ) ), and is compared against what was inside the sqrt.

I still think the trig identities might be a useful optimization but it would definitely complicate the code and introduce the possibility of errors.

winwaed