views: 517 | answers: 8
I read in an article somewhere that trig calculations are generally expensive. Is this true? And if so, is that why people use trig lookup tables?

EDIT: Hmm, so if the only thing that changes is the angle (accurate to 1 degree), would a lookup table with 360 entries (one for every degree) be faster?

+5  A: 

Since sin(), cos() and tan() are mathematical functions that are calculated by summing a series, developers will sometimes use lookup tables to avoid the expensive calculation.

The tradeoff is in accuracy and memory. The greater the need for accuracy, the greater the amount of memory required for the lookup table.

Take a look at the following table accurate to 1 degree.

http://www.analyzemath.com/trigonometry/trig_1.gif
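
For the 1-degree case from the edit, here is a minimal sketch of what such a table might look like in C. The names (sin_table, fast_sin_deg) are just illustrative, and the table stores sin() only; a cos()/tan() table would work the same way.

    #include <math.h>
    #include <stdio.h>

    #define PI 3.14159265358979323846

    /* Hypothetical lookup table: one precomputed sine value per degree. */
    static double sin_table[360];

    static void init_sin_table(void)
    {
        for (int deg = 0; deg < 360; deg++)
            sin_table[deg] = sin(deg * PI / 180.0);
    }

    /* Return sin() of an integer angle in degrees, accurate to 1 degree. */
    static double fast_sin_deg(int deg)
    {
        deg %= 360;                /* periodicity: sin(x + 360 deg) == sin(x) */
        if (deg < 0)
            deg += 360;
        return sin_table[deg];
    }

    int main(void)
    {
        init_sin_table();
        printf("sin(30 deg) ~= %f\n", fast_sin_deg(30));  /* ~0.5 */
        return 0;
    }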

Codebrain
So pretty much all mathematical functions (besides simple + and -, etc.) are generally quite expensive?
DMan
All floating point operations are quite expensive. Even + involves quite a number of comparisons, integer additions and bit shifts. Of course it's still much cheaper than sin, cos or tan.
KennyTM
@Kenny - yes, you are right, since they are inherently not represented as integers :)
Codebrain
@KennyTM: That's not right. They *are* slower than integer operations, but not that much. A floating point addition usually takes 3 clock cycles on a typical contemporary CPU. A sin() function takes ~200 cycles (depending on the CPU and method). I hope you DO see a difference?
slacker
@Codebrain: Actually, "to the power of 2" requires a full multiplication. Wonder why no one spotted this one yet? :) I believe you meant "multiply by 2".
slacker
@Codebrain, KennyTM: on the x86 arch. you may be correct (I am not familiar with instruction latencies on x86). However, other embedded processors that support floating point can produce an FP result with the same latency as the integer units, so it is not necessarily a matter of "inherently not represented as integers". Actually, to some extent, multiplying two 24-bit numbers (the mantissa parts of the FP) can be done faster than multiplying two 32-bit integers on optimized hardware. The addition of the exponent fields is done in parallel with the multiplication of the mantissas using a small adder.
ysap
A: 

Yes, it is. Trig functions are computed by summing up a series, so in general terms they are a lot more costly than a simple mathematical operation. The same goes for sqrt.
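
For illustration, a summed-series approximation of sin() might look roughly like the sketch below (a truncated Taylor series, purely to show the idea of summing a series; as the comment below points out, this is not how modern FPUs or math libraries actually compute it):

    #include <stdio.h>

    /* Illustrative only: approximate sin(x) by summing the first few terms
     * of the series x - x^3/3! + x^5/5! - ...
     * Real math libraries use range reduction and better approximations. */
    static double sin_series(double x, int terms)
    {
        double term = x;   /* the current series term, starting with x */
        double sum = x;
        for (int n = 1; n < terms; n++) {
            /* each term is the previous one times -x^2 / ((2n)(2n+1)) */
            term *= -x * x / ((2.0 * n) * (2.0 * n + 1.0));
            sum += term;
        }
        return sum;
    }

    int main(void)
    {
        printf("sin(0.5) ~= %f\n", sin_series(0.5, 8));  /* ~0.479426 */
        return 0;
    }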

Midhat
Taylor expansion is NOT how modern FPUs compute trig functions. They use successive approximation methods which provide many more digits of accuracy per iteration than Taylor series. (Removed previous reference to CORDIC, which is used in embedded applications where space is more important than speed.)
Ben Voigt
A: 

If you always use the same angles, you can compute the trig values once and store them in a variable instead of calculating them every time. The same applies within a method/function call where the angle is not going to change. You can also be smart by using some identities (calculating sin(theta) from sin(theta/2), or exploiting how the values repeat: sin(theta + 2*pi*n) = sin(theta)) to reduce computation. See this Wikipedia article.
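
As a small sketch of the identity idea from this answer (the double-angle formulas and the function name here are just for illustration):

    #include <math.h>

    /* Derive sin(2t) and cos(2t) from one pair of trig calls instead of
     * calling sin()/cos() a second time. Identities used:
     *   sin(2t) = 2*sin(t)*cos(t)
     *   cos(2t) = cos(t)*cos(t) - sin(t)*sin(t)                          */
    void double_angle(double t, double *sin2t, double *cos2t)
    {
        double s = sin(t);          /* one expensive call ...  */
        double c = cos(t);          /* ... and one more        */
        *sin2t = 2.0 * s * c;       /* no extra sin() call     */
        *cos2t = c * c - s * s;     /* no extra cos() call     */
    }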

ram
+2  A: 

While the quick answer is that they are more expensive than the primitive math operations (addition/multiplication/subtraction etc...), they are not -expensive- in terms of human time. Typically the reason people optimize them with look-up tables and approximations is that they are calling them potentially tens of thousands of times per second, and every microsecond can be valuable.

If you're writing a program and just need to call it a couple times a second the built-in functions are fast enough by far.

Ron Warholic
+6  A: 

Expensive is a relative term.

The mathematical operations that will perform fastest are those that can be performed directly by your processor. Certainly integer add and subtract will be among them. Depending upon the processor, there may be multiplication and division as well. Sometimes the processor (or a co-processor) can handle floating point operations natively.

More complicated things (e.g. square root) require a series of these low-level calculations to be performed. These things are usually accomplished using math libraries (written on top of the native operations your processor can perform).

All of this happens very very fast these days, so "expensive" depends on how much of it you need to do, and how quickly you need it to happen.

If you're writing real-time 3D rendering software, then you may need to use lots of clever math tricks and shortcuts to squeeze every bit of speed out of your environment.

If you're working on typical business applications, odds are that the mathematical calculations you're doing won't contribute significantly to the overall performance of your system.

Scott Smith
Actually, square root is so common that it's very often implemented in hardware. With more complicated functions (e.g. trig) there's not much advantage to implementing them in hardware, though it DID happen in some architectures (x87 would be the best-known).
slacker
@slacker - when you say "in hardware" do you mean to imply that FSQRT is a small number of clock cycles, or do you simply mean it's a single instruction implemented in nano/microcode? I know there are hardware designs for a square root feature, but I didn't think they were in most processors.
NVRAM
+1  A: 

I would recommend writing a test program and timing them for yourself. Yes, they're slow compared to plus and minus, but they're still single processor instructions. It's unlikely to be an issue unless you're doing a very tight loop with millions of iterations.
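
A rough sketch of such a timing test (this particular harness, with clock() and a volatile accumulator to keep the compiler from deleting the loops, is just one way to do it):

    #include <math.h>
    #include <stdio.h>
    #include <time.h>

    int main(void)
    {
        const int N = 10000000;          /* ten million iterations */
        volatile double sink = 0.0;      /* stops the compiler removing the loops */

        clock_t start = clock();
        for (int i = 0; i < N; i++)
            sink += i * 0.0001;          /* baseline: multiply and add only */
        double base = (double)(clock() - start) / CLOCKS_PER_SEC;

        start = clock();
        for (int i = 0; i < N; i++)
            sink += sin(i * 0.0001);     /* same loop with a sin() call added */
        double trig = (double)(clock() - start) / CLOCKS_PER_SEC;

        printf("baseline: %.3f s, with sin(): %.3f s\n", base, trig);
        return 0;
    }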

Mark Ransom
A: 

Yes (relative to other mathematical operations such as multiply and divide): if you're doing something realtime (matrix ops, video games, whatever), you can knock off lots of cycles by moving your trig calculations out of your inner loop.

If you're not doing something realtime, then no, they're not expensive (relative to operations such as reading a bunch of data from disk, generating a webpage, etc.). Trig ops are hopefully done in hardware by your CPU (which can do billions of floating point operations per second).
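
A sketch of what "moving trig out of the inner loop" can look like (rotating a batch of 2-D points; the function and parameter names are just illustrative):

    #include <math.h>

    /* Rotate npoints 2-D points by the same angle. The sin()/cos() results
     * are hoisted out of the loop and computed once, not once per point. */
    void rotate_points(double *x, double *y, int npoints, double angle)
    {
        const double s = sin(angle);   /* computed once ...     */
        const double c = cos(angle);   /* ... not npoints times */
        for (int i = 0; i < npoints; i++) {
            const double xr = x[i] * c - y[i] * s;
            const double yr = x[i] * s + y[i] * c;
            x[i] = xr;
            y[i] = yr;
        }
    }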

Seth
+2  A: 

On the Intel x86 processor, floating point addition or subtraction requires 6 clock cycles, multiplication requires 8 clock cycles, and division 30-44 clock cycles. But cosine requires between 180 and 280 clock cycles.

It's still very fast, since the x86 does these things in hardware, but it's much slower than the more basic math functions.

Jeffrey L Whitledge
Actually, that's quite outdated information. These days FP additions take 3-4 cycles, and FP multiplications 4-5 cycles, depending on the processor. And note that those operations are fully pipelined, so you can start a new addition and multiplication every clock cycle. Divisions typically take 20-25 cycles and are not pipelined. Newer processors can also bail out early of a division if the divisor is reasonably round - making it take as little as 6 cycles in some cases.
slacker
Unless you're talking about the Pentium 4. Which is just slow for whatever it does. Duh.
slacker