Pre-calculation is the way to go if you want maximum speed. Since you have 216 (65,536) possible inputs, the naive approach would be to have an array of that many 16-bit values (for a total of 128K memory used).
But that can be improved on a little bit if you don't want to use all that memory (at the cost of a little speed).
Assuming your inputs are in radians (I'm assuming they are since your ranges for a 4.12 number is 0 through 15.999..., not enough to represent all degrees in a circle), you can take advantage of the facts that:
sin (n) = sin (n % (2*PI)) for n >= 2*PI
sin (n) = -sin (n - PI) for PI <= n < 2*PI
sin (n) = sin (PI - n) for PI/2 <= n < PI
So your lookup table only needs to hold the values between 0 and PI/2 radians, cutting storage requirements dramatically: you only store between 0 and PI/2 (~1.571) rather than your full range of 0 through 15.999..., a reduction of some 90%.
Then you just need to reduce values to below 2*PI radians with a modulo (first rule) and modify it with the other two rules to find the right index to lookup. Modulo will work on fixed point as quickly as on integers.
So you'd be looking at something like:
def sin(r):
if r >= PI_BY_2:
return sin (r % PI_BY_2)
if r >= PI:
return -sin (r - PI)
if r >= PI_DIV_2:
return sin (PI - r)
return sin_lookup[r]
def cos(r):
if r < PI_DIV_2:
return sin (r + PI_DIV_2_BY_3)
return sin (r - PI_DIV_2)
That cos
function shows how cheap it is to get cosines from the same table since they're really just a 90 degree phase shift from sines.
If speed is of extreme importance and memory is irrelevant, just use the full range of indexes so no calculations are required.
Another trick, if you're willing to sacrifice some accuracy, at the gain of using less memory, is to not store lookup values for all input values.
Given that the sine function is a smooth one (no discontinuities), you could store every second value and interpolate the values for those in between by simply averaging those on either side of it.
For example, if we had the function f(x) = x * x
, a table of the real and interpolated values follows (assume we only store the values for even x
, odd values of x
are interpolated, marked with *
below):
x real f(x) interpolated f(x)
-- --------- -----------------
0 0 0
1 1 2 *
2 4 4
3 9 10 *
4 16 16
5 25 26 *
6 36 36
7 49 50 *
8 64 64
9 91 82 *
10 100 100
Now, that's not a perfect example since the spreads between the f(x) values can be quite large but it works better for functions where the values are closer together.
For example, the real values of sin(0), sin(1) and sin(2) (degrees, not radians) are 0, 0.017452406 and 0.034899496 (and this is where the slope is the largest). The average of sin(0) and sin(2) is 0.017449748 which is an error of 0.0152% from the real value, one part in 6,500.
So, for a minimal error, you could halve your memory requirements to about 5% of 128K, or six and a half K.