Hi. I wish to make a cosine table at compile time. Is there a way to do this without hard coding anything?
Why not hardcode it? I am not aware of any changes in the result of the cosine function that they are planning, well not for another 100 years or so.
You could generate it with any scripting language you like and include the result. Use make to rerun the script any time you change the source. It's hard-coded as far as C is concerned, but not as far as you are.
With C++, you can use template metaprogramming to generate your lookup table at compile time.
Now, here is a standard C trick that may or may not accomplish what you want.
- Write a program (say, cosgen) that generates the cosine table C statement (i.e., the code that you desire).
- Run cosgen and dump the output (C code) to a file, say cos_table.c
- In your main program, use a #include "cos_table.c" to insert the table where you want.
With the magic of computers, the apparently impossible becomes possible:
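#include <stdio.h>
#include <math.h>
#ifndef M_PI                       /* M_PI is POSIX, not ISO C */
#define M_PI 3.14159265358979323846
#endif
#define MAX_ANGLE 90
/* One entry per whole degree, 0 through MAX_ANGLE inclusive. */
double kinopiko_krazy_kosines[MAX_ANGLE + 1];
int main (void)
{
    int i;
    for (i = 0; i <= MAX_ANGLE; i++) {
        /* Convert degrees to radians. */
        double angle = (M_PI * i) / 180.0;
        kinopiko_krazy_kosines[i] = cos (angle);
        printf ("#define cos_%d %f\n", i, kinopiko_krazy_kosines[i]);
    }
    return 0;
}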
I'd create a hard-coded lookup table - once with a scripting language - but I'm not sure it'll be faster than just using the standard math library.
I guess it depends on the size of the table, but I would suspect getting the FPU to do the calculation might be faster than accessing memory. So once you've got your table solution, I'd benchmark it to see if it's faster than the standard function.
I am not convinced that precalculating a sine table would result in a performance improvement. I suggest:
- Benchmark your application calling cos() to decide whether it's fast enough. If it is, stop here.
- If it really is too slow, consider compiling with GCC's -ffast-math option, if the relaxed floating-point accuracy is acceptable for your usage.
Lookup tables, particularly large ones, increase the amount of data your program needs to keep in the CPU cache, which reduces the cache hit rate. This in turn will slow other parts of your application down.
I am assuming you're doing this in an incredibly tight loop, as that's the only case it could possibly matter in anyway.
If you actually DID discover that using a lookup table was beneficial, why not just precalculate it at runtime? It's going to have hardly any impact on startup time (unless it's a huuuuuge one). It may actually be faster to do it at runtime, because your CPU may be able to do sines faster than your disc can load floats in.
Since you're targeting Cell, you're probably targeting the SPEs? They do have proper FP support, vectorised in fact, but do not have large working memories. For that reason it's in fact a BAD IDEA to use tables - you're sacrificing a very limited resource.
Wave tables are the way to go. You can hard code it as suggested, or run it during application start up.