I'm currently exploring interfacing C and R (using R's .Call for speed). I've only used C for trivial integer computation and text processing applications, and I've never had to worry about issues with float variables, underflow, etc. What tests can I write for functions to ensure numerical accuracy?
For a great overview of possible edge cases, see this Wikipedia article.
Then for unit tests see The pitfalls of verifying floating-point computations.
I guess:
- The accuracy will depend on the arguments - so you might test some values, only to discover later that there are other values where there is loss of precision.
- You can test values that differ by extremely small increments and see if the algorithm maintains precision.
- Use doubles rather than floats.
- Finally, you can run the calculation twice (perhaps perturbing the arguments by the smallest representable delta for the second run), taking the worst-case choice for each operation, and see what range of answers you get.
To do such tests correctly, you will have to do error analysis of all the floating-point computations in your code. Canonical examples of why this is important range from simple things such as finding roots of a quadratic equation to solving crazy recurrence relations (search for "Muller's recurrence" for example).
But doing this error analysis is usually non-trivial and time-consuming, and often impossible. In such cases, the next best option is to have a large set of inputs with known-correct outputs, and use those in your tests. But this is mostly a chicken-and-egg problem: how do you find the correct values for something even slightly complicated, save by actually calculating them on some sort of computer?
So, practical solutions include:
- Interval arithmetic, where each value is tracked as a pair of bounds, giving you an estimate of the interval in which the true result of your floating-point calculation lies. If the interval of the result is too wide, you know there are some issues. There are many implementations for doing such analysis; see this list for details. For large calculations this method is going to be slow, but you can run it on a representative sample taken from your actual data to get an idea.
- If your compiler supports it, you can run the same code with different floating-point rounding modes. C99 has standardized this, and gcc supports it, so you might be in luck. Look at fesetround() and fegetround(), for example. The idea is that if your floating-point computations are stable and accurate, different rounding modes should result in similar output. If different rounding modes result in numbers that are too different, you know that there is something wrong. This method, unfortunately, can't tell you what is wrong, but is a first step. Also, this method should be fast, other than the fact that you're going to run your code four times for one input set.