Hi, I'm convinced that software testing is indeed very important, especially in science. However, over the last 6 years I have never come across a scientific software project that was under regular testing (OK, most of them were not even version controlled). Now I'm wondering how you deal with software tests for scientific codes (numerical computations). From my point of view, standard unit tests often miss the point, since there is no exact result: using assert(a == b) might prove a bit difficult due to "normal" numerical rounding errors. So I'm looking forward to reading your thoughts about this.
Please take a look at the answers to the SO question How to use TDD correctly to implement a numerical method?
I am also in academia, and I have written quantum mechanical simulation programs to be executed on our cluster. I made the same observation regarding testing, or even version control. It was even worse: in my case I am using a C++ library for my simulations, and the code I got from others was pure spaghetti code, with no inheritance, not even functions.
I rewrote it, and I also implemented some unit testing. You are correct that you have to deal with numerical precision, which can differ depending on the architecture you are running on. Nevertheless, unit testing is possible as long as you take these numerical rounding errors into account. Your result should not depend on the rounding of the numerical values; otherwise, you would have a different problem with the robustness of your algorithm.
So to conclude, I use unit testing for my scientific programs, and it really makes one more confident about the results, especially with regard to publishing the data in the end.
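To make that concrete, here is a minimal sketch of a rounding-tolerant assertion; the helper name and the tolerance values are purely illustrative, not from any particular framework:

    #include <algorithm>
    #include <cassert>
    #include <cmath>

    // Compare two doubles up to a combined absolute/relative tolerance,
    // so the test passes despite architecture-dependent rounding.
    bool almost_equal(double a, double b,
                      double abs_tol = 1e-12, double rel_tol = 1e-9) {
        return std::abs(a - b) <=
               abs_tol + rel_tol * std::max(std::abs(a), std::abs(b));
    }

    int main() {
        // 0.1 + 0.2 != 0.3 exactly in binary floating point,
        // but the two agree to within rounding error.
        assert(almost_equal(0.1 + 0.2, 0.3));
        assert(!almost_equal(1.0, 1.1));
        return 0;
    }

The point of the combined tolerance is that an absolute tolerance alone breaks down for very large values, and a relative one alone breaks down near zero.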
I'm also using cpptest for its TEST_ASSERT_DELTA. I'm writing high-performance numerical programs in computational electromagnetics, and I've been happily using it in my C++ programs.
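For reference, a small CppTest suite built around TEST_ASSERT_DELTA might look roughly like this; the suite and test names are made up for illustration:

    #include <cmath>
    #include <cpptest.h>

    // Hypothetical suite: check a numerical identity up to an explicit tolerance.
    class NumericsTestSuite : public Test::Suite {
    public:
        NumericsTestSuite() { TEST_ADD(NumericsTestSuite::test_sqrt_roundtrip); }

    private:
        void test_sqrt_roundtrip() {
            double x = 2.0;
            // Passes if the two results differ by no more than 1e-12.
            TEST_ASSERT_DELTA(std::sqrt(x) * std::sqrt(x), x, 1e-12);
        }
    };

    int main() {
        NumericsTestSuite suite;
        Test::TextOutput output(Test::TextOutput::Verbose);
        return suite.run(output) ? 0 : 1;
    }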
I typically go about testing scientific code the same way as I do with any other kind of code, with only a few retouches, namely:
I always test my numerical codes for cases that make no physical sense and make sure the computation actually stops before producing a result. I learned this the hard way: I had a function that was computing some frequency responses, then supplied a matrix built from them as an argument to another function, which eventually gave its answer as a single vector. The matrix could have been any size depending on how many terminals the signal was applied to, but my function was not checking whether the matrix size was consistent with the number of terminals (2 terminals should have meant a 2 x 2 x n matrix); however, the code itself was wrapped so as not to depend on that: it didn't care what size the matrices were, since it just had to do some basic matrix operations on them. Eventually, the results were perfectly plausible, well within the expected range and, in fact, partially correct: only half of the solution vector was garbled. It took me a while to figure out.

If your data looks correct, it's assembled in a valid data structure, and the numerical values are good (e.g. no NaNs or negative numbers of particles), but it doesn't make physical sense, the function has to fail gracefully.
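A sketch of the kind of consistency check that would have caught this; the types, names, and shapes here are hypothetical, not the original code:

    #include <cstddef>
    #include <stdexcept>
    #include <vector>

    // Hypothetical storage: terminals x terminals x n_freq frequency responses.
    using ResponseMatrix = std::vector<std::vector<std::vector<double>>>;

    std::vector<double> solve_responses(const ResponseMatrix& m,
                                        std::size_t n_terminals) {
        // Fail loudly before computing anything: for n terminals we expect
        // an n x n x k matrix, even if the math below would run regardless.
        if (m.size() != n_terminals)
            throw std::invalid_argument("matrix rows do not match terminal count");
        for (const auto& row : m)
            if (row.size() != n_terminals)
                throw std::invalid_argument("matrix columns do not match terminal count");
        // ... the actual computation would go here ...
        return std::vector<double>(n_terminals, 0.0);
    }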
I always test the I/O routines, even if they are just reading a bunch of comma-separated numbers from a test file. When you're writing code that does twisted math, it's always tempting to jump into debugging the part of the code that is so math-heavy that you need a caffeine jolt just to understand the symbols. Days later, you realize you are also adding the ASCII value of \n to your list of points. (A sketch of such a parsing test follows below.)

When testing for a mathematical relation, I always test it "by the book", and I also learned this by example. I've seen code that was supposed to compare two vectors but only checked for equality of elements and did not check for equality of length. (See the second sketch below.)
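The parsing test mentioned in the I/O point could be as small as this; the parser itself is a made-up minimal example:

    #include <cassert>
    #include <sstream>
    #include <string>
    #include <vector>

    // Made-up minimal parser: one value per comma-delimited token.
    std::vector<double> parse_csv_line(const std::string& line) {
        std::vector<double> values;
        std::stringstream ss(line);
        std::string token;
        while (std::getline(ss, token, ','))
            values.push_back(std::stod(token));
        return values;
    }

    int main() {
        // The trailing newline must not turn into an extra or corrupted point.
        std::vector<double> v = parse_csv_line("1.5,2.5,3.5\n");
        assert(v.size() == 3);
        assert(v[2] == 3.5);
        return 0;
    }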
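And the "by the book" vector comparison, as a sketch: check the lengths first, then every element within a tolerance.

    #include <cmath>
    #include <cstddef>
    #include <vector>

    // Compare two vectors "by the book": lengths first, then every element
    // within a tolerance. Comparing elements alone can miss a length mismatch.
    bool vectors_equal(const std::vector<double>& a,
                       const std::vector<double>& b,
                       double tol = 1e-12) {
        if (a.size() != b.size())
            return false;
        for (std::size_t i = 0; i < a.size(); ++i)
            if (std::abs(a[i] - b[i]) > tol)
                return false;
        return true;
    }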