Hi,
< backgound>
I'm at a point where I really need to optimize C++ code. I'm writing a library for molecular simulations and I need to add a new feature. I already tried to add this feature in the past, but I then used virtual functions called in nested loops. I had bad feelings about that and the first implementation proved that this was a bad idea. However this was OK for testing the concept.
< /background>
Now I need this feature to be as fast as possible (well without assembly code or GPU calculation, this still has to be C++ and more readable than less). Now I know a little bit more about templates and class policies (from Alexandrescu's excellent book) and I think that a compile-time code generation may be the solution.
However I need to test the design before doing the huge work of implementing it into the library. The question is about the best way to test the efficiency of this new feature.
Obviously I need to turn optimizations on because without this g++ (and probably other compilers as well) would keep some unnecessary operations in the object code. I also need to make a heavy use of the new feature in the benchmark because a delta of 1e-3 second can make the difference between a good and a bad design (this feature will be called million times in the real program).
The problem is that g++ is sometimes "too smart" while optimizing and can remove a whole loop if it consider that the result of a calculation is never used. I've already seen that once when looking at the output assembly code.
If I add some printing to stdout, the compiler will then be forced to do the calculation in the loop but I will probably mostly benchmark the iostream implementation.
So how can I do a correct benchmark of a little feature extracted from a library ? Related question: is it a correct approach to do this kind of in vitro tests on a small unit or do I need the whole context ?
Thanks for advices !
There seem to be several strategies, from compiler-specific options allowing fine tuning to more general solutions that should work with every compiler like volatile
or extern
.
I think I will try all of these. Thanks a lot for all your answers!