I want to compare several implementations of essentially the same algorithm, written in Java, C++ and Python (the latter run under PyPy, Jython and CPython), on a MacBook Pro running Mac OS X 10.6.4 with a conventional (non-SSD) hard disk.
It's a "decode a stream of data from a file" type of algorithm, where the relevant measurement is total execution time. I want to prevent bias from, e.g., OS and HDD caches, other programs running simultaneously, or a sample file that is too large or too small. What do I need to pay attention to in order to create a fair comparison?
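For concreteness, this is roughly the kind of timing harness I have in mind. It is only a minimal sketch: the names `time_command` and `summarize` are placeholders of my own, the command being timed is a stand-in for the real decoder invocation, and the warm-up run is just one assumed way of putting every implementation on the same (warm) cache footing.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=5, warmup=1):
    """Run cmd (a list of argv strings) repeatedly.

    Returns the wall-clock time in seconds of each measured run.
    """
    # Warm-up runs are discarded, so the first measured run does not
    # pay for reading the sample file from a cold disk cache.
    for _ in range(warmup):
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)

    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Report min, median and max so that outliers stay visible."""
    return {
        "min": min(timings),
        "median": statistics.median(timings),
        "max": max(timings),
    }


if __name__ == "__main__":
    # Placeholder command; the real one would start the Java, C++,
    # PyPy, Jython or CPython decoder on the sample file.
    cmd = [sys.executable, "-c", "pass"]
    print(summarize(time_command(cmd)))
```

Timing the whole process from the outside includes interpreter or JVM start-up, which seems to be what "total execution time" implies here; whether that is the right thing to measure is part of what I am asking.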