Right now I am loading a file, then using gettimeofday and tracking the CPU time with tv_usec.

My results vary: I usually get values in the 250s to 280s, but sometimes in the 300s or 500s. I tried usleep, sleep(0), and sleep(1), with no success; the times still vary widely. I thought sleep(1) (seconds on Linux, not the Windows Sleep in ms) would have solved it. How can I keep track of time in a more consistent way for testing? Maybe I should wait until I have much larger test data and more complex code before starting measurements?

+2  A: 

Are you trying to measure how long it takes to load a file? Usually if you're performance testing some bit of code that is already pretty fast (sub-second), then you will want to repeat the same code a number of times (say a thousand or a million), time the whole lot, then divide the total time by the number of iterations.
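A minimal sketch of that repeat-and-divide approach, using gettimeofday since that is what the question already uses. The names work_under_test, elapsed_usec, and mean_usec_per_call are hypothetical, and the workload is only a stand-in for the real file-loading code. Note that the elapsed time is computed from both tv_sec and tv_usec, because tv_usec on its own wraps around every second.

```c
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical stand-in for the code under test; replace with the real work. */
static void work_under_test(void)
{
    volatile unsigned long sink = 0;
    for (unsigned long i = 0; i < 1000; i++)
        sink += i;
}

/* Elapsed microseconds between two gettimeofday() samples.
   Uses tv_sec as well as tv_usec, since tv_usec alone wraps every second. */
static double elapsed_usec(const struct timeval *start, const struct timeval *end)
{
    return (end->tv_sec - start->tv_sec) * 1e6
         + (end->tv_usec - start->tv_usec);
}

/* Run fn() `iterations` times and return the mean time per call in microseconds. */
static double mean_usec_per_call(void (*fn)(void), long iterations)
{
    struct timeval start, end;

    gettimeofday(&start, NULL);
    for (long i = 0; i < iterations; i++)
        fn();
    gettimeofday(&end, NULL);

    return elapsed_usec(&start, &end) / (double)iterations;
}
```

Calling mean_usec_per_call(work_under_test, 100000) averages out much of the run-to-run noise, although the first iterations still pay for cold caches.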

Having said that, I'm not quite sure what you're using sleep() for. Can you post an example of what you intend to do?

Greg Hewgill
+1  A: 

I would recommend putting that code in a for loop. Run it over 1000 or 10000 iterations. There are problems with this if you're timing only a few instructions, but it should help.

Larger data sets also help of course.

sleep is going to deschedule your thread from the CPU. It does not keep time with any precision.

sharth
+3  A: 

The currently recommended interface for high-resolution time on Linux (and POSIX in general) is clock_gettime. See the man page.


#include <time.h>
struct timespec tp;
clock_gettime(CLOCK_REALTIME, &tp);            // wall-clock time
clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &tp);  // CPU time consumed by this process

But read the man page. Note that with glibc versions before 2.17 you need to link with -lrt, because POSIX says so, I guess. Maybe that was to avoid symbol conflicts in -lc for old programs that defined their own clock_gettime? But dynamic libs use weak symbols...
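Here is a rough sketch of timing one call on either clock. The helper names (timespec_diff_sec, time_on_clock, busy_work) are made up for illustration, and busy_work is just a placeholder workload, not the file-loading code from the question.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose clock_gettime under strict -std= modes */
#endif
#include <time.h>

/* Difference between two timespec samples, in seconds. */
static double timespec_diff_sec(const struct timespec *start,
                                const struct timespec *end)
{
    return (double)(end->tv_sec - start->tv_sec)
         + (double)(end->tv_nsec - start->tv_nsec) / 1e9;
}

/* Time one call of fn() on the given clock.
   Returns elapsed seconds, or -1.0 if clock_gettime fails. */
static double time_on_clock(clockid_t clock, void (*fn)(void))
{
    struct timespec start, end;

    if (clock_gettime(clock, &start) != 0)
        return -1.0;
    fn();
    if (clock_gettime(clock, &end) != 0)
        return -1.0;

    return timespec_diff_sec(&start, &end);
}

/* Hypothetical workload: burns some CPU so both clocks advance. */
static void busy_work(void)
{
    volatile unsigned long sink = 0;
    for (unsigned long i = 0; i < 1000000UL; i++)
        sink += i;
}
```

Comparing time_on_clock(CLOCK_REALTIME, busy_work) with time_on_clock(CLOCK_PROCESS_CPUTIME_ID, busy_work) is a quick sanity check: if the wall-clock figure is much larger than the CPU figure, the process spent that time waiting (on I/O or descheduled) rather than computing.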

The best sleep function is nanosleep. It doesn't mess around with signals or any crap like usleep. It is defined to just sleep, and not have any other side effects. And it tells you if you woke up early (e.g. from signals), so you don't necessarily have to call another time function.
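For instance, a small wrapper can use the remaining-time output to resume after an early wakeup. This is only a sketch, and sleep_fully is a hypothetical name, not a library function.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose nanosleep under strict -std= modes */
#endif
#include <errno.h>
#include <time.h>

/* Sleep for the full requested interval, resuming after signal interruptions.
   Returns 0 on success, -1 on any error other than EINTR (errno is left set). */
static int sleep_fully(struct timespec request)
{
    struct timespec remaining;

    while (nanosleep(&request, &remaining) == -1) {
        if (errno != EINTR)
            return -1;
        request = remaining;  /* woke early: sleep only for what is left */
    }
    return 0;
}
```

Something like sleep_fully((struct timespec){0, 1000000}) would sleep for at least one millisecond even if a signal handler runs in the middle.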

Anyway, you're going to have a hard time testing one rep of something that short that involves a system call. There's a huge amount of opportunity for variation. For example, the scheduler may decide that some other work needs doing (unlikely if your process just started; you won't have used up your timeslice yet). CPU cache misses (L2 and TLB) are also easily possible.

If you have a multi-core machine and a single-threaded benchmark for the code you're optimizing, you can give it realtime priority pinned to one of your cores. Make sure you choose the core that isn't handling interrupts, or your keyboard (and everything else) will be locked out until it's done. Use taskset (for pinning to one CPU) and chrt (for setting realtime prio). See this mail I sent to gmp-devel with this trick: http://gmplib.org/list-archives/gmp-devel/2008-March/000789.html

Oh yeah, for the most precise timing, you can use rdtsc yourself (on x86/amd64). If you don't have any other syscalls in what you're benching, it's not a bad idea. Grab a benchmarking framework to put your function into. GMP has a pretty decent one. It's maybe not set up well for benchmarking functions that aren't in GMP and called mpn_whatever, though. I don't remember, but it's worth a look.

Peter Cordes