Use the best counter available on your platform, and fall back to time() for portability.
I am using QueryPerformanceCounter, but see the comments in the other reply.
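For reference, timing an inner loop with QueryPerformanceCounter looks roughly like this (a sketch; DoWork and the function name are placeholders, and it is Windows-only — elsewhere you would substitute clock_gettime() or, worst case, time()):

```
#include <windows.h>
#include <cstdio>

// Time `innerLoops` calls of DoWork() with the high-resolution counter.
void TimeInnerLoop(void (*DoWork)(), int innerLoops)
{
    LARGE_INTEGER freq, start, stop;
    QueryPerformanceFrequency(&freq);      // counter ticks per second

    QueryPerformanceCounter(&start);
    for (int i = 0; i < innerLoops; ++i)
        DoWork();
    QueryPerformanceCounter(&stop);

    double ms = 1000.0 * (stop.QuadPart - start.QuadPart) / freq.QuadPart;
    std::printf("%d iterations: %.3f ms total, %.6f ms each\n",
                innerLoops, ms, ms / innerLoops);
}
```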
General advice:
The inner loop should run for at least about 20 times the resolution of your clock, to keep the resolution error below 5%. (So when using time(), your inner loop should run for at least 20 seconds.)
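One way to apply that rule with a high-resolution counter is to grow the inner-loop count until a single measurement spans at least 20 ticks of the counter's effective resolution. A sketch, assuming the helper names are my own:

```
#include <windows.h>

// Smallest observable step of QueryPerformanceCounter, in ticks.
LONGLONG EffectiveResolutionTicks()
{
    LARGE_INTEGER a, b;
    QueryPerformanceCounter(&a);
    do {
        QueryPerformanceCounter(&b);
    } while (b.QuadPart == a.QuadPart);   // wait for the counter to advance
    return b.QuadPart - a.QuadPart;
}

// Double the iteration count until one measurement is >= 20x the resolution,
// which keeps the quantization error below 5%.
template <typename Fn>
long long CalibrateInnerLoop(Fn workload)
{
    const LONGLONG minTicks = 20 * EffectiveResolutionTicks();
    long long iterations = 1;
    for (;;)
    {
        LARGE_INTEGER start, stop;
        QueryPerformanceCounter(&start);
        for (long long i = 0; i < iterations; ++i)
            workload();
        QueryPerformanceCounter(&stop);

        if (stop.QuadPart - start.QuadPart >= minTicks)
            return iterations;            // long enough to trust the reading
        iterations *= 2;                  // too short -- double and retry
    }
}
```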
Repeat these measurements to see whether they are consistent.
I use an additional outer loop, running ten times and ignoring the fastest and the slowest measurement when calculating average and deviation. Deviation comes in handy when comparing two implementations: if you have one algorithm taking 2.0 +/- 0.5 ms and the other 2.2 +/- 0.5 ms, the difference is not significant enough to call one of them "faster".
(Max and min should still be displayed.) So IMHO a valid performance measurement should look something like this:
10000 x 2.0 +/- 0.2 ms (min = 1.2, max = 12.6), 10 repetitions
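Something along these lines produces that kind of output (a sketch; `Benchmark` and its parameters are made up for illustration, not from any library):

```
#include <windows.h>
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

// Run the timed inner loop `repetitions` times, drop the fastest and slowest
// run, and report average +/- deviation plus the overall min and max.
template <typename Fn>
void Benchmark(Fn timedRun, int innerLoops, int repetitions = 10)
{
    LARGE_INTEGER freq;
    QueryPerformanceFrequency(&freq);

    std::vector<double> ms;
    for (int r = 0; r < repetitions; ++r)
    {
        LARGE_INTEGER start, stop;
        QueryPerformanceCounter(&start);
        for (int i = 0; i < innerLoops; ++i)
            timedRun();
        QueryPerformanceCounter(&stop);
        ms.push_back(1000.0 * (stop.QuadPart - start.QuadPart)
                            / freq.QuadPart / innerLoops);
    }

    std::sort(ms.begin(), ms.end());
    double minMs = ms.front(), maxMs = ms.back();

    // Average and deviation over the middle runs only (fastest/slowest dropped).
    double sum = 0.0, sumSq = 0.0;
    int n = 0;
    for (size_t i = 1; i + 1 < ms.size(); ++i, ++n)
    {
        sum   += ms[i];
        sumSq += ms[i] * ms[i];
    }
    double mean = sum / n;
    double dev  = std::sqrt(std::max(0.0, sumSq / n - mean * mean));

    std::printf("%d x %.1f +/- %.1f ms (min = %.1f, max = %.1f), %d repetitions\n",
                innerLoops, mean, dev, minMs, maxMs, repetitions);
}
```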
If you know what you are doing, purging the cache and setting thread affinity can make your measurements much more robust.
However, this is not without pitfalls. The more "stable" the measurement is, the less realistic it is as well. Any implementation will vary strongly over time, depending on the state of the data and instruction caches. I'm lazy here, using the max= value to judge the first-run penalty; this might not be sufficient for some scenarios.
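For the thread affinity and cache purging mentioned above, a minimal Windows-only sketch (the function name and the 8 MB scratch size are my own choices — pick something larger than your CPU's last-level cache):

```
#include <windows.h>
#include <vector>

void PrepareForMeasurement()
{
    // Pin the benchmark to one core so it is not migrated mid-measurement.
    SetThreadAffinityMask(GetCurrentThread(), 1);

    // Write through a large buffer to evict earlier data from the caches,
    // so every repetition starts from a comparable state. The volatile
    // pointer keeps the compiler from optimizing the writes away.
    std::vector<char> scratch(8 * 1024 * 1024);
    volatile char* p = scratch.data();
    for (size_t i = 0; i < scratch.size(); i += 64)   // one write per cache line
        p[i] = static_cast<char>(i);
}
```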