tags:

views:

573

answers:

4

I want to profile my execution in a non-standard way. Using gprof, Valgrind, Oprofile... for a given function, I only get the mean of its execution time. What I would like is to obtain the standard deviation of this execution time.

Example:

void a()
  sleep ( rand() % 10 + 10 )

void b()
  sleep ( rand() % 14 + 2 )

main
  for (1 .. 100)
    a()
    b()

With standard tools, a and b functions will have similar behaviour. Do you know any tool which could give me this result, with an automatic approach.

I already tested with TAU, but until now, it is not really relevant. I think there is a solution in this way, but I am not enough confident with TAU. If anyone is Tau expert, I try to keep all the function execution times, and do the math at the end. But I don't know how to specify it in Tau.

I want to profile C / C++ code, but if you have any lead in other programing language, I'm open.

A: 

I think Apple's Shark profiling tool can generate the mean for each function. Of course, that only helps you on a Mac.

nsanders
A: 

A profiling tool is not magic, and you can roll your own for whatever purpose in a few lines.

Something like this, perhaps:

// code profile.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"

class cProfile
{
public:
    // construct profiler for a particular scope
    // call at begining of scope to be timed
    // pass unique name of scope
    cProfile( const char* name )
    {
     myName = string( name );
     QueryPerformanceCounter( (LARGE_INTEGER *)&myTimeStart );
    }
    // destructor - automatically called when scope ends
    ~cProfile();

    // constructor - produces report when called without parameters
    cProfile();


private:
    typedef accumulator_set<__int64, stats<tag::variance(lazy)> > acc_t;
    static map < string, acc_t > myMap;
    string myName;
    __int64 myTimeStart;
};
map < string, accumulator_set<__int64, stats<tag::variance(lazy)> > > cProfile::myMap;


cProfile::~cProfile()
{
    __int64 t=0;
    QueryPerformanceCounter( (LARGE_INTEGER *)&t );
    t -= myTimeStart;


    map < string, acc_t >::iterator p = myMap.find( myName );
    if( p == myMap.end() ) {
     // this is the first time this scope has run
     acc_t acc;
     pair<string,acc_t > pr(myName,acc);
     p = myMap.insert( pr ).first;
    }
    // add the time of running to the accumulator for this scope
    (p->second)( t );

}
// Generate profile report
cProfile::cProfile()
{
    __int64 f;
    QueryPerformanceFrequency( (LARGE_INTEGER *)&f );

    printf("%20s Calls\tMean (secs)\tStdDev\n","Scope");
    for( map < string, accumulator_set<__int64, stats<tag::variance(lazy)> > >::iterator p = myMap.begin();
     p != myMap.end(); p++ )
    {
     float av = mean(p->second) / f;
     float stdev = sqrt( ((double) variance(p->second))  ) / f;
     printf("%20s %d\t%f\t%f\n",p->first.c_str(),
      boost::accumulators::count(p->second), av, stdev);
    }
}
void a()
{
    cProfile profile("a"); 

    Sleep ( rand() % 10 + 10 );
}
void b()
{
    cProfile profile("b");

    Sleep ( rand() % 20 + 5 );
}


int _tmain(int argc, _TCHAR* argv[])
{
    for (int k=1;k<=100;k++) {
     a();
     b();
    }

    cProfile profile_report;

    return 0;
}

Which produces

       Scope Calls      Mean (secs)     StdDev
           a 100        0.014928        0.002827
           b 100        0.015254        0.005671
ravenspoint
A: 

Actually Oprofile can profile function from a call graph view, which means same callee routine with different caller will be profiled in different stats.

Try opreport command for the report.

arsane
+1  A: 

Maybe does not apply, as it's gcc-specific, but I had found this save me a couple of times at least. If you compile the code with "-finstrument-functions" flag, then every entry and exit point of a function in the module that was compiled with this flag, will be stubbed with the calls to instrument functions. All you have to do is to have an inline that would read some high-precision counter (e.g. rdtsc on x86, though see this discussion) and a large array of records: [ func_addr, is_enter, timer_value ] to which you would continually write in the instrument functions. Upon exit, dump this array to file and analyze offline.

Quite far from "automated" approach, that you probably were looking for - but hope this is of use. The sample below shows the behaviour of gcc if it is compiled with -finstrument-functions. If you do not include the flag, it will work "as normal".

#include <stdio.h>
#include <stdlib.h>

void __cyg_profile_func_enter(void *fn, void *call)
    __attribute__ ((no_instrument_function));
void __cyg_profile_func_exit(void *fn, void *call)
    __attribute__ ((no_instrument_function));

void __cyg_profile_func_enter(void *fn, void *call) {
  printf("Enter %x,%x\n",fn,call);
}
void __cyg_profile_func_exit(void *fn, void *call) {
  printf("Exit %x,%x\n",fn,call);
}

int foo(int i) {
  printf("inside foo\n");
}

int main(int argc, char *argv[]) {
  printf("inside main 1\n");
  foo(123);
  printf("inside main 2\n");
  exit(0);
}
Andrew Y