We have a qthreads-based workflow engine in which worker threads pick up bundles of input as they are placed on a queue, then place their output on another queue for other worker threads to run the next stage, and so on until all the input has been consumed and all the output has been generated.

Typically, several threads run the same task while others run different tasks at the same time. We want to benchmark these threaded tasks in order to target our optimization efforts.

It's easy to get the real (elapsed) time that a given thread, running a given task, has taken: we just take the difference between the return values of the POSIX times() function at the start and end of the thread's run() procedure. However, I cannot figure out how to get the corresponding user and system times. Reading them from the struct tms that you pass to times() doesn't work, because that structure reports process-wide totals: the user and system time consumed by all threads that ran while the thread in question was active.
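For reference, the elapsed-time measurement described above can be sketched roughly as follows. The helper names `ticks_to_seconds` and `time_elapsed` are hypothetical, not part of our engine, and error handling (times() returning (clock_t)-1) is omitted for brevity.

```c
#include <assert.h>
#include <sys/times.h>
#include <unistd.h>

/* Hypothetical helper: convert the difference between two times()
 * return values (in clock ticks) into seconds. */
double ticks_to_seconds(clock_t start, clock_t end, long ticks_per_sec)
{
    assert(ticks_per_sec > 0);
    return (double)(end - start) / (double)ticks_per_sec;
}

/* Hypothetical helper: elapsed (real) seconds spent inside work(). */
double time_elapsed(void (*work)(void))
{
    struct tms totals;  /* tms_utime/tms_stime here are process-wide totals */
    const long tps = sysconf(_SC_CLK_TCK);  /* clock ticks per second */
    const clock_t start = times(&totals);
    work();
    const clock_t end = times(&totals);
    return ticks_to_seconds(start, end, tps);
}
```

The return value of times() gives usable elapsed time, but the tms_utime and tms_stime fields filled in along the way cover the whole process, which is exactly the problem.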

+1  A: 

Assuming this is on Linux, how about getrusage() with RUSAGE_THREAD (available since Linux 2.6.26, with _GNU_SOURCE defined)? Solaris also offers RUSAGE_LWP, which is similar, and I guess there are probably equivalents for other POSIX-like systems.

Crude example:

#define _GNU_SOURCE   /* needed for RUSAGE_THREAD on Linux/glibc */
#include <sys/time.h>
#include <sys/resource.h>
#include <stdio.h>
#include <time.h>     /* time() */
#include <pthread.h>
#include <assert.h>
#include <unistd.h>

struct tinfo {
  pthread_t thread;     
  int id;
  struct rusage start;
  struct rusage end;
};

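static void *
thread_start(void *arg)
{
  struct tinfo *inf = arg;
  /* Snapshot this thread's usage at the start of its work. */
  getrusage(RUSAGE_THREAD, &inf->start);
  if (inf->id) {
     sleep(10);  /* Idle thread: uses almost no CPU time. */
  }
  else {
     const time_t start = time(NULL);
     while (time(NULL) - start < 10)
        ;  /* Busy-wait to waste CPU time! */
  }
  /* Snapshot again; end - start is the usage of this thread only. */
  getrusage(RUSAGE_THREAD, &inf->end);
  return NULL;
}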

int main(void) {
  enum { nrthr = 2 };  /* compile-time constant, so status is not a VLA */
  struct tinfo status[nrthr];
  for (int i = 0; i < nrthr; ++i) {
     status[i].id = i;
     const int s = pthread_create(&status[i].thread,
                                  NULL, &thread_start,
                                  &status[i]);
     assert(!s);
  }

  for (int i = 0; i < nrthr; ++i) {
     const int s = pthread_join(status[i].thread, NULL);
     assert(!s);
     // Whole seconds only; tv_usec holds the sub-second part
     printf("Thread %d done: %ld (s) user, %ld (s) system\n", status[i].id,
            (long)(status[i].end.ru_utime.tv_sec - status[i].start.ru_utime.tv_sec),
            (long)(status[i].end.ru_stime.tv_sec - status[i].start.ru_stime.tv_sec));
  }
  return 0;
}

I think something similar is possible on Windows using GetThreadTimes(); GetProcessTimes() reports totals for the whole process rather than for a single thread.

awoodland
A: 

This looks good. I won't get to try it right away, but will respond when I do. (Re. getrusage, that is.)

Peter Shenkin
Welcome to Stack Overflow. You should use comments to reply to answers, because answers are ordered by votes.
ephemient