How can I check how long a process spends waiting for the CPU in a Linux box?

For example, in a loaded system I want to check how long a SQL*Loader (sqlldr) process waits.

It would be useful if there is a command line tool to do this.

+1  A: 

I've quickly slapped this together. It prints out the smallest and largest "interferences" from task switching...

#include <sys/time.h>
#include <stdio.h>

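/* Current wall-clock time in seconds, with microsecond resolution. */
double seconds(void)
{
    struct timeval t;
    gettimeofday(&t, NULL);
    return t.tv_sec + t.tv_usec / 1000000.0;
}

int main(void)
{
    double min = 999999999, max = 0;
    for (;;)
    {
        /* Take the two readings in a fixed order: the operands of
           seconds() - seconds() may be evaluated in either order. */
        double start = seconds();
        double c = seconds() - start;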
        if (c < min)
        {
            min = c;
            printf("%f\n", c);
            fflush(stdout);
        }
        if (c > max)
        {
            max = c;
            printf("%f\n", c);
            fflush(stdout);
        }
    }

    return 0;
}
Mike
What you're measuring is scheduling latency, not contention latency.
Michael Foukarakis
I missed the "I want to check how long a sqlldr process waits" part. My test shows the longest stretch during which it does not get the CPU. ps -elF shows the total CPU time spent. I'm wondering if something more accurate is in /proc/<procid>/. One could take a snapshot, run a long query, and check it again. The time not spent on the CPU is then the elapsed time minus the CPU time reported, something like that.
Mike
+1  A: 

Here's how you should go about measuring it. Have a number of processes, greater than processors * cores * hardware threads per core, wait (block) on an event that wakes them all up at the same time. One such event is a multicast network packet. Use an instrumentation library like PAPI (or one better suited to your needs) to measure the difference between real and virtual "wakeup" time in each process. Over several iterations of the experiment you can get an estimate of the CPU contention time for your processes. Obviously, it's not going to be at all accurate for multicore processors, but maybe it'll help you.

Cheers.

Michael Foukarakis
+1  A: 

I had this problem some time back and ended up using getrusage. Detailed help is at: http://www.opengroup.org/onlinepubs/009695399/functions/getrusage.html

getrusage populates the rusage struct.


Measuring Wait Time with getrusage

You can call getrusage at the beginning of your code and call it again at the end, or at some appropriate point during execution. You then have initial_rusage and final_rusage. The user time spent by your process is in ru_utime and the system time is in ru_stime. Each is a struct timeval, so combine tv_sec with tv_usec (microseconds) instead of reading tv_sec alone: seconds(tv) = tv.tv_sec + tv.tv_usec / 1000000.0

Thus the total user time spent by the process will be: user_time = seconds(final_rusage.ru_utime) - seconds(initial_rusage.ru_utime)

The total system time spent by the process will be: system_time = seconds(final_rusage.ru_stime) - seconds(initial_rusage.ru_stime)

If total_time is the wall-clock time elapsed between the two calls of getrusage, then the wait time will be wait_time = total_time - (user_time + system_time). Keep in mind that this covers all the time the process was not running, including time blocked on I/O, not only time spent waiting for the CPU.

Hope this helps

Kisalay