Hi,

I'm using the Linux /proc//stat file to generate CPU usage information for an application. The issue I have run into is that on Fedora 13 things behave strangely, while on Ubuntu 10.04 they behave as I expect.

Specifically:
On Fedora, the application logs more process system time than user time, by a ratio of 3:1.
On Ubuntu, the application logs more process user time than system time, by a ratio of 4:1.
On Fedora, the process user time value stops incrementing after a short while and never resumes.

This seems very strange to me and the fact that user time stops incrementing at all seems like an outright bug.

I have also tried reading the values in a couple of different ways, all with the same result, and I have run a test to confirm that the user and system times aren't transposed.

Can anyone shed some light on what might be happening? Is there any valid reason a process's user time would stop incrementing?

A: 

The user time not incrementing at all does sound like a bug. If you can create a minimal example that demonstrates the problem, I would submit it to the Fedora bug tracker.

(Are you doing a lot of work in signal handlers, by any chance?)

caf
Negative on the signal handlers. I'll have a look at making a minimal example, but my guess is that I won't be able to break it.
radman
A: 

Assuming you mean /proc/[pid]/stat, a process can accumulate no user time if it is spending all its time in syscalls or blocked on a wait channel (wchan), usually for disk, network, or other I/O.

The level of detail of process accounting is controlled by a number of configuration variables in the Linux 2.6.x (and presumably other) kernels.
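For reference, here is a minimal sketch of reading the two counters in question, in case it helps rule out a parsing problem. It assumes the standard layout of /proc/[pid]/stat, where utime is field 14 and stime is field 15, both in clock ticks. The function names are my own, not from any library.

```python
import os


def parse_cpu_times(stat_text):
    """Return (utime, stime) in clock ticks from the text of a stat file."""
    # Field 2 (comm) is wrapped in parentheses and may itself contain spaces
    # or ')', so split after the LAST ')' instead of splitting on whitespace.
    rest = stat_text[stat_text.rindex(")") + 1:].split()
    # rest[0] is field 3 (state), so field 14 is rest[11] and field 15 is rest[12].
    return int(rest[11]), int(rest[12])


def read_cpu_seconds(pid="self"):
    """Return (user_seconds, system_seconds) for a pid (hypothetical helper)."""
    with open(f"/proc/{pid}/stat") as f:
        utime, stime = parse_cpu_times(f.read())
    ticks_per_second = os.sysconf("SC_CLK_TCK")
    return utime / ticks_per_second, stime / ticks_per_second


if __name__ == "__main__":
    user, system = read_cpu_seconds()
    print(f"user={user:.2f}s system={system:.2f}s")
```

Splitting the whole line on whitespace is a common source of shifted fields when the process name contains a space, which is why the sketch splits after the closing parenthesis first.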

msw
The application has 20 running threads and continues to produce good output throughout. It is a large application, and there is no way that it is sitting in system calls all the time...
radman
Your intuition about what is "no way" happening and your data seem to conflict. The data are usually more correct than your intuition. For example, deadlocked threads can keep a process from accumulating user space cycles.
msw
It's extraordinarily unlikely that it would be accumulating *zero* user time - even long-executing syscalls will return to userspace eventually (and of course processes blocked in the kernel won't accumulate either sort of time).
caf
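With 20 threads in play, one way to narrow this down might be to look at each thread separately. A rough sketch, assuming the per-thread files under /proc/[pid]/task/[tid]/stat use the same field layout as the process-wide file. The function name and the proc_root parameter are hypothetical; proc_root is there only so the function can be pointed at a test directory.

```python
import os


def thread_cpu_ticks(pid="self", proc_root="/proc"):
    """Map thread id -> (utime, stime) in clock ticks for one process."""
    times = {}
    task_dir = os.path.join(proc_root, str(pid), "task")
    for tid in os.listdir(task_dir):
        try:
            with open(os.path.join(task_dir, tid, "stat")) as f:
                text = f.read()
        except OSError:
            continue  # the thread exited between listdir() and open()
        # Split after the last ')' so a thread name with spaces cannot shift fields.
        fields = text[text.rindex(")") + 1:].split()
        # fields[0] is field 3 (state); utime is field 14, stime is field 15.
        times[int(tid)] = (int(fields[11]), int(fields[12]))
    return times


if __name__ == "__main__":
    for tid, (utime, stime) in sorted(thread_cpu_ticks().items()):
        print(tid, utime, stime)
```

Sampling this twice a few seconds apart and comparing the two results would show whether user time has stalled for every thread or only for some of them.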