views: 2131
answers: 14

Hi!

I have the following problem: some processes, generated dynamically, have a tendency to eat 100% of the CPU. I would like to limit all processes matching some criterion (e.g. process name) to a certain percentage of CPU.

The specific problem I'm trying to solve is throttling folding@home worker processes. The best solution I could think of is a Perl script that's executed periodically and uses the cpulimit utility to limit the processes (if you're interested in more details, check this blog post). It works, but it's a hack :/
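Roughly, the periodic job boils down to something like this (FahCore is just a placeholder for the worker process name here, and the 25% cap is arbitrary):

    for pid in $(pgrep FahCore); do
        # skip workers that already have a cpulimit attached
        pgrep -f "cpulimit -p $pid" > /dev/null && continue
        cpulimit -p "$pid" -l 25 &   # cap this PID at 25% CPU
    done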

Any ideas? I would like to leave the handling of processes to the OS :)

+2  A: 

I see at least two options:

  • Use "ulimit -t" in the shell that creates your process
  • Use "nice" at process creation or "renice" during runtime
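For example (a rough sketch; ./worker and PID 1234 are placeholders, and the values are arbitrary):

    # start the worker at the lowest scheduling priority
    nice -n 19 ./worker &

    # or lower the priority of an already running process
    renice 19 -p 1234

    # "ulimit -t" caps *total* CPU seconds, not a percentage (see the comment below)
    ulimit -t 3600 && ./worker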
MatthieuP
ulimit -t is not a good suggestion: it will kill the process once its *total* execution time exceeds a certain threshold, which is definitely not desirable when you just want to throttle a process.
Robert Gamble
+7  A: 

Why limit the percentage of CPU when you can just adjust the priority of the process using renice? Setting a low priority with renice still allows the process to use 100% of the processor when it's available, but any other process with a higher priority will get the processor when it needs it, with almost no noticeable lag.

Kibbee
There are reasons to do this; for example he may want to conserve battery power. Even if CPU scaling is locked at the lowest amount, the CPU will still run hotter, using more power both directly and by way of increasing fan speeds. Also, some systems are not particularly well designed for cooling, and on these systems overheating can occur.
intuited
A: 

The nice command will probably help.

Jim Blizard
A: 

Guys, thanks for your suggestions, but it's not about priorities - I want to limit the CPU % even when there's plenty of CPU time available. The processes are already low priority, so they don't cause any performance issues.

I would just like to prevent the CPU from running at 100% for extended periods...

asparagus
Folding@home is geared to grab as much processing time as possible, so it's natural for its workers to run at 100%. You might want to look into alternative @home packages that provide CPU controls.
David
If you're trying to limit power consumption, it might be good to add that to your post. If there's some other reason for wanting to limit CPU usage, it could be helpful to say what the reason is.
Mr Fooz
+1  A: 

I don't know the answer, but maybe you can look at the WINE implementation of the SetInformationJobObject Windows API.

Nemanja Trifunovic
A: 

You could scale down the CPU frequency. Then you don't have to worry about the individual processes. When you need more CPU, scale the frequency back up.
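On Linux that could go through the cpufreq sysfs interface, roughly like this (the frequency values are only examples, and the exact files depend on the driver):

    # see which frequencies the hardware supports
    cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies

    # cap the maximum frequency of cpu0 (value in kHz); repeat for the other cores
    echo 1600000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq

    # or, with the cpufrequtils package installed:
    cpufreq-set -c 0 --max 1.6GHz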

A: 

PS + GREP + NICE
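i.e., something along these lines (FahCore is a placeholder for the worker process name):

    # renice every matching process to the lowest priority
    ps -eo pid,comm | grep FahCore | awk '{ print $1 }' | xargs -r renice 19 -p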

TravisO
+2  A: 

This can be done using setrlimit(2) (specifically by setting the RLIMIT_CPU parameter).

psaccounts
RLIMIT_CPU doesn't control the maximum percentage of CPU that can be used; it controls the maximum number of seconds that the process can use. The process will receive SIGXCPU signals as it passes the soft limit and SIGKILL when it crosses the hard limit.
Hudson
+4  A: 

I don't remember, and don't think, there is something like this in the unix scheduler. You need a little program which controls the other process and does the following:

loop
    wait for some time tR
    send SIGSTOP to the process you want to be scheduled
    wait for some time tP
    send SIGCONT to the process.
loopEnd

The ratio tR/tP controls the CPU load.


Here is a little proof of concept. "busy" is the program which uses up your cpu time and which you want to be slowed-down by "slowDown":

> cat > busy.c
    int main(void) { while (1) {} }

> cc -o busy busy.c
> busy &
> top

Tasks: 192 total,   3 running, 189 sleeping,   0 stopped,   0 zombie
Cpu(s): 76.9% us,  6.6% sy,  0.0% ni, 11.9% id,  4.5% wa,  0.0% hi,  0.0% si
Mem:   6139696k total,  6114488k used,    25208k free,   115760k buffers
Swap:  9765368k total,  1606096k used,  8159272k free,  2620712k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
26539 cg        25   0  2416  292  220 R 90.0  0.0   3:25.79 busy
...

> cat > slowDown
while true; do
 kill -s SIGSTOP $1
 sleep 0.1
 kill -s SIGCONT $1
 sleep 0.1
done

> chmod +x slowDown
> slowDown 26539 &
> top
Tasks: 200 total,   4 running, 192 sleeping,   4 stopped,   0 zombie
Cpu(s): 48.5% us, 19.4% sy,  0.0% ni, 20.2% id,  9.8% wa,  0.2% hi,  2.0% si
Mem:   6139696k total,  6115376k used,    24320k free,    96676k buffers
Swap:  9765368k total,  1606096k used,  8159272k free,  2639796k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
26539 cg        16   0  2416  292  220 T 49.7  0.0   6:00.98 busy
...


ok, that script needs some more work (for example, it should handle being interrupted and let the controlled process continue if it happened to be stopped at that moment), but you get the idea. I would also write that little script in C or similar and compute the CPU ratio from a command line argument...
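For example, a slightly more careful version could look like this (still just a sketch):

    #!/bin/sh
    # slowDown <pid> - duty-cycle the target process with SIGSTOP/SIGCONT
    # make sure the target is resumed if this script is interrupted or killed
    trap 'kill -s SIGCONT "$1"; exit' INT TERM EXIT
    while true; do
        kill -s SIGSTOP "$1"
        sleep 0.1
        kill -s SIGCONT "$1"
        sleep 0.1
    done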

regards

blabla999
+1  A: 

Throwing some sleep calls in there should force the process off the CPU for a certain time. If you sleep 30 seconds once a minute, your process shouldn't average more than 50% CPU usage during that minute.

Adam Jaskiewicz
A: 

Thanks again for the suggestions, but we're still missing the point :)

The "slowDown" solution is essentially what the "cpulimit" utility does. I would still have to keep track of which processes to slow down, kill the "slowDown" process once the worker process is finished, and start new ones for new worker processes. That's precisely what I did with the Perl script and a cron job.

The main problem is that I don't know beforehand what processes to limit. They are generated dynamically.

Maybe there's a way to limit all the processes of one user to a certain CPU percentage? I already set up a dedicated user for executing the folding@home jobs, hoping that I could limit it with the /etc/security/limits.conf file. But the nearest I could get there is the total CPU time per user...

It would be cool to have something that lets you say: "The sum of the CPU usage of all of this user's processes cannot exceed 50%", and then let the processes fight for that 50% of CPU according to their priorities...

asparagus
A: 

mhmh - I don't think you get that out of the box (sounds like a reasonable addition, though). You probably need to filter some ps-like output and schedule the worker processes using the above-mentioned SIGSTOP/SIGCONT scheme; if you write it in some higher-level scripting language or bash, it should not be too difficult (see the sketch below). But, as you said, that's more or less what cpulimit does - so you may want to grab its source and enhance it... ;-)
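Roughly (FahCore being a placeholder for whatever the workers are called, and the two sleeps giving about a 25% duty cycle):

    #!/bin/sh
    # throttle every process whose name matches the pattern with a STOP/CONT duty cycle;
    # pkill re-matches on every iteration, so newly spawned workers get picked up too
    NAME=${1:-FahCore}
    while true; do
        pkill -CONT "$NAME"
        sleep 0.1     # let the workers run for ~0.1s
        pkill -STOP "$NAME"
        sleep 0.3     # keep them stopped for ~0.3s
    done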

Sorry, don't know more... regards

blabla999
A: 

I don't really see why you want to limit the CPU time... you should limit the total load on the machine, and the load is mostly determined by IO operations. For example: if I create a while(1){} loop, it pushes the total load to 1.0, but if the loop also does some disk writes the load jumps to 2.0... 4.0. And that's what's killing your machine, not the CPU usage. The CPU usage can easily be handled with nice/renice.

Anyway, you could make a script that does a 'kill -SIGSTOP PID' for a specific PID when the load gets too high, and a 'kill -SIGCONT PID' when everything gets back to normal... The PIDs can be determined using the 'ps aux' command, since it displays the CPU usage, and you should be able to sort the list on that column. I think the whole thing could be done in bash...
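A rough bash sketch of that idea (the 2.0 / 1.0 load thresholds are arbitrary, and you would probably want to restrict which processes it is allowed to stop):

    #!/bin/bash
    # stop the heaviest CPU consumer while the load is high,
    # and resume it once the load has dropped again
    stopped=""
    while true; do
        load=$(cut -d' ' -f1 /proc/loadavg)
        if [ -z "$stopped" ] && [ "$(echo "$load > 2.0" | bc)" -eq 1 ]; then
            # pick the process currently using the most CPU
            stopped=$(ps -eo pcpu,pid --sort=-pcpu | awk 'NR==2 { print $2 }')
            kill -SIGSTOP "$stopped"
        elif [ -n "$stopped" ] && [ "$(echo "$load < 1.0" | bc)" -eq 1 ]; then
            kill -SIGCONT "$stopped"
            stopped=""
        fi
        sleep 5
    done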

Quamis
A: 

I have a similar problem, and the other solutions presented in the thread don't address it at all. My solution works for me right now, but it is suboptimal, particularly when the offending process is owned by root. My workaround for now is to try very hard to make sure that I don't have any long-running processes owned by root (e.g. backups are done only as a regular user).

I just installed the hardware sensors applet for gnome, and set up alarms for high and low temperatures on the CPU, and then set up the following commands for each alarm:

low: mv /tmp/hogs.txt /tmp/hogs.txt.$$ && cat /tmp/hogs.txt.$$ | xargs -n1 kill -CONT

high: touch /tmp/hogs.txt && ps -eo pcpu,pid | sort -n -r | head -1 | gawk '{ print $2 }' >> /tmp/hogs.txt && xargs -n1 kill -STOP < /tmp/hogs.txt

The good news is that my computer no longer overheats and crashes. The downside is that terminal processes get disconnected from the terminal when they get stopped, and don't get reconnected when they get the CONT signal. The other thing is that if it was an interactive program that caused the overheating (like a certain web browser plugin!) then it will freeze in the middle of what I'm doing while it waits for the CPU to cool off. It would be nicer to have CPU scaling take care of this at a global level, but the problem is that I only have two selectable settings on my CPU and the slow setting isn't slow enough to prevent overheating.

Just to reiterate: this has nothing at all to do with process priority or re-nicing, and obviously nothing to do with stopping jobs that run for a long time. This is about preventing CPU utilization from staying at 100% for too long, because the hardware is unable to dissipate the heat quickly enough when running at full capacity (an idle CPU generates less heat than a fully loaded one).

Some other obvious possibilities that might help are:

  • Lower the CPU speed overall in the BIOS
  • Replace the heatsink or re-apply the thermal gel to see if that helps
  • Clean the heatsink with some compressed air
  • Replace the CPU fan

[edit] Note: no more overheating at 100% CPU after I disabled variable fan speed in the BIOS (Asus P5Q Pro Turbo). With the CPU fully loaded, each core tops out at 49 degrees Celsius.

yapa