tags:
views: 826
answers: 6

I am doing a lot of file searching based on certain criteria, over multiple iterations, from my Perl script, and it seems to take up 100% of the CPU. Is there a way to control my script's CPU utilization? I read somewhere about putting empty sleep cycles in my script, but I am not sure how to do this.

+3  A: 

Just sleep:

while ($not_done_yet) {
    do_stuff_here();
    sleep 1; # <-- sleep for 1 second.
}

Or, slightly fancier, do N operations per sleep cycle:

my $op_count = 0;
while ($not_done_yet) {
    do_stuff_here();

    $op_count ++;
    if ($op_count >= 100) {
        $op_count = 0;
        sleep 1; # <-- sleep for 1 second every 100 loops.
    }
}
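If one-second pauses are too coarse-grained, Time::HiRes (core since Perl 5.8) provides sub-second sleeps. A sketch of the same throttling idea with a 10 ms pause every 100 operations (the loop bound and the commented-out `do_stuff_here` are stand-ins):

```perl
use strict;
use warnings;
use Time::HiRes qw(usleep);

my $op_count = 0;
while ($op_count < 500) {                    # stand-in for $not_done_yet
    # do_stuff_here();
    $op_count++;
    usleep(10_000) if $op_count % 100 == 0;  # yield the CPU for 10 ms every 100 ops
}
print "done after $op_count ops\n";
```

Shorter, more frequent pauses smooth out CPU usage compared to a full one-second sleep, at the cost of slightly more scheduler overhead.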
slebetman
Ok. I get the idea. So I'll have to break up my tasks into groups that can be iterated.
Nilesh C
+8  A: 

You could lower the priority the OS assigns to the (Perl) process, on Windows or Linux.

Example for lowest priority:

Windows:

start /LOW perl <myscript>

Linux:

nice -n 19 perl <myscript>
Alon
This looks like the simplest solution. I'll try this before the others and report back.
Nilesh C
This is an easy solution, but it requires the user's knowledge, or that you create a startup script or an alias. If, however, you use setpriority(), the code will always renice/change priority as intended. The type of application and its use should decide which solution to choose.
Christian Vik
+5  A: 

You could use sleep or usleep. Another thing you can do is to lower the process priority.

Update: See setpriority()
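A minimal sketch of renicing from inside the script itself. Perl's built-in setpriority is a wrapper around the setpriority(2) system call, and calling it is a fatal error on platforms that don't implement it, hence the eval guard:

```perl
use strict;
use warnings;

# PRIO_PROCESS is 0; pid 0 means "this process"; 19 is the lowest priority.
# eval guards against platforms (e.g. Windows) where setpriority() is unimplemented.
my $ok = eval { setpriority(0, 0, 19); 1 };
if ($ok) {
    printf "now running at nice %d\n", getpriority(0, 0);
} else {
    warn "setpriority() not implemented on this platform\n";
}
```

An unprivileged process can always lower its own priority (raise the nice value); raising it back requires elevated privileges.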

Christian Vik
Perl's `setpriority` function is just a wrapper around the system call with the same name. As such, it's a fatal error to call it on platforms that don't implement it (e.g. Windows).
Michael Carman
I just confirmed this. `setpriority()` is not implemented in Windows.
Nilesh C
+5  A: 

A great way to improve CPU utilization is to use better algorithms. Don't guess where your code is spending all its time: use a profiler. Devel::NYTProf is a fantastic tool for this.

Be sure to keep Amdahl's law in mind. For example, say part of your program uses a quadratic algorithm and with some effort you could replace it with a linear one. Hooray! But if the code in question accounts for only 5% of the total runtime, your most heroic effort can bring no better than a tiny five-percent improvement. Use a profiler to determine whether opportunities for greater speedup are available elsewhere.

You don't tell us what you're searching for, and even the best known algorithms can be CPU-intensive. Consider that your operating system's scheduler has been written, hand-tuned, tested, and rewritten to use system resources efficiently. Yes, some tasks require specialized schedulers, but such cases are rare—even less likely given that you're using Perl.

Don't take it as a bad sign that your code is eating up CPU. You may be surprised to learn that one of the hardest challenges in realtime systems, where performance is crucial, is to keep the CPU busy rather than idling.
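For reference, a typical Devel::NYTProf session (assuming the module is installed from CPAN; myscript.pl is a placeholder) looks like:

    # profile the script, writing ./nytprof.out
    perl -d:NYTProf myscript.pl

    # turn the raw profile into browsable HTML reports under ./nytprof/
    nytprofhtml

The per-line and per-subroutine timings in the report tell you where optimization effort would actually pay off.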

Greg Bacon
My script spends a lot of time on menial tasks, like searching text in files, running external programs, and capturing and parsing their output, rather than on any specialized algorithm. In general I agree with you: I am not particularly concerned that my script is eating up the CPU, but rather that this CPU usage is affecting other programs and services running on the same Windows server. A simple low process priority seems to serve my purpose at present. Yes, profiling is good in the long run.
Nilesh C
Sounds like a smart choice.
Greg Bacon
+2  A: 

sleep + time + times can kinda do this.

my $base = time;
my $ratio = 0.5;
my $used = 0;
sub relax {
    my $now = time;
    my ($total) = times;
    return if $now - $base < 10 or $total - $used < 5;  # otherwise too imprecise
    my $over = ($total - $used) - $ratio * ($now - $base);
    $base = $now + ($over > 0 && sleep($over));
    $used = $total;
}

(Untested...) Sprinkle enough `relax()` calls throughout your code and this should average out to near or under 50% CPU time.


BSD::Resource can do this less invasively, and you might as well grab Time::HiRes for higher precision.

use BSD::Resource qw(getrlimit setrlimit RLIMIT_CPU RLIM_INFINITY);
use Time::HiRes qw(clock_gettime CLOCK_MONOTONIC);

my $base = clock_gettime(CLOCK_MONOTONIC);
my (undef, $hard) = getrlimit(RLIMIT_CPU);
my $interval = 10;
if ($hard != RLIM_INFINITY && $hard < $interval) {$interval = $hard / 2}
my $ratio = 0.5;
$SIG{XCPU} = sub {
    setrlimit(RLIMIT_CPU, $interval, $hard);
    my $now = clock_gettime(CLOCK_MONOTONIC);
    my $over = $interval - $ratio * ($now - $base);
    $base = $now + ($over > 0 && sleep($over));
};
setrlimit(RLIMIT_CPU, $interval, RLIM_INFINITY);

(Also untested...) On a system which supports it, this should ask the OS to signal you every $interval seconds of CPU time, at which point you reset the counter and sleep. This should not require any changes to the rest of your code.

ephemient
It has to be tested, but it looks like an interesting solution for controlling CPU usage with actual numbers.
Nilesh C
+1  A: 

Is your script actually doing things the whole time? For example, if you calculate a mandelbrot set, you'll have loops that consume CPU, but are actively processing data all the time.

Or do you have loops where you are waiting for more data to process:

while(1) { 
    process_data() if data_ready();
}

In the first case, setting priority is probably the best solution. It will slow computation, but only as much as needed to service any other processes on the system.

In the second case, you can improve CPU utilization drastically by sleeping for only a fraction of a second.

while(1) { 
    process_data() if data_ready();
    select( undef, undef, undef, 0.1 );
}

If you are pulling data from a source that select can operate on, so much the better. Then you can arrange for your loop to block until data is ready.

use IO::Select;
my $s = IO::Select->new($handle);

while(1) { 
    process_data() if $s->can_read;
}

select works on sockets and file handles on *NIX systems. On Windows, you can only select against sockets.
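A self-contained sketch of the blocking variant; the pipe and forked child here are just a stand-in data source (on *NIX, or under Perl's fork emulation on Windows):

```perl
use strict;
use warnings;
use IO::Select;

pipe(my $reader, my $writer) or die "pipe: $!";
my $pid = fork() // die "fork: $!";
if ($pid == 0) {                     # child: pretend to produce data slowly
    close $reader;
    sleep 1;
    print {$writer} "payload\n";
    exit 0;
}
close $writer;
my $sel = IO::Select->new($reader);
$sel->can_read;                      # blocks without spinning the CPU
my $line = <$reader>;
print "got: $line";
waitpid($pid, 0);
```

While the parent waits in can_read it consumes no CPU at all, unlike a busy loop polling data_ready().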

daotoad