I have a script that uses Parallel::ForkManager. However, the wait_all_children() call takes an incredibly long time even after all child processes have completed. I know this from printing timestamps (see below). Does anyone have any idea what might be causing this? (My machine has 16 CPU cores.)

my $pm = Parallel::ForkManager->new(16);
for my $i (1..16) {
    $pm->start($i) and next;

    # ... do something within the child-process ...

    print scalar(localtime), " Process $i completed.\n";
    $pm->finish();
}
print scalar(localtime), " Waiting for some child process to finish.\n";
$pm->wait_all_children();
print scalar(localtime), " All processes finished.\n";

As expected, I get the "Waiting for some child process to finish" message first, with a timestamp of, say, 7:08:35. Then I get a list of "Process i completed" messages, the last one at 7:10:30. However, I do not get the "All processes finished" message until 7:16:33(!). Why is there a 6-minute delay between 7:10:30 and 7:16:33? Thx!

+5  A: 

I tried this:

#!/opt/perl/bin/perl

use strict; use warnings;

use Parallel::ForkManager;

my $pm = Parallel::ForkManager->new(16);

for my $i (1..16) {
    $pm->start($i) and next;
    sleep rand 20;
    printf "%s : Process %d completed\n", scalar localtime, $i;
    $pm->finish;
}

printf "%s: Waiting for some child to finish\n", scalar localtime;
$pm->wait_all_children;

printf "%s: All processes finished.\n", scalar localtime; 

I got:

[sinan@archardy Src]$ ./y.pl
Thu Mar 11 17:14:16 2010 : Process 3 completed
Thu Mar 11 17:14:16 2010: Waiting for some child to finish
Thu Mar 11 17:14:18 2010 : Process 8 completed
Thu Mar 11 17:14:18 2010 : Process 14 completed
<snip>...</snip>
Thu Mar 11 17:14:34 2010 : Process 12 completed
Thu Mar 11 17:14:34 2010: All processes finished.

I have perl 5.10.1 on Linux with Parallel::ForkManager version 0.7.5.

Therefore, I conclude that whatever issue you are having is a consequence of what happens when you

# ... do something within the child-process ...

Update: The problem is that you are printing the "Process completed" message before the finish call. Try the following version:

#!/opt/perl/bin/perl

use strict; use warnings;

use Parallel::ForkManager;

my $pm = Parallel::ForkManager->new(16);
$pm->run_on_finish( sub {
    printf "%s : Process completed: @_\n", scalar localtime
});

for my $i (1..16) {
    $pm->start($i) and next;
    sleep rand 20;
    $pm->finish;
}

printf "%s: Waiting for some child to finish\n", scalar localtime;
$pm->wait_all_children;

printf "%s: All processes finished.\n", scalar localtime;

See the Callbacks section of the Parallel::ForkManager documentation for more information. If the delay disappears, then the symptom you were observing came from reporting that the forked process had finished before it was actually done.
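
To illustrate how a child can print "completed" well before it is really gone, here is a minimal sketch that uses only core Perl (fork, pipe, waitpid) instead of Parallel::ForkManager. The SlowCleanup class and its two-second destructor are hypothetical stand-ins for whatever cleanup a real child does at exit (database handles, large data structures, END blocks). Whether that is what happens in your script is an assumption.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical stand-in for a resource whose cleanup is slow.
# The 2-second sleep is an assumption, chosen only to make the gap visible.
package SlowCleanup;
sub new     { return bless {}, shift }
sub DESTROY { sleep 2 }

package main;

# Fork one child that announces completion through a pipe and then exits.
# Returns (seconds until the announcement arrived,
#          seconds until the parent could reap the child).
sub measure_gap {
    pipe(my $reader, my $writer) or die "pipe failed: $!";
    my $start = time;

    my $pid = fork;
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        close $reader;
        my $resource = SlowCleanup->new;
        syswrite $writer, "completed\n";   # the "Process completed" moment
        exit 0;                            # DESTROY still runs before the process ends
    }

    close $writer;
    my $line     = <$reader>;              # blocks until the child announces completion
    my $reported = time - $start;
    waitpid $pid, 0;                       # blocks until the child has really exited
    my $reaped   = time - $start;
    return ($reported, $reaped);
}

my ($reported, $reaped) = measure_gap();
printf "child reported completion after %.1f s\n", $reported;
printf "child was reaped after          %.1f s\n", $reaped;
```

The second figure should come out roughly two seconds larger than the first: the parent's wait only returns once the child's exit-time cleanup is over, no matter when the child printed its message.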

Sinan Ünür
You are right, Sinan. I forgot to mention that this delay does not happen to me EVERY time. It only happens if each of my child processes takes a long time and a lot of system resources to finish. What bothers me, though, is that whatever happens in a child after it prints the last "Process i completed" should no longer be relevant. But here is an actual output I got: 08:02:43 : Waiting for some children... 08:06:00 : group 1 is done. ... 08:06:12 : group 16 is done. 08:07:03 : All done. I'm wondering if I should explicitly release some memory / resource that's blocking the wait?
Zhang18
@Zhang18 See my updated answer using the `run_on_finish` callback.
Sinan Ünür
I see. So if I print the timestamp via run_on_finish(), I get what you expect (i.e. there is no additional wait time in wait_all_children). However, my question then becomes: why would the child process not be considered "finished" by the fork manager even though I have literally reached that print statement inside the loop? To be sure, I'm not doing anything fancy with the task inside the loop. It's just a bunch of arithmetic calculations plus some database queries and file I/O. The delay seems to be caused by the $pm->finish() method failing to pick up the actual finish time of the child process.
Zhang18
@Zhang18 The child is **NOT** finished until `$pm->finish` has completed, and `finish` ends the child by calling `exit`. Please do read http://perldoc.perl.org/functions/exit.html especially the last paragraph.
Sinan Ünür
