Hi,

I'm running a while loop that reads each line of a file and forks a child process to handle that line's data. After N lines I want to wait for the child processes to finish and then continue with the next N lines, and so on.

It looks something like this:

    while ($w=<INP>) {

        # ignore file header
        if ($w=~m/^\D/) { next;}

        # get data from line
        chomp $w;
        @ws = split(/\s/,$w);

        $m = int($ws[0]);
        $d = int($ws[1]);
        $h = int($ws[2]);

        # only for some days in the year
        if (($m==3)and($d==15) or ($m==4)and($d==21) or ($m==7)and($d==18)) {

                die "could not fork" unless defined (my $pid = fork);

                unless ($pid) {

                        some instructions here using $m, $d, $h ...

                }
                push @qpid,$pid;

                # when all processors are busy, wait for child processes
                if ($#qpid==($procs-1)) {
                        for my $pid (@qpid) {
                                waitpid $pid, 0;
                        }
                        reset 'q';
                }
        }
}

close INP;

This is not working. After the first round of processes I get some PID equal to 0, the @qpid array gets mixed up, and the file starts to get read at (apparently) random places, jumping back and forth. The end result is that most lines in the file get read two or three times. Any ideas?

Thanks a lot in advance,

S.

+4  A: 

Are you exiting inside unless ($pid)?

If not, then your child, after running the command, will add a $pid of zero to the array and carry on running code that is supposed to run only in the parent process.
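A minimal sketch of the point, not the asker's actual program: `run_children` and the placeholder "work" (returning the item as the exit code) are made up for illustration. The only part that matters is the explicit `exit` at the end of the child branch.

```perl
use strict;
use warnings;

$| = 1;   # unbuffered output, so children do not inherit pending text

# Fork one child per item. Each child exits explicitly, so it never
# falls through into the parent's loop. Returns the children's exit codes.
sub run_children {
    my @items = @_;
    my @pids;

    for my $item (@items) {
        my $pid = fork;
        die "could not fork: $!" unless defined $pid;

        if ($pid == 0) {
            # Child: do the work for this item, then leave.
            # Without this exit the child would keep looping,
            # push a 0 onto @pids and fork children of its own.
            exit($item % 256);   # placeholder work: report the item as exit code
        }

        push @pids, $pid;        # only the parent reaches this line
    }

    # Parent: collect each child's exit code in launch order.
    my @codes;
    for my $pid (@pids) {
        waitpid $pid, 0;
        push @codes, $? >> 8;
    }
    return @codes;
}

print join(',', run_children(3, 4, 7)), "\n";   # prints 3,4,7
```

If the `exit` line is removed, every child continues the `for` loop as if it were the parent, which matches the symptoms described in the question.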

DVK
No, I wasn't! That solved the problem. Thanks a lot, S.
Sag
A: 

I am concerned that your algorithm is not terribly efficient:

Let the base process fork processes 1 to N.

If processes 2 through N complete before process 1, then no new processes will be started until process 1 completes.
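For comparison, here is roughly what avoiding that stall by hand looks like in core Perl. This is only a sketch: `run_throttled` and the sleep-based placeholder work are hypothetical names and stand-ins. It uses `wait`, which returns as soon as any child finishes, instead of calling `waitpid` on specific PIDs in order.

```perl
use strict;
use warnings;

$| = 1;   # unbuffered output, so children do not inherit pending text

# Run one child per job, never more than $max at a time.
# wait() frees a slot as soon as ANY child finishes.
sub run_throttled {
    my ($max, @jobs) = @_;
    my %running;        # pid => job
    my $finished = 0;

    for my $job (@jobs) {
        # All slots busy: block until any one child ends.
        if (scalar(keys %running) >= $max) {
            my $done = wait;
            last if $done == -1;          # no children left (should not happen here)
            delete $running{$done};
            $finished++;
        }

        my $pid = fork;
        die "could not fork: $!" unless defined $pid;
        if ($pid == 0) {
            # Child: placeholder work, sleeps 0.05 s per job unit.
            select(undef, undef, undef, 0.05 * $job);
            exit 0;
        }
        $running{$pid} = $job;
    }

    # Reap whatever is still running.
    while (keys %running) {
        my $done = wait;
        last if $done == -1;
        delete $running{$done};
        $finished++;
    }
    return $finished;
}

print run_throttled(2, 3, 1, 1, 1), " children finished\n";
```

With two slots, the three short jobs can run one after another in the second slot while the long first job is still busy in the other.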

Instead of trying to get the fiddly details of your implementation correct, use Parallel::ForkManager to get working code easily.

    use strict;
    use warnings;
    use Parallel::ForkManager;

    my $MAX_PROCESSES = 4;   # assumed value; set this to the number of processors

    my $pm = Parallel::ForkManager->new($MAX_PROCESSES);

    while( my $w=<INP> ) {

        next if $w=~m/^\D/;        # ignore file header

        chomp $w;

        next unless match_dates( $w,
            { m => 3, d => 15 },
            { m => 7, d => 18 },
            { y => 2008       },  # Added this to show match_dates() capability.
        );

        my $pid = $pm->start and next;   # parent: move on to the next line

        # .. Do stuff in child here ..

        $pm->finish;  # Terminate child
    }

    $pm->wait_all_children;   # parent: wait for the children still running

    close INP;

    # Returns true if ANY of the supplied date templates matches for ALL keys defined in that template.
    sub match_dates {
        my $string = shift;

        my %target;
        @target{qw( m d y )} = split(/\s/,$string);

        DATE:
        for my $date ( @_ ) {

            my @k = keys %$date;   # keys of the current template
            my $count = 0;

            for( @k ) {
                next DATE unless $date->{$_} == $target{$_};
                $count++;
            }

            return 1 if $count == @k;  # All keys match

        }

        return;
    }
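To make the "ANY template, ALL keys" rule concrete, here is a compact variant with a few sample calls. It is a sketch rather than the code above: it uses `grep` instead of the labelled loop, splits on runs of whitespace, and the sample lines are invented, assuming the same month, day, year column order.

```perl
use strict;
use warnings;

# True if at least one template has every one of its keys equal
# to the corresponding field of the line.
sub match_dates {
    my ($string, @templates) = @_;

    my %target;
    @target{qw( m d y )} = split ' ', $string;

    for my $t (@templates) {
        # Keys of this template whose values differ from the line.
        my @mismatch = grep { $t->{$_} != $target{$_} } keys %$t;
        return 1 unless @mismatch;
    }
    return 0;
}

my @templates = ( { m => 3, d => 15 }, { y => 2008 } );

for my $line ('3 15 2007', '3 16 2007', '9 1 2008') {
    printf "%-10s %s\n", $line,
        match_dates($line, @templates) ? 'match' : 'no match';
}
```

The first line matches the month and day template, the second matches neither template, and the third matches on the year alone.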
daotoad
A: 

Thanks, this is great. I've only just started with fork and didn't know about the manager. match_dates is also very good; I wouldn't have thought of doing it this way. Thanks again.
S.