I'm setting something up to SSH out to several servers in 'batches'. I basically want to maintain 5 connections at a time, and when one finishes open up another (following an array of server IPs).

I'm wondering, for something like this, should I be using fork()? If so, what logic can I use to ensure that I maintain 5 children at a time?

+10  A: 

Forking (or threading) is what you want, but you should look at CPAN for modules that provide most of what you need, so you don't reinvent the wheel or go through the learning pains of working it all out yourself.

For example, Parallel::ForkManager looks like it's EXACTLY what you want.

use Parallel::ForkManager;

my $MAX_PROCESSES = 5;    # the OP wants 5 children at a time
my $pm = Parallel::ForkManager->new($MAX_PROCESSES);

foreach my $data (@all_data) {
  # Forks and returns the pid for the child:
  my $pid = $pm->start and next;

  # ... do some work with $data in the child process ...

  $pm->finish; # Terminates the child process
}
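
Applied directly to the question, a minimal sketch might look like the following (the server list and the ssh command are placeholders, not from the question):

use strict;
use warnings;
use Parallel::ForkManager;

# hypothetical list of server IPs -- substitute your own
my @servers = qw(10.0.0.1 10.0.0.2 10.0.0.3 10.0.0.4 10.0.0.5 10.0.0.6);

my $pm = Parallel::ForkManager->new(5);   # at most 5 children at once

for my $ip (@servers) {
    $pm->start and next;                  # parent: move on to the next IP

    # child: do whatever needs doing on this server
    system 'ssh', $ip, 'uptime';

    $pm->finish;                          # child exits, freeing a slot
}

$pm->wait_all_children;                   # wait for the final batch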
kbenson
+3  A: 

There are several modules that solve exactly this problem. See Parallel::ForkManager, Forks::Super, or Proc::Queue, for example.
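
For instance, Forks::Super can cap the number of simultaneous children and queue the rest for you. A minimal sketch, assuming the MAX_PROC / ON_BUSY import options and the fork { cmd => ... } form shown in its synopsis (the server list is made up):

use strict;
use warnings;
# MAX_PROC and ON_BUSY are import-time settings from the Forks::Super docs;
# check the POD of your installed version.
use Forks::Super MAX_PROC => 5, ON_BUSY => 'queue';

# hypothetical server list -- not from the question
my @servers = qw(10.0.0.1 10.0.0.2 10.0.0.3);

for my $ip (@servers) {
    # once 5 children are running, further jobs are queued automatically
    fork { cmd => [ 'ssh', $ip, 'uptime' ] };
}

waitall;    # block until every queued and running job has finished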

mobrule
A: 

My personal forking(!) favourite is Proc::Fork

General overview from the POD:

use Proc::Fork;

run_fork {
    child {
        # child code goes here.
    }
    parent {
        my $child_pid = shift;
        # parent code goes here.
        waitpid $child_pid, 0;
    }
    retry {
        my $attempts = shift;
        # what to do if fork() fails:
        # return true to try again, false to abort
        return if $attempts > 5;
        sleep 1, return 1;
    }
    error {
        # Error-handling code goes here
        # (fork() failed and the retry block returned false)
    }
};


And to limit the maximum number of processes running, for something like SSH batches this should do the trick:

use strict;
use warnings;
use 5.010;
use POSIX qw(:sys_wait_h);
use Proc::Fork;

my $max = 5;
my %pids;

my @ssh_files = (
    sub { system 'scp file0001 baz@foo:/somedir/.' },
    # ...
    sub { system 'scp file9999 baz@foo:/somedir/.' },
);

while (my $proc = shift @ssh_files) {

    # max limit reached: wait here until a child frees up a slot
    while ($max == keys %pids) {
        # loop thru the pid list until a child is reaped
        for my $pid (keys %pids) {
            if (my $kid = waitpid($pid, WNOHANG)) {
                delete $pids{ $kid };
                last;
            }
        }
        sleep 1 if $max == keys %pids;   # avoid busy-spinning while all 5 are still running
    }

    run_fork {
        parent {
            my $child = shift;
            $pids{ $child } = 1;
        }
        child {
            $proc->();
            exit;
        }
    };
}

# reap any children still running once the queue is empty
waitpid $_, 0 for keys %pids;

/I3az/

draegtun
Does Proc::Fork throttle the number of background processes? How does this answer the OP?
mobrule
@mobrule: Sorry, got called away! There's nothing specific (that I know of!) in Proc::Fork for throttling, so revert to the normal waitpid measures (see my updated example).
draegtun