views: 988
answers: 3

Hi all,

I'm an aerospace engineering student, and I'm working on a senior capstone project. One of the mathematical models I'm developing requires an astronomical amount of generated data from XFOIL, a popular aerospace tool used to find the lift and drag coefficients on airfoils. (But I'm digressing.)

Cut to the chase: I have a Perl script that calls XFOIL repeatedly with different input parameters to generate the data I need. I need XFOIL to run 5600 times, and as it stands right now it takes about 100 seconds on average per run. Doing the math, this means it will take about 6.5 days to complete.

Now, I have a quad-core machine, but my experience as a programmer is limited, and I really only know how to use basic Perl. I would like to run 4 instances of XFOIL at a time, all on their own core. Something like this:

while (1) {
    for my $i (1 .. 4) {
        if ( !exists_XFOIL_instance($i) ) {
            start_new_XFOIL_instance($i, @input_parameter_list);
        }
    }
}

So the program is checking (or preferably sleeping until an XFOIL instance wakes it up to start a new instance) if every core is running XFOIL. If not, the previous instance exited and we can start a new instance with the new input parameter list.
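(For reference, the pattern described above can be sketched with nothing but core Perl's fork and a blocking wait: the parent forks up to four children, then sleeps in wait() until one exits and a slot frees up. run_job() here is a hypothetical stand-in for an actual XFOIL invocation, and the job list is just placeholder numbers.)

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @jobs         = (1 .. 10);   # stand-ins for XFOIL parameter sets
my $max_children = 4;
my $running      = 0;

# Hypothetical placeholder for one XFOIL run.
sub run_job {
    my $job = shift;
    select undef, undef, undef, 0.1;    # pretend to do 0.1 s of work
}

while (@jobs or $running) {
    # Fill any free slots with new children.
    while (@jobs and $running < $max_children) {
        my $job = shift @jobs;
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {    # child: do the work, then exit
            run_job($job);
            exit 0;
        }
        $running++;         # parent: count the new child
    }
    wait();                 # block until any child exits
    $running--;
}
print "all jobs done\n";
```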

If anyone has any idea how this can be achieved, please let me know. This would significantly speed up the time I need to generate data and will let me work on the aerospace project itself.

Thanks for the help!

+8  A: 

Try Parallel::ForkManager. It's a module that provides a simple interface for forking off processes like this.

Here's some example code:

#!/usr/bin/perl

use strict;
use warnings;
use Parallel::ForkManager;

my @input_parameter_list = 
    map { join '_', ('param', $_) }
    ( 1 .. 15 );

my $n_processes = 4;
my $pm = Parallel::ForkManager->new( $n_processes );
foreach my $param_set (@input_parameter_list) {
    # start() forks: it returns the child's PID in the parent (true, so
    # the parent skips to the next parameter set) and 0 in the child.
    # At most $n_processes children run at once; start() blocks until a
    # slot frees up.
    $pm->start and next;

    start_new_XFOIL_instance($param_set)
        unless output_exists($param_set);

    $pm->finish;
}
$pm->wait_all_children;

sub output_exists {
    my $param_set = shift;
    return ( -f "$param_set.out" );
}

sub start_new_XFOIL_instance {
    my $param_set = shift;
    print "starting XFOIL instance with parameters $param_set!\n";
    sleep( 5 );
    touch( "$param_set.out" );
    print "finished run with parameters $param_set!\n";
}

sub touch {
    my $fn = shift;
    open my $fh, '>', $fn or die "can't create $fn: $!";
    close $fh or die $!;
}

You'll need to supply your own implementations for the start_new_XFOIL_instance and the output_exists functions, and you'll also want to define your own sets of parameters to pass to XFOIL.
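As one hypothetical way to fill in start_new_XFOIL_instance: XFOIL reads commands on standard input, so the sub can write a small command script (select the airfoil, switch to viscous mode, accumulate a polar, run one angle of attack) and then pipe it in. The naca_Re_alpha parameter naming, the file names, and the xfoil executable path below are all assumptions for illustration; the actual system() call is left commented out.

```perl
#!/usr/bin/perl
use strict;
use warnings;

sub start_new_XFOIL_instance {
    my $param_set = shift;                 # assumed form: "naca_Re_alpha"
    my ($naca, $re, $alpha) = split /_/, $param_set;

    # Write an XFOIL command script for this parameter set.
    my $cmd_file = "$param_set.in";
    open my $fh, '>', $cmd_file or die "can't write $cmd_file: $!";
    print $fh "NACA $naca\n",              # select a NACA airfoil
              "OPER\n",                    # enter the operating menu
              "VISC $re\n",                # viscous mode at this Reynolds number
              "PACC\n$param_set.out\n\n",  # accumulate polar to an output file
              "ALFA $alpha\n",             # run one angle of attack
              "\nQUIT\n";
    close $fh or die $!;

    # Then feed it to XFOIL (assumed executable name):
    # system("xfoil < $cmd_file") == 0 or die "xfoil failed: $?";
    return $cmd_file;
}

my $f = start_new_XFOIL_instance("2412_1e6_4");
print "wrote $f\n";
```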

James Thompson
This looks to be what I need. I will read up on Parallel::ForkManager and let you know how it goes. Thanks for the help! Of course, any other input from anyone else is appreciated.
strictlyrude27
If you didn't already know, you can install the Parallel::ForkManager module in your home directory. Look here for how to do so: http://stackoverflow.com/questions/540640/how-can-i-install-a-cpan-module-into-a-local-directory
James Thompson
James, thanks very much for your help. I installed Parallel::ForkManager via command line a little bit ago - I think I'm up and running now. I'm still trying to figure out the intricacies of the module as well as how I want it to behave in error conditions, but a preliminary run on my dual-core laptop leads me to think I've figured this out - at least the basic idea, anyway. Thanks a bunch again!
strictlyrude27
+3  A: 

It looks like Gearman would be a good fit for this project.

www.gearman.org

Gearman is a job queue. You can split your workflow into many small jobs.

I would recommend using Amazon EC2, or even their spot (auction-priced) instances, to complete this project.

At 10 cents or less per compute hour, this can significantly speed up your project.

I would use Gearman locally and make sure you have a "perfect" run for 5-10 of your sub-jobs before handing the work off to an Amazon compute farm.

Daniel
+1  A: 

Perl's built-in thread implementation (ithreads) is heavyweight, since each thread clones the entire interpreter, and sharing data between threads is awkward. fork can take advantage of multiple cores, but it's a bit inconvenient to use, especially if you need children to talk to each other or share data.

The amazing forks.pm emulates Perl's thread interface using fork and sockets. It may be a better choice than Parallel::ForkManager if inter-child communication is important.
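As a sketch of the interface forks.pm emulates: with forks.pm installed you would write `use forks;` in place of `use threads;` (and forks::shared in place of threads::shared) and get the same API, implemented with fork and sockets. The squaring worker below is just a stand-in for a real XFOIL run, and the example assumes a threads-enabled perl.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;    # with forks.pm: "use forks;" provides the same interface

# Stand-in worker; a real one would run XFOIL for one parameter set.
sub worker {
    my $n = shift;
    return $n * $n;
}

# Launch four workers and collect their results as they finish.
my @thr = map { threads->create( \&worker, $_ ) } 1 .. 4;
my $total = 0;
$total += $_->join for @thr;
print "total = $total\n";    # 1 + 4 + 9 + 16 = 30
```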

Schwern