I'm fairly new to Perl and was wondering what the best practices are regarding subroutines in Perl. Can a subroutine be too big?

I'm working on a script right now, and it might need to call another script. Should I just integrate the old script into the new one in the form of a subroutine? I need to pass one argument to the script and need one return value.

I'm guessing I'd have to do some sort of black magic to get the output from the original script, so turning it into a subroutine makes sense, right?

+5  A: 

Avoiding "black magic" is always a good idea when writing code. You never want to jump through hoops and come up with an unintuitive hack to solve a problem, especially if that code needs to be supported later. It happens, admittedly, and we're all guilty of it. Circumstances can weigh heavily on "just getting the darn thing to work."

The point is, the best practice is always to make the code clean and understandable. Remember, and this is especially true with Perl code in my experience, any code you wrote yourself more than a few months ago may as well have been written by someone else. So even if you're the only one who needs to support it, do yourself a favor and make it easy to read.

Don't cling to broad sweeping ideas like "favor more files over larger files" or "favor smaller methods/subroutines over larger ones" etc. Those are good guidelines to be sure, but apply the spirit of the guideline rather than the letter of it. Keep the code clean, understandable, and maintainable. If that means the occasional large file or large method/subroutine, so be it. As long as it makes sense.

David
"any code you wrote yourself more than a few months ago may as well have been written by someone else" - sounds like you are learning fast; congratulations
ysth
Agreed; the time to get worried is when you look at code you wrote 3 months ago and *don't* think "wtf was I thinking?!" :)
Ether
I actually have cause tonight to look at some Perl code I wrote 11 years ago. This should be fun.
David
Thank you, the 'as long as it makes sense' certainly does make sense. I guess I was just worried about developing bad habits early on.
Bharat
A: 

Sometimes it makes sense to have a separate script, sometimes it doesn't. The "black magic" isn't that complicated.

#!/usr/bin/perl
# square.pl
use strict;
use warnings;
my $input = shift;
print $input ** 2;

#!/usr/bin/perl
# sum_of_squares.pl
use strict;
use warnings;
my ($from, $to) = @ARGV;
my $sum;
for my $num ( $from .. $to ) {
    my $square = `perl square.pl $num`;
    die "square.pl failed: $? $!" if $? or not defined $square;
    $sum += $square;
}
print $sum, "\n";

With IPC::System::Simple, error checking happens automatically and the reporting on failure is better:

#!/usr/bin/perl
# sum_of_squares.pl
use strict;
use warnings;
use IPC::System::Simple 'capture';
my ($from, $to) = @ARGV;
my $sum;
for my $num ( $from .. $to ) {
    $sum += capture( 'perl', 'square.pl', $num );
}
print $sum, "\n";
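By contrast, integrating the helper into the calling script as a subroutine — what the question proposes — needs no child process and no output parsing; the return value comes back directly. A minimal sketch of that version:

```perl
#!/usr/bin/perl
# sum_of_squares.pl - same result, with square() as a plain subroutine
use strict;
use warnings;

sub square {
    my $input = shift;
    return $input ** 2;
}

my ($from, $to) = @ARGV;
my $sum = 0;
for my $num ( $from .. $to ) {
    $sum += square($num);   # a direct call, no backticks needed
}
print $sum, "\n";
```

Run as `sum_of_squares.pl 1 3`, this prints 14, the same as the two-script version.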
ysth
+4  A: 

A key design goal is separation of concerns. Ideally, each subroutine performs a single well-defined task. In this light, the main question revolves not around a subroutine's size but its focus. If your program requires multiple tasks, that implies multiple subroutines.

In more complex scenarios, you may end up with groups of subroutines that logically belong together. They can be organized into libraries or, even better, modules. If possible, you want to avoid a scenario where you end up with multiple scripts that need to communicate with each other, because the usual mechanism for one script to return data to another script is tedious: the first script writes to standard output and the second script must parse that output.

Several years ago I started work at a job requiring that I build a large number of command-line scripts (at least, that's how it turned out; in the beginning, it wasn't clear what we were building). I was quite inexperienced at the time and did not organize the code very well. In hindsight, I should have worked from the premise that I was writing modules rather than scripts. In other words, the real work would have been done by modules, and the scripts (the code executed by a user on the command line) would have remained very small front-ends to invoke the modules in various ways. This would have facilitated code reuse and all of that good stuff. Live and learn, right?
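As a sketch of that premise (all names here are hypothetical), the command-line script shrinks to a thin wrapper while the real work lives in a package that could be moved to its own .pm file:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# In real code this package would live in its own file (MyApp/Squares.pm)
# and be loaded with "use MyApp::Squares;". It is inlined here only to
# keep the sketch self-contained.
package MyApp::Squares;

sub sum_of_squares {
    my ($from, $to) = @_;
    my $sum = 0;
    $sum += $_ ** 2 for $from .. $to;
    return $sum;
}

package main;

# The script itself stays a very small front-end over the module.
print MyApp::Squares::sum_of_squares(@ARGV), "\n" if @ARGV >= 2;
```

The point of the split: other scripts (or tests) can call `sum_of_squares` directly instead of shelling out to this script and parsing its output.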

FM
I'm sort of in the same situation now with the loads of command line scripting, hopefully I should fare better than you did all those years ago! Thanks!
Bharat
+2  A: 

Another option that hasn't been mentioned yet for reusing the code in your scripts is to put common code in a module. If you put shared subroutines into a module (or modules), you can keep your scripts short and focused on what they do that is special, while isolating the common code in an easy-to-access, reusable form.

For example, here is a module with a few subroutines. Put this in a file called MyModule.pm:

package MyModule;

# Always do this:
use strict;  
use warnings;

use IO::Handle;  # For OOP filehandle stuff.

use Exporter qw(import); # This lets us export subroutines to other scripts.

# These may be exported.
our @EXPORT_OK = qw( gather_data_from_fh open_data_file );

# Automatically export everything allowed.
# Generally best to leave empty, but in some cases it makes 
# sense to export a small number of subroutines automatically.
our @EXPORT = @EXPORT_OK;

# Array of directories to search for files.
our @SEARCH_PATH;

# Parse the contents of an IO::Handle object and return structured data
sub gather_data_from_fh {
    my $fh = shift;

    my %data;

    while ( my $line = $fh->getline ) {
        # Parse the line
        chomp $line;
        my ($key, @values) = split ' ', $line;
        $data{$key} = \@values;
    }

    return \%data;
}

# Search a list of directories for a file with a matching name.
# Open it and return a handle if found.
# Die otherwise
sub open_data_file {
     my $file_name = shift;

     for my $path ( @SEARCH_PATH, '.' ) {
         my $file_path = "$path/$file_name"; 

         next unless -e $file_path;

         open my $fh, '<', $file_path
             or die "Error opening '$file_path' - $!\n";

         return $fh;
     }

     die "No matching file found in path\n";     
}

1; # Need to have trailing TRUE value at end of module.

Now, script A takes a filename from its arguments, searches for the file, processes it, and prints formatted output:

use strict;
use warnings;

use MyModule;

# Configure which directories to search
@MyModule::SEARCH_PATH = qw( /foo/foo/rah  /bar/bar/bar /eeenie/meenie/mynie/moe );

#get file name from args.
my $name = shift;

my $fh = open_data_file($name);
my $data = gather_data_from_fh($fh);

for my $key ( sort keys %$data ) {
    print "$key -> ", join ', ', @{$data->{$key}};
    print "\n"; 
}

Script B searches for a file, parses it, and writes the parsed data structure to a YAML file:

use strict;
use warnings;

use MyModule;
use YAML qw( DumpFile );

# Configure which directories to search
@MyModule::SEARCH_PATH = qw( /da/da/da/dum /tutti/frutti/unruly /cheese/burger );


#get file names from args.
my $infile  = shift;
my $outfile = shift;

my $fh = open_data_file($infile);
my $data = gather_data_from_fh($fh);

DumpFile( $outfile, $data );    

Some related documentation:

  • perlmod - About Perl modules in general
  • perlmodstyle - Perl module style guide; this has very useful info.
  • perlnewmod - Starting a new module
  • Exporter - The module used to export functions in the sample code
  • use - the perlfunc article on use.

Some of these docs assume you will be sharing your code on CPAN. If you won't be publishing to CPAN, simply ignore the parts about signing up and uploading code.

Even if you aren't writing for CPAN, it is beneficial to use the standard tools and CPAN file structure for your module development. Following the standard allows you to use all of the tools CPAN authors use to simplify the development, testing and installation process.

I know that all this seems really complicated, but the standard tools make each step easy. Even adding unit tests to your module distribution is easy thanks to the great tools available. The payoff is huge, and well worth the time you will invest.
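For instance, under the standard layout a unit test is just a file in t/ run with `prove`, using the core Test::More module. A sketch (the `square()` routine here is a stand-in for whatever your own module would export):

```perl
#!/usr/bin/perl
# t/square.t - a minimal test file; run the suite with "prove t/"
use strict;
use warnings;
use Test::More tests => 3;

# Stand-in for a subroutine your module would export,
# e.g. via: use MyModule qw( square );
sub square { my $n = shift; return $n ** 2 }

is( square(3),  9, 'positive input' );
is( square(-2), 4, 'negative input' );
is( square(0),  0, 'zero' );
```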

daotoad