The problem is that I have n command-line arguments. There are always going to be at least 2, however the maximum number is unbounded. The first argument specifies a mode of operation and the second is a file to process. The 3rd through nth are the things to do to the file (which might be none, since the user might just want to clean the file, which is done if you just pass it 2 arguments).

I'm looking at the methods available to me in Perl for working with arrays, but I'm not sure what the "Perlish" way of iterating from item 3 to the end of my array is.

Some options that I've seen:

  • Pop from the end of the array until I find an element that does not begin with "-" (since the file path does not begin with a "-", although I suppose it could, which might cause problems).
  • Shift the array twice to remove the first two elements. Whatever I'm left with I can just iterate over, if its size is at least 1.

I like the second option, but I don't know if it's Perlish. And since I'm trying to learn Perl, I might as well learn the right way to do things in Perl.

+13  A: 

IMHO, the Perlish way of accomplishing what you need would be to use one of the Getopt modules on CPAN.

If you still want to do it by hand, I would go for the second option (this is similar to how we handle the first argument of a method call):

die "Must provide filename and operation\n" unless @ARGV >= 2;

my $op = shift @ARGV;
my $file = shift @ARGV;

if ( @ARGV ) {
    # handle the other arguments;
}
Sinan Ünür
@FM Thanks for the correction.
Sinan Ünür
+18  A: 

Aside from using Getopt module as Sinan wrote, I would probably go with:

my ( $operation, $file, @things ) = @ARGV;

And then you can:

for my $thing_to_do ( @things ) {
...
}
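
As a minimal sketch of the list-assignment approach above, the following wraps it in a helper so it can run without real command-line input. The name parse_args, the usage message, and the sample arguments are assumptions made for illustration, not part of the original answer. The list assignment copies the values, so the source array (a stand-in for @ARGV here) is left unchanged, and the "things" list is simply empty when only two arguments are given.

```perl
use strict;
use warnings;

# parse_args is a hypothetical helper, not part of the original answer.
# It works on a copy of the argument list, so the caller's array is untouched.
sub parse_args {
    my ( $operation, $file, @things ) = @_;
    die "Usage: $0 MODE FILE [THING ...]\n" unless defined $file;
    return ( $operation, $file, \@things );
}

# Stand-in for @ARGV so the sketch runs without command-line input.
my @args = ( 'clean', 'data.txt', '-a', '-b' );
my ( $operation, $file, $things ) = parse_args(@args);

for my $thing_to_do ( @$things ) {
    print "$operation $file: $thing_to_do\n";
}
```

In a real script you would call parse_args(@ARGV) instead of passing the sample list.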
depesz
+1 this is indeed the answer to the question in the title.
Sinan Ünür
While this is technically a solution, Getopt::Long is a standard perl module that gives a fast, clean, and robust way of handling command line arguments. I would highly, highly recommend looking at it (and my example, below.)
Robert P
To do something as trivial as this, I think it's probably overkill to use another module. This appears to do what I need.
Thomas Owens
Ok, if that's what you want. For anyone else, though, when it comes to core modules like Getopt::Long, there's very little reason to ever hesitate to use them - module reuse is about as perlish as you can get. Core modules have been thoroughly tested, optimized and vetted throughout the years and are there to make common tasks, like @ARGV parsing, as trivial and problem free as possible.
Robert P
+7  A: 

The most standard way of doing things in Perl is through CPAN.

So my first choice would be Getopt::Long. There is also a tutorial on DevShed: Processing Command Line Options with Perl

gvlx
Getopt::Long has always been a core module (though there may be a newer version on CPAN at any given time).
ysth
+6  A: 

You can use a slice to extract the third through last items, for example:

[dsm@localhost:~]$ perl -le 'print join ", ", @ARGV[2..$#ARGV];' 1 2 3 4 5 6 7 8 9 10 00
3, 4, 5, 6, 7, 8, 9, 10, 00
[dsm@localhost:~]$

However, you should probably be using shift (or even better, Getopt::Long).
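
One detail worth checking with the slice is what happens when there are only two arguments. The sketch below assumes a hypothetical helper named rest_after_two (it is not from the answer) and shows that the range 2 .. $#args is empty in that case, so the slice yields an empty list rather than an error.

```perl
use strict;
use warnings;

# rest_after_two is a hypothetical helper name used only for this sketch.
sub rest_after_two {
    my @args = @_;
    # 2 .. $#args is an empty range when @args has fewer than three
    # elements, so the slice is then an empty list.
    return @args[ 2 .. $#args ];
}

print join( ", ", rest_after_two(qw(mode file.txt -x -y)) ), "\n";
```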

dsm
+9  A: 

I would highly recommend using Getopt::Long for parsing command line arguments. It's a standard module, it works awesome, and makes exactly what you're trying to do a breeze.

use strict;
use warnings;
use Getopt::Long;

my $first_option = undef;
my $second_option = undef;

GetOptions ('first-option=s' => \$first_option, 
            'second-option=s' => \$second_option);

die "Didn't pass in first-option, must be xxxyyyzzz."
    if ! defined $first_option;
die "Didn't pass in second-option, must be aaabbbccc."
    if ! defined $second_option;

foreach my $arg (@ARGV) {
    ...
}

This lets you have a long option name, and automatically fills in the information into variables for you, and allows you to test it. It even lets you add extra commands later, without having to do any extra parsing of the arguments, like adding a 'version' or a 'help' option:

# adding these to the above example...
my $VERSION = '1.000';
sub print_help { ... }

# ...and replacing the previous GetOptions with this...
GetOptions ('first-option=s' => \$first_option, 
            'second-option=s' => \$second_option,
            'version' => sub { print "Running version $VERSION"; exit 1 },
            'help' => sub { print_help(); exit 2 } );

Then, you can invoke it on the command line using a single or double dash, with either the entire option name or any unambiguous abbreviation of it, and GetOptions figures it all out for you. It makes your program more robust and easier to figure out; it's more "guessable" you could say. The best part is you never have to change your code that processes @ARGV, because GetOptions removes the options it recognizes and leaves the remaining arguments in @ARGV for you.
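
Applied to the question's layout (a mode, a file, then any number of things to do), a rough sketch might look like the following. The helper name parse_cli and the option names --mode and --file are assumptions chosen for illustration. It uses GetOptionsFromArray, which Getopt::Long exports on request, so the sketch can parse a sample list instead of the real @ARGV. Note that it assumes the "things" do not start with a dash; if they do, Getopt::Long will try to treat them as options unless they follow a bare -- on the command line.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# parse_cli and the option names 'mode' and 'file' are hypothetical.
sub parse_cli {
    my @args = @_;
    my ( $mode, $file );
    GetOptionsFromArray( \@args, 'mode=s' => \$mode, 'file=s' => \$file )
        or die "Bad options\n";
    die "Both --mode and --file are required\n"
        unless defined $mode && defined $file;
    # Whatever GetOptionsFromArray did not consume is left in @args.
    return ( $mode, $file, @args );
}

my ( $mode, $file, @things ) =
    parse_cli(qw(--mode clean --file data.txt trim dedupe));
print "$mode $file: $_\n" for @things;
```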

Robert P
+3  A: 

depesz's answer is one good way to go.

There is also nothing wrong with your second option:

my $op     = shift; # implicit shift from @ARGV
my $file   = shift; 
my @things = @ARGV;

# iterate over @things;

You could also skip copying @ARGV into @things and work directly on it. However, unless the script is very short, very simple, and unlikely to grow more complex over time, I would avoid taking too many short cuts.

Whether you choose depesz's approach or this one is largely a matter of taste.

Deciding which is better is really a matter of philosophy. The crux of the issue is whether you should modify globals like @ARGV. Some would say it is no big deal as long as it is done in a highly visible way. Others would argue in favor of leaving @ARGV untouched.

Pay no attention to anyone arguing in favor of one option or the other due to speed or memory issues. The @ARGV array is limited by most shells to a very small size and thus no significant optimization is available by using one method over the other.

Getopt::Long, as has been mentioned, is an excellent choice, too.

daotoad
+3  A: 

Do have a look at MooseX::Getopt because it may whet your appetite for even more things Moosey!

Example of MooseX::Getopt:

# getopt.pl

{
    package MyOptions;
    use Moose;
    with 'MooseX::Getopt';

    has oper   => ( is => 'rw', isa => 'Int', documentation => 'op doc stuff' );
    has file   => ( is => 'rw', isa => 'Str', documentation => 'about file' );
    has things => ( is => 'rw', isa => 'ArrayRef', default => sub {[]} );

    no Moose;
}

my $app = MyOptions->new_with_options;

for my $thing (@{ $app->things }) {
    print $app->file, " : ", $thing, "\n";
}

# => file.txt : item1
# => file.txt : item2
# => file.txt : item3

Will produce the above when run like so:

perl getopt.pl --oper 1 --file file.txt --things item1 --things item2 --things item3


These Moose types are checked. For example, ./getopt.pl --oper "not a number" produces:

Value "not a number" invalid for option oper (number expected)
And for free you always get a usage list ;-)
usage: getopt.pl [long options...]
         --file         bit about file
         --oper         op doc stuff
         --things    

/I3az/

draegtun
Why do you `no Moose;` right when Moose goes out of scope? Or is Moose not scoped?
Chris Lutz
Moose is scoped, but it does import "has", "with" and other sugar into the "MyOptions" namespace. "no Moose;" removes this sugar so that you cannot inadvertently do something like $app->has ;-)
draegtun
A: 

For the more general case with any array:

for(my $i=2; $i<@array; $i++) {
    print "$array[$i]\n";
}

That loops through the array, starting with the third element (index 2). Obviously, for the specific example you specify, depesz's answer is the most straightforward and best.
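
For comparison, here is a sketch of two other common ways to write the same loop without the C-style for, using a range of indices or an array slice. The sample contents of @array and the @tail variable are assumptions added so the example is self-contained.

```perl
use strict;
use warnings;

# Sample data; in practice this would be whatever array you already have.
my @array = qw(a b c d e);

# Index range: useful when the loop body needs the position.
for my $i ( 2 .. $#array ) {
    print "$i: $array[$i]\n";
}

# Slice: useful when only the values matter.
my @tail = @array[ 2 .. $#array ];
for my $item (@tail) {
    print "$item\n";
}
```

Both forms do nothing when the array has fewer than three elements, because the range 2 .. $#array is then empty.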

Michael Cramer