I am implementing a CLI tool using Perl. What are the best practices we can follow here?

+2  A: 

You should use Perl modules to make your code reusable and easy to understand.
You should also have a look at the book Perl Best Practices.

Neeraj
+14  A: 

Use POD to document your tool and follow the conventions of man pages; include at least the sections NAME, SYNOPSIS, DESCRIPTION and AUTHOR. Once you have proper POD you can generate a man page with pod2man, or view the documentation at the console with perldoc your-script.pl.

Use a module that handles command line options for you. I really like using Getopt::Long in conjunction with Pod::Usage; this way invoking --help displays a nice help message.

Make sure that your script returns a proper exit value indicating whether it was successful or not.

Here's a small skeleton of a script that does all of these:

#!/usr/bin/perl

=head1 NAME

simple - simple program

=head1 SYNOPSIS

    simple [OPTION]... FILE...

    -v, --verbose  use verbose mode
    --help         print this help message

Where I<FILE> is a file name.

Examples:

    simple /etc/passwd /dev/null

=head1 DESCRIPTION

This is a simple program.

=head1 AUTHOR

Me.

=cut

use strict;
use warnings;

use Getopt::Long qw(:config auto_help);
use Pod::Usage;

exit main();

sub main {

    # Argument parsing
    my $verbose;
    GetOptions(
        'verbose' => \$verbose,
    ) or pod2usage(1);
    pod2usage(1) unless @ARGV;
    my @files = @ARGV;

    foreach my $file (@files) {
        if (-e $file) {
            # print, not printf: a '%' in $file would break printf
            print "File $file exists\n" if $verbose;
        }
        else {
            print "File $file doesn't exist\n";
        }
    }

    return 0;
}
potyl
I don't think all of this hand-wringing over exit values is a good idea. Perl provides an implicit exit(0) in the absence of an exit or die. If you die when something is wrong and do nothing special otherwise, you will get a well-behaved program. I think this skeleton would be improved by ditching main entirely. Perl is not C.
oylenshpeegul
The reason for the main function, which makes the script look a lot like a C program, is that without it every variable declared at the top level becomes a file-scoped lexical, visible to every subroutine in the file. With a "main" function all inner variables are scoped to that single function.
potyl
Uh, who cares? There are no other scopes anyway. (But anyway, this is a lot of code to write to do nothing. Why not use MooseX::Getopt with MooseX::Runnable? Then this entire script is not even necessary!)
jrockway
I think that anyone using "strict" should care about the scope of their variables. Here's an example of why some might like to put all code in a function. Imagine that the whole program does not isolate its variables in a lexical scope, and that a new function is added. If a variable in the new function is not declared with "my" and happens to exist in the parent scope, this will cause a lot of headaches. In this situation even "use strict" can't help, because all of the program's variables have become file-scoped lexicals.
potyl
"If a variable in the new function is not to declared a variable with 'my'" - If your functions don't localize their variables, you're already doing something wrong, and writing a `main` subroutine won't help you, it'll only hide the fact that your code is wrong.
Chris Lutz
Sometimes one can forget to write my in front of a variable (because of refactoring, etc.). It doesn't mean that the code is flawed. In such situations Perl will find a variable in an outer scope and use it. Like it or not, all variables in a normal Perl script that are not inside a block are file-scoped in that file. You can ignore this and pretend that lower-case names will tell Perl not to treat them as globals, but it won't. I use a main block for that purpose.
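
This pitfall can be shown in a few lines (a minimal sketch; the variable and sub names are made up):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $count = 0;   # file-scoped lexical: visible to every sub below

sub tally {
    # Oops: no "my" here, so this silently reuses the outer $count.
    # "use strict" cannot catch it, because $count IS declared above.
    $count = 42;
}

tally();
print "$count\n";   # prints 42, not 0 - the sub clobbered the outer variable
```

Wrapping the top-level code in a main() sub would have made $count invisible to tally(), turning this into a compile-time strict error.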
potyl
+6  A: 

Some lessons I've learned:

1) Always use Getopt::Long

2) Provide help on usage via --help, ideally with examples of common scenarios. It helps people who don't know, or have forgotten, how to use the tool (i.e., you in six months).

3) Unless it's obvious to the user why, don't go for long periods (more than about 5 seconds) without output. Something like 'print "Row $row...\n" unless ($row % 1000);' goes a long way.

4) For long running operations, allow the user to recover if possible. It really sucks to get through 500k of a million, die, and start over again.

5) Separate the logic of what you're doing into modules and leave the actual .pl script as barebones as possible: parsing options, displaying help, invoking basic methods, etc. You're inevitably going to find something you want to reuse, and this makes it a heck of a lot easier.
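
Point 4 (letting the user recover) can be sketched with a simple checkpoint file. The file name and the fake work list below are made up for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sketch of resumable processing via a checkpoint file.
my $checkpoint = 'work.checkpoint';

# Resume from the last saved position, if any.
my $start = 0;
if (-e $checkpoint) {
    open my $fh, '<', $checkpoint or die "Can't read $checkpoint: $!";
    my $saved = <$fh>;
    close $fh;
    $start = $saved + 0 if defined $saved;
}

my @items = (1 .. 10);    # stand-in for a million rows
for my $i ($start .. $#items) {
    # ... do the expensive work on $items[$i] here ...

    # Record progress so a crash doesn't force a full restart.
    open my $fh, '>', $checkpoint or die "Can't write $checkpoint: $!";
    print {$fh} $i + 1;
    close $fh;
}

unlink $checkpoint;    # finished - next run starts from scratch
```

If the script dies mid-loop, the next invocation picks up from the last recorded index instead of redoing everything.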

Bill
for 3), **always** provide the user the *-q* option to be quiet
Steve Schnepp
+4  A: 

The following points aren't specific to Perl but I've found many Perl CL scripts to be deficient in these areas:

  1. Use common command line options. To show the version number implement -v or --version, not --ver. For recursive processing, -r (or perhaps -R, although in my GNU/Linux experience -r is more common), not --rec. People will use your script if they can remember the parameters. It's easy to learn a new command if you can remember "it works like grep" or some other familiar utility.

  2. Many command line tools process "things" (files or directories) within the current directory. While this can be convenient, make sure you also add command line options for explicitly identifying the files or directories to process. This makes it easier to put your utility in a pipeline without developers having to issue a bunch of cd commands and remember which directory they're in.
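
A minimal sketch of point 2, using a hypothetical --dir option (the option name is illustrative, not prescribed above):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long;

# Accept an explicit --dir instead of assuming the current directory.
my $dir = '.';   # default to the current directory, but let callers override
GetOptions('dir=s' => \$dir) or die "Usage: $0 [--dir DIR]\n";

opendir my $dh, $dir or die "Can't open directory $dir: $!";
my @entries = grep { !/^\.\.?$/ } readdir $dh;
closedir $dh;

print "$_\n" for sort @entries;
```

A caller in a pipeline can now write `your-tool --dir /var/log | grep error` without a cd first.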

benrifkah
+3  A: 

There are a couple of modules on CPAN that will make writing CLI programs a lot easier:


If your app is Moose-based, also have a look at MooseX::Getopt and MooseX::Runnable

/I3az/

draegtun
+1. Any other way is a lot of redundant work.
jrockway
Does this have tab completion for command execution?
Anandan
@Anandan: These modules help you build Perl apps that run from the shell/CLI. Your comment seems to be pointing to a different question from what everyone here has answered so far. If your question is "how do I emulate a CLI?" then I'd recommend asking a new question and/or looking on CPAN for modules like IO::Prompt. Hope that helps?
draegtun
+3  A: 

The most important thing is to have standard options.

Don't try to be clever; simply be consistent with already existing tools.

How to achieve this is also important, but only comes second.

Actually, this is quite generic to all CLI interfaces.

Steve Schnepp
+6  A: 

As a preface, I spent 3 years engineering and implementing a pretty complicated command line toolset in Perl for a major financial company. The ideas below are basically part of our team's design guidelines.

User Interface

  1. Command line options: give as many as possible sensible default values.

  2. NO positional parameters for any command that has more than 2 options.

  3. Have readable option names. If the length of the command line is a concern for non-interactive calling (e.g. some unnamed legacy shells have short limits on command lines), provide short aliases - Getopt::Long allows that easily.

  4. At the very least, print all options' default values in the '-help' message.

    Better yet, print all the options' "current" values (e.g. if a parameter and a value are supplied along with "-help", the help message will print the parameter's value from the command line). That way, people can assemble the command line for a complicated invocation and verify it by appending "-help" before actually running it.

  5. Follow the standard Unix convention of exiting with a non-zero return code if and only if the program terminated with errors.

  6. If your program may produce useful (e.g. worth capturing/grepping/whatnot) output, make sure any error/diagnostic messages go to STDERR so they are easily separable.

  7. Ideally, allow the user to specify input/output files via command line parameters instead of forcing "<" / ">" redirects - this makes life MUCH simpler for people who need to build complicated pipes using your command. Ditto for error messages - have a logfile option.

  8. If a command has side effects, having a "whatif/no_post" (dry-run) option is usually a Very Good Idea.
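
Points 3 and 4 above can be sketched with Getopt::Long; the option names and defaults here are illustrative only:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long;

my %opt = (retries => 3, logfile => '-');   # defaults

GetOptions(
    'retries|r=i' => \$opt{retries},   # readable long name, short alias
    'logfile|l=s' => \$opt{logfile},
    'help|h'      => \my $help,
) or die "Bad options\n";

if ($help) {
    # Report the *current* (post-parse) values, so users can assemble a
    # complicated command line and verify it by appending -help.
    print "Usage: $0 [--retries N] [--logfile FILE]\n";
    printf "  --retries (-r)  %s\n", $opt{retries};
    printf "  --logfile (-l)  %s\n", $opt{logfile};
    exit 0;
}
```

Running `your-tool --retries 7 -help` would then show `--retries (-r)  7`, confirming how the real run would be parsed.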

Implementation

  1. As noted previously, don't re-invent the wheel. Use standard command line parameter handling modules - MooseX::Getopt, or Getopt::Long

  2. For Getopt::Long, assign all the parameters to a single hash as opposed to individual variables. One useful pattern is passing that hash of CLI args to object constructors.

  3. Make sure your error messages are clear and informative... E.g. include "$!" in any IO-related error messages. It's worth expending an extra minute and 2 lines of code to have separate "file not found" vs. "file not readable" errors, as opposed to spending 30 minutes in a production emergency because a non-readable-file error was misdiagnosed by Production Operations as "no input file" - this is a real-life example.

  4. Not really CLI-specific, but validate all parameters, ideally right after getting them. CLI doesn't allow for a "front-end" validation like webapps do, so be super extra vigilant.

  5. As discussed above, modularize business logic. Among other reasons already listed, the number of times I've had to re-implement an existing CLI tool as a web app is vast - and it's not that difficult if the logic is already in a properly designed Perl module.
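
Implementation point 2 can be sketched like this; the option names, and the MyApp constructor in the comment, are hypothetical:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long;

# Collect every option into one hash instead of scattered scalars.
my %args = (verbose => 0);
GetOptions(\%args, 'verbose!', 'input=s', 'output=s')
    or die "Bad options\n";

# The whole hash can now flow into the business logic in one go, e.g.:
#   my $app = MyApp->new(%args);
#   exit $app->run;
print "verbose = $args{verbose}\n";
```

Because the options live in one structure, adding a new flag means touching only the GetOptions call and the consuming module, not a chain of variable declarations.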

Interesting links

CLI Design Patterns - I think this is ESR's

I will try to add more bullets as I recall them.

DVK
+1 for the link :-)
Steve Schnepp