What would be your suggestions for a good bash/ksh script template to use as a standard for all newly created scripts?

I usually start (after the #! line) with a commented-out header containing the filename, synopsis, usage, return values, author(s), and changelog, all fitted into 80-character lines.

All documentation lines start with a double-hash, "##", so I can grep for them easily, and local variable names are prepended with "__".

Any other best practices? Tips? Naming conventions? What about return codes?

thanks

Edit: on the version-control comments: we have svn all right, but another dept in the enterprise has a separate repo and this is their script - how do I know whom to contact with questions if there is no @author info? Javadoc-like entries have some merit even in the shell context, IMHO, but I might be wrong. Thanks for the responses.

+1  A: 

I would suggest

#!/bin/ksh

and that's it. Heavyweight block comments for shell scripts? I get the willies.

Suggestions:

  1. Documentation should be data or code, not comments. At least a usage() function. Have a look at how ksh and the other AST tools document themselves with --man options on every command. (Can't link because the web site is down.)

  2. Declare local variables with typeset. That's what it's for. No need for nasty underscores.
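
   For example, a minimal sketch of typeset scoping (the function and variable names here are made up; the behaviour is the same in ksh and bash):

       f()
       {
           typeset count=5            # local to f - no underscore tricks needed
           echo "inside: $count"
       }

       count=1
       f
       echo "outside: $count"         # still 1; the function's count did not leak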

Norman Ramsey
+4  A: 

I'd extend Norman's answer to 6 lines, and the last of those is blank:

#!/bin/ksh
#
# @(#)$Id$
#
# Purpose

The third line is a version control identification string - it is actually a hybrid with an SCCS marker '@(#)' that can be identified by the (SCCS) program what and an RCS version string which is expanded when the file is put under RCS, the default VCS I use for my private use. The RCS program ident picks up the expanded form of $Id$, which might look like $Id: mkscript.sh,v 2.3 2005/05/20 21:06:35 jleffler Exp $. The fifth line reminds me that the script should have a description of its purpose at the top; I replace the word with an actual description of the script (which is why there's no colon after it, for example).

After that, there is essentially nothing standard for a shell script. There are standard fragments that appear, but no standard fragment that appears in every script. (My discussion assumes that scripts are written in Bourne, Korn, or POSIX (Bash) shell notations. There's a whole separate discussion on why anyone putting a C Shell derivative after the #! sigil is living in sin.)

For example, this code appears in some shape or form whenever a script creates intermediate (temporary) files:

tmp=${TMPDIR:-/tmp}/prog.$$
trap "rm -f $tmp.?; exit 1" 0 1 2 3 13 15

# ...real work that creates temp files $tmp.1, $tmp.2, ...

rm -f $tmp.?
trap 0
exit 0

The first line chooses a temporary directory, defaulting to /tmp if the user did not specify an alternative ($TMPDIR is very widely recognized and is standardized by POSIX). It then creates a file name prefix including the process ID. This is not a security measure; it is a simple concurrency measure, preventing multiple instances of the script from trampling on each other's data. (For security, use non-predictable file names in a non-public directory.)

The second line ensures that the 'rm' and 'exit' commands are executed if the shell receives any of the signals SIGHUP (1), SIGINT (2), SIGQUIT (3), SIGPIPE (13) or SIGTERM (15). The 'rm' command removes any intermediate files that match the template; the 'exit' command ensures that the status is non-zero, indicating some sort of error. The 'trap' of 0 means that the code is also executed if the shell exits for any reason - it covers carelessness in the section marked 'real work'.

The code at the end then removes any surviving temporary files before lifting the trap on exit, and finally exits with a zero (success) status. Clearly, if you want to exit with another status, you may - just make sure you set it in a variable before running the 'rm' and 'trap' lines, and then use 'exit $exitval'.
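
The non-zero-exit variant can be sketched like so (illustrative; the status 3 is an arbitrary example, and the subshell is only there so the demonstration can report the status afterwards):

(
    tmp=${TMPDIR:-/tmp}/prog.$$
    trap "rm -f $tmp.?; exit 1" 0 1 2 3 13 15

    # ...real work; suppose it detected a problem and chose status 3:
    exitval=3

    rm -f $tmp.?
    trap 0
    exit $exitval
)
echo "script exited with status $?"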

I usually use the following to remove the path and suffix from the script, so I can use $arg0 when reporting errors:

arg0=$(basename $0 .sh)
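
An equivalent without the basename subshell, using POSIX parameter expansion (shown here with a hypothetical path standing in for $0):

path=/usr/local/bin/myscript.sh   # hypothetical stand-in for $0
arg0=${path##*/}                  # strip leading directories -> myscript.sh
arg0=${arg0%.sh}                  # strip the .sh suffix -> myscript
echo "$arg0"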

I often use a shell function to report errors:

error()
{
    echo "$arg0: $*" 1>&2
    exit 1
}

If there's only one or maybe two error exits, I don't bother with the function; if there are any more, I do because it simplifies the coding. I also create more or less elaborate functions called usage to give the summary of how to use the command - again, only if there's more than one place where it would be used.
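
A typical use of such a function looks like this (illustrative; the file name is made up, and the subshell is only so the demonstration survives the exit):

arg0=myscript                  # normally derived from $0 as shown earlier
error()
{
    echo "$arg0: $*" 1>&2
    exit 1
}

( file=/no/such/file; [ -f "$file" ] || error "cannot open $file" ) 2>&1
echo "status: $?"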

Another fairly standard fragment is an option parsing loop, using the getopts shell built-in:

vflag=0
out=
file=
Dflag=
while getopts hvVf:o:D: flag
do
    case "$flag" in
    (h) help; exit 0;;
    (V) echo "$arg0: version $Revision$ ($Date$)"; exit 0;;
    (v) vflag=1;;
    (f) file="$OPTARG";;
    (o) out="$OPTARG";;
    (D) Dflag="$Dflag $OPTARG";;
    (*) usage;;
    esac
done
shift $(expr $OPTIND - 1)

or:

shift $(($OPTIND - 1))

The quotes around "$OPTARG" handle spaces in arguments. The Dflag is cumulative, but the notation used here loses track of spaces in arguments. There are (non-standard) ways to work around that problem, too.
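
One such non-standard workaround (bash-specific, so not portable to a plain Bourne shell) is to accumulate the -D arguments in an array instead of a single string; the sample arguments below are made up:

set -- -D "name with spaces" -D other    # sample command-line arguments
Dargs=()
while getopts D: flag
do
    case "$flag" in
    (D) Dargs+=("$OPTARG");;             # each value kept intact, spaces and all
    esac
done
shift $(($OPTIND - 1))
echo "got ${#Dargs[@]} -D values; first is '${Dargs[0]}'"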

The first shift notation works with any shell (or would do if I used back-ticks instead of '$(...)'). The second works in modern shells; there might even be an alternative with square brackets instead of parentheses, but this works, so I've not bothered to work out what that is.

One final trick for now is that I often have both the GNU and a non-GNU version of programs around, and I want to be able to choose which I use. Many of my scripts, therefore, use variables such as:

: ${PERL:=perl}
: ${SED:=sed}

And then, when I need to invoke Perl or sed, the script uses $PERL or $SED. This helps me when something behaves differently - I can choose the operational version - or while developing the script (I can add extra debug-only options to the command without modifying the script).
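
In use, that looks like this (a minimal sketch; the sed expression is just an example):

: ${SED:=sed}                  # default unless the caller already set SED
echo "hello" | $SED -e 's/hello/world/'

Invoking the script as 'SED=/usr/local/bin/gsed ./script.sh' (a hypothetical path) then picks the GNU version without editing the script at all.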

Jonathan Leffler
+1  A: 

Any code that is going to be released in the wild should have the following short header:

# Script to turn lead into gold
# Copyright (C) 2009 Joe Q Hacker - All Rights Reserved
# Permission to copy and modify is granted under the foo license
# Last revised 1/1/2009

Keeping a change log in code headers is a throwback to the days when version control systems were terribly inconvenient. A last-modified date shows someone how old the script is.

If you are going to rely on bashisms, use #!/bin/bash, not #!/bin/sh, as sh is the POSIX invocation of the shell. Even if /bin/sh points to bash, many features will be turned off if you run scripts via /bin/sh. Most Linux distributions will not take scripts that rely on bashisms, so try to be portable.
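
For instance, here is one common bashism alongside a portable rewrite (illustrative; the variable and values are made up):

#!/bin/sh
answer=yes

# bash-only:   if [[ $answer == y* ]]; then echo "confirmed"; fi
# POSIX-portable equivalent:
case "$answer" in
y*) echo "confirmed";;
*)  echo "declined";;
esac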

To me, comments in shell scripts are sort of silly unless they read something like:

# I am not crazy, this really is the only way to do this

Shell scripting is so simple that (unless you're writing a demonstration to teach someone how to do it) the code nearly always speaks for itself.

Some shells don't like to be fed typed 'local' variables. I believe to this day BusyBox (a common rescue shell) is one of them. Make GLOBALS_OBVIOUS instead; it's much easier to read, especially when debugging via /bin/sh -x ./script.sh.

My personal preference is to let logic speak for itself and minimize work for the parser. For instance, many people might write:

if [ "$i" = 1 ]; then
    ... some code 
fi

Where I'd just:

[ "$i" = 1 ] && {
    ... some code
}

Likewise, someone might write:

if [ "$i" -ne 1 ]; then
   ... some code
fi

... where I'd:

[ "$i" = 1 ] || {
   ... some code 
}

The only time I use conventional if / then / else is if there's an else-if to throw in the mix.

A horribly insane example of very good portable shell code can be studied by viewing the 'configure' script in most free software packages that use autoconf. I say insane because it's 6300 lines of code catering to every system known to man that has a UNIX-like shell. You don't want that kind of bloat, but it is interesting to study some of the various portability hacks within - such as being nice to those who might point /bin/sh to zsh :)

The only other advice I can give is watch your expansion in here-docs, i.e.

cat << EOF > foo.sh
   printf "%s was here" "$name"
EOF

... is going to expand $name, when you probably want to leave the variable in place. Solve this via:

  printf "%s was here" "\$name"

which will leave $name as a variable, instead of expanding it.
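
A quick demonstration of the difference (a minimal sketch; the name is made up):

name=Alice

cat << EOF
expanded: $name
escaped:  \$name
EOF

The first line in the here-doc prints "expanded: Alice"; the backslash keeps the second as a literal "$name".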

I also highly recommend learning how to use trap to catch signals .. and make use of those handlers as boilerplate code. Telling a running script to slow down with a simple SIGUSR1 is quite handy :)
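
A sketch of that idea (illustrative; a real script would check the flag inside its main loop and throttle itself, and here the script simply signals itself to show the handler firing):

slow=0
trap 'slow=1' USR1

# a main loop could then do:  [ "$slow" = 1 ] && sleep 1

kill -USR1 $$                  # simulate another process signalling us
echo "slow=$slow"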

Most new programs that I write (which are tool / command-line oriented) start out as shell scripts; it's a great way to prototype UNIX tools.

You might also like the SHC shell script compiler.

Tim Post
If you don't want here docs expanded, use << 'EOF' to suppress expansion. Only use backslashes when you want some things expanded and some things not expanded.
Jonathan Leffler