I'd extend Norman's answer to 6 lines, and the last of those is blank:
#!/bin/ksh
#
# @(#)$Id$
#
# Purpose
The third line is a version control identification string - it is actually a hybrid with an SCCS marker '@(#)' that can be identified by the (SCCS) program 'what' and an RCS version string which is expanded when the file is put under RCS, the default VCS I use for my private use. The RCS program 'ident' picks up the expanded form of $Id$, which might look like '$Id: mkscript.sh,v 2.3 2005/05/20 21:06:35 jleffler Exp $'. The fifth line reminds me that the script should have a description of its purpose at the top; I replace the word with an actual description of the script (which is why there's no colon after it, for example).
After that, there is essentially nothing standard for a shell script. There are standard fragments that appear, but no standard fragment that appears in every script. (My discussion assumes that scripts are written in Bourne, Korn, or POSIX (Bash) shell notations. There's a whole separate discussion on why anyone putting a C Shell derivative after the #! sigil is living in sin.)
For example, this code appears in some shape or form whenever a script creates intermediate (temporary) files:
tmp=${TMPDIR:-/tmp}/prog.$$
trap "rm -f $tmp.?; exit 1" 0 1 2 3 13 15
# ...real work that creates temp files $tmp.1, $tmp.2, ...
rm -f $tmp.?
trap 0
exit 0
The first line chooses a temporary directory, defaulting to /tmp if the user did not specify an alternative ($TMPDIR is very widely recognized and is standardized by POSIX). It then creates a file name prefix including the process ID. This is not a security measure; it is a simple concurrency measure, preventing multiple instances of the script from trampling on each other's data. (For security, use non-predictable file names in a non-public directory.) The second line ensures that the 'rm' and 'exit' commands are executed if the shell receives any of the signals SIGHUP (1), SIGINT (2), SIGQUIT (3), SIGPIPE (13) or SIGTERM (15). The 'rm' command removes any intermediate files that match the template; the 'exit' command ensures that the status is non-zero, indicating some sort of error. The 'trap' of 0 means that the code is also executed if the shell exits for any reason - it covers carelessness in the section marked 'real work'. The code at the end then removes any surviving temporary files, before lifting the trap on exit, and finally exits with a zero (success) status. Clearly, if you want to exit with another status, you may - just make sure you set it in a variable before running the 'rm' and 'trap' lines, and then use 'exit $exitval'.
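As a rough sketch of that last point, the fragment below saves a status in a variable before the cleanup and exits with it afterwards. The names run_job and process_data are hypothetical stand-ins for the 'real work', and the body runs in a subshell only so that the example can report the status instead of terminating the shell that runs it.

```shell
#!/bin/sh
# process_data is a hypothetical step that fails with status 3.
process_data()
{
    return 3
}

# run_job is a hypothetical wrapper around the temp-file pattern.
run_job()
{
    tmp=${TMPDIR:-/tmp}/prog.$$
    trap "rm -f $tmp.?; exit 1" 0 1 2 3 13 15

    echo "intermediate data" > "$tmp.1"
    sort "$tmp.1" > "$tmp.2"

    # Record the status before the cleanup commands overwrite $?.
    exitval=0
    process_data || exitval=$?

    rm -f $tmp.?
    trap 0
    exit $exitval
}

# Run in a subshell so that the exit ends the subshell, not this shell.
status=0
( run_job ) || status=$?
echo "run_job exited with status $status"
```

The reported status is the saved one (3), not the 1 hard-wired into the trap, because the trap on 0 is lifted before the final exit.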
I usually use the following to remove the path and suffix from the script name, so I can use $arg0 when reporting errors:
arg0=$(basename "$0" .sh)
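To make the effect concrete, here is a small sketch applied to a fixed path rather than $0. The path is hypothetical; the second form uses POSIX parameter expansion and is shown only as an alternative that avoids running an external command.

```shell
script=/usr/local/bin/mkscript.sh      # hypothetical script path

# basename strips the directory part and the optional suffix.
name1=$(basename "$script" .sh)

# The same result with parameter expansion only.
name2=${script##*/}     # remove everything up to the last slash
name2=${name2%.sh}      # remove a trailing .sh, if present

echo "$name1 $name2"    # prints "mkscript mkscript"
```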
I often use a shell function to report errors:
error()
{
echo "$arg0: $*" 1>&2
exit 1
}
If there's only one or maybe two error exits, I don't bother with the function; if there are any more, I do because it simplifies the coding. I also create more or less elaborate functions called 'usage' to give the summary of how to use the command - again, only if there's more than one place where it would be used.
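A minimal sketch of how the two functions might sit together is shown below. The option summary in the usage message and the check_input caller are hypothetical; they are there only to give the functions something to do.

```shell
arg0=$(basename "$0" .sh)

error()
{
    echo "$arg0: $*" 1>&2
    exit 1
}

usage()
{
    # Hypothetical option summary; adjust to the script's real options.
    echo "Usage: $arg0 [-hvV] [-f file] [-o out] [-D define] [arg ...]" 1>&2
    exit 1
}

# check_input is a hypothetical caller with more than one way to fail.
check_input()
{
    [ $# -eq 1 ] || usage
    [ -r "$1" ] || error "cannot read file $1"
}

# Run in a subshell so that the exit inside usage does not end this shell.
msg=$( (check_input) 2>&1 ) || echo "rejected: $msg"
```

Both functions write to standard error and exit non-zero, so a caller can tell a failed run from a successful one by status alone.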
Another fairly standard fragment is an option parsing loop, using the 'getopts' shell built-in:
vflag=0
out=
file=
Dflag=
while getopts hvVf:o:D: flag
do
case "$flag" in
(h) help; exit 0;;
(V) echo "$arg0: version $Revision$ ($Date$)"; exit 0;;
(v) vflag=1;;
(f) file="$OPTARG";;
(o) out="$OPTARG";;
(D) Dflag="$Dflag $OPTARG";;
(*) usage;;
esac
done
shift $(expr $OPTIND - 1)
or:
shift $(($OPTIND - 1))
The quotes around "$OPTARG" handle spaces in arguments. The Dflag is cumulative, but the notation used here loses track of spaces in arguments. There are (non-standard) ways to work around that problem, too.
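The sketch below wraps a cut-down version of the loop in a function, called parse_args here purely for illustration, and feeds it a hypothetical command line. It shows that a single-valued option keeps its embedded space, while the accumulated Dflag string can no longer be split back into the original arguments.

```shell
# parse_args is a hypothetical wrapper so the loop can be run on sample arguments.
parse_args()
{
    vflag=0
    out=
    file=
    Dflag=
    OPTIND=1        # reset so the function can be called more than once
    while getopts vf:o:D: flag
    do
        case "$flag" in
        (v) vflag=1;;
        (f) file="$OPTARG";;
        (o) out="$OPTARG";;
        (D) Dflag="$Dflag $OPTARG";;
        (*) return 1;;
        esac
    done
    shift $(($OPTIND - 1))
    nargs=$#        # number of non-option arguments left over
}

parse_args -v -f "my file.txt" -D "A=1 2" -D B=3 rest1 rest2
echo "vflag=$vflag file=$file nargs=$nargs"

# Two -D arguments were given, but word splitting sees three words,
# because the space inside "A=1 2" looks like any other separator.
set -- $Dflag
echo "Dflag splits into $# words"
```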
The first shift notation works with any shell (or would do if I used back-ticks instead of '$(...)'). The second works in modern shells; there might even be an alternative with square brackets instead of parentheses, but this works so I've not bothered to work out what that is.
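For comparison, this sketch computes the shift count in each of the notations mentioned, using a hypothetical command line. It assumes a shell that accepts both '$(...)' and '$((...))'; only the back-tick form would be expected to work in a very old Bourne shell.

```shell
set -- -v -f data.txt one two      # hypothetical command line
OPTIND=1
while getopts vf: flag; do :; done # consume the options, ignoring their values

n_expr=`expr $OPTIND - 1`          # back-ticks with expr: the oldest notation
n_subst=$(expr $OPTIND - 1)        # $(...) command substitution with expr
n_arith=$(($OPTIND - 1))           # shell arithmetic, no external process

echo "$n_expr $n_subst $n_arith"   # prints "3 3 3"
shift $n_arith
echo "remaining: $*"               # prints "remaining: one two"
```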
One final trick for now is that I often have both the GNU and a non-GNU version of programs around, and I want to be able to choose which I use. Many of my scripts, therefore, use variables such as:
: ${PERL:=perl}
: ${SED:=sed}
And then, when I need to invoke Perl or sed, the script uses $PERL or $SED. This helps me when something behaves differently - I can choose the operational version - or while developing the script (I can add extra debug-only options to the command without modifying the script).