views:

118

answers:

4

Hi All,

I have a bash script that checks some log files created by a cron job that have time stamps in the filename (down to the second). It uses the following code:

CRON_LOG=$(ls -1 $LOGS_DIR/fetch_cron_{true,false}_$CRON_DATE*.log 2> /dev/null | sed 's/^[^0-9][^0-9]*\([0-9][0-9]*\).*/\1 &/' | sort -n | cut -d ' ' -f2- | tail -1 )
if [ -f "$CRON_LOG" ]; then
    printf "Checking $CRON_LOG for errors\n"
else
        printf "\n${txtred}Error: cron log for $CRON_NOW does not exist.${txtrst}\n"
        printf "Either the specified date is too old for the log to still be around or there is a problem.\n"
        exit 1
fi
CRIT_ERRS=$(cat $CRON_LOG | grep "ERROR" | grep -v "Duplicate tracking code")
if [ -z "$CRIT_ERRS" ]; then
        printf "%74s[${txtgrn}PASS${txtrst}]\n"
else
        printf "%74s[${txtred}FAIL${txtrst}]\n"
        printf "Critical errors detected! Outputting to console...\n"
        echo $CRIT_ERRS
fi

So this bit of code works fine, but I'm trying to clean up my scripts now and implement set -e at the top of all of them. When i do it to this script, it exits with error code 1. Note that I have errors form the first statement dumping to /dev/null. This is because some days the file has the word "true" and other days "false" in it. Anyway, i don't think this is my problem because the script outputs "Checking xxxxx.log for errors." before exiting when I add set -e to the top.

Note: the $CRON_DATE variable is derived form user input. I can run the exact same statement from command line "$./checkcron.sh 01/06/2010" and it works fine without the set -e statement at the top of the script.

UPDATE: I added "set -x" to my script and narrowed the problem down. The last bit of output is:

Checking /map/etl/tektronix/logs/fetch_cron_false_010710054501.log for errors
++ cat /map/etl/tektronix/logs/fetch_cron_false_010710054501.log
++ grep ERROR
++ grep -v 'Duplicate tracking code'
+ CRIT_ERRS=

[1]+  Exit 1                  ./checkLoad.sh...

So it looks like the problem is occurring on this line:

CRIT_ERRS=$(cat $CRON_LOG | grep "ERROR" | grep -v "Duplicate tracking code")

Any help is appreciated. :)

Thanks, Ryan

+3  A: 

Adding set -x, which prints a trace of the script's execution, may help you diagnose the source of the error.

Edit:

Your grep is returning an exit code of 1 since it's not finding the "ERROR" string.

Edit 2:

My apologies regarding the colon. I didn't test it.

However, the following works (I tested this one before spouting off) and avoids calling the external cat. Because you're setting a variable using the results of a subshell and set -e looks at the subshell as a whole, you can do this:

CRIT_ERRS=$(cat $CRON_LOG | grep "ERROR" | grep -v "Duplicate tracking code"; true)
Dennis Williamson
Thanks Dennis, this helped she more light on the problem. I'm updating my question...
SDGuero
Extending this... `grep` returns 0 if it finds any matches, and 1 if it finds none.
ephemient
Don't get into the habit of using `; true`: it happens to work here, but it generally doesn't help with `set -e`. Instead, it is more traditional to use `|| :` or `|| true`.
ephemient
Haha, thanks! I trie the colon first, worked great, but I noticed that ephemient has about 17,000 more points that Dennis so i'm gonna go ahead and use the double pipes... :)
SDGuero
Don't use `||:` or `true` because an I/O error (caused by resource exhaustion, for example) won't be detected. If you use `|cat`, it will. If the extra process really bothers you, rewrite the whole line using awk instead of two greps and two cats.
geocar
+2  A: 

Asking for set -e makes the script exit as soon as a simple command exits with a non-zero exit status. This combines perniciously with your ls command, which exits with a non-zero status when asked to list a non-existent file, which is always the case for you because the true and false variants don't co-exist.

Hyman Rosen
Thanks Hyman. I figured it was the ls command at first but it continues to execute the printf statement which occurs after. I also changed that statement so it does not produce an error (I took out teh true part and tested on a day that I knew was false) and the script still exits.
SDGuero
Just to update on this. Since I'm redirecting errors from that ls statement to /dev/null, I don't believe set picks up on them.
SDGuero
I don't believe it's the `ls` -- redirection has nothing to do with it, most shells traditionally take the exit status of the last item of a pipeline, and `ls` is the first.
ephemient
+2  A: 
bash -c 'f=`false`; echo $?'
1
bash -c 'f=`true`; echo $?'
0
bash -e -c 'f=`false`; echo $?'
bash -e -c 'f=`true`; echo $?'
0

Note that backticks (and $()) "return" the error code of the last command they run. Solution:

CRIT_ERRS=$(cat $CRON_LOG | grep "ERROR" | grep -v "Duplicate tracking code" | cat)
geocar
I would use the no-op `:` (colon) here instead of unnecessarily calling an external command: `CRIT_ERRS=$(cat $CRON_LOG | grep "ERROR" | grep -v "Duplicate tracking code" | :)`, but +1 anyway for the cleverness!
Dennis Williamson
OOPS, never mind, the colon eats stdin and doesn't pass it through!
Dennis Williamson
You could use `|| :` instead of `| :` to avoid eating stdin; or just use `cat` as geocar suggested
Brian Campbell
The traditional method for ignoring failures under `-e` is using `|| :`, i.e. `CRIT_ERRS=$(... || :)`. @Dennis: don't pipe through `:` and it's fine.
ephemient
@ephemient: thanks, I added another way to my answer using `; true` since it's inside a subshell.
Dennis Williamson
Using `||:` or `;true` will hide errors, which means if there's a resource allocation problem, the result will be an empty string, and the program will continue. Using `|cat` causes `set -e` to still error out in this situation. Better to not use `set -e`, but better still to rewrite this line using `awk`.
geocar
+1  A: 

Redirecting error messages to /dev/null does nothing about the exit status returned by the script. The reason your ls command isn't causing the error is because it's part of a pipeline, and the exit status of the pipeline is the return value of the last command in it (unless pipefail is enabled).

Given your update, it looks like the command that's failing is the last grep in the pipeline. grep only returns 0 if it finds a match; otherwise it returns 1, and if it encounters an error, it returns 2. This is a danger of set -e; things can fail even when you don't expect them to, because commands like grep return non-zero status even if there hasn't been an error. It also fails to exit on errors earlier in a pipeline, and so may miss some error.

The solutions given by geocar or ephemient (piping through cat or using || : to ensure that the last command in the pipe returns successfully) should help you get around this, if you really want to use set -e.

Brian Campbell
Sorry the colon eats stdin.
Dennis Williamson
Not if you do it right.
ephemient
Thanks Brian, your answer was very helpful. :) I didn't know that set only processed error form the last command in a pipline.
SDGuero
Using `||:` or `;true` is not a good idea because this hides I/O errors. If there's some reason (resources, etc) that the errors could not be written to output, `cat` will still fail, causing set -e to exit out. Better advice: don't use `set -e`
geocar