Bash scripting does my head in. I have searched for regex assignment, but I'm not really finding answers I understand.

I have files in a directory. I need to loop through the files and check if they fit certain criteria. File names under a certain sequence need to have their sequence increased. Those over a certain sequence need to trigger an alert.

I have pseudo code and need help turning it into correct bash syntax:

#!/bin/sh

function check_file()
{
    # example file name "LOG_20101031144515_001.csv"
    filename=$1

    # attempt to get the sequence (ex. 001) part of file

    # if sequence is greater than 003, then raise alert

    # else change file name to next sequence (ex. LOG_20101031144515_002.csv)
}

for i in `ls -Ar`; do check_file $i; done;

If PHP were an option, I could do the following:

function check_file($file){
    //example file 'LOG_20101031144515_001.csv';

    $parts = explode('.',$file);
    preg_match('/\d{3}$/', $parts[0], $matches);
    if ($matches){
        $sequence = $matches[0];
        $seq      = intval($sequence);

        if ($seq > 3){
            // do some code to fire alert (ex. email webmaster)
        }
        else{
            // rename file with new sequence
            $new_seq  = sprintf('%03d', ++$seq);
            $new_file = str_replace("_$sequence", "_$new_seq", $file);
            rename($file, $new_file);
        }
    }
}

So long story short, I'm hoping someone can help port the PHP check_file function to the bash equivalent.

Thank you

+2  A: 

PHP IS an option. If you know PHP well, you can run it from the shell. Run

php myfile.php

and get the output right in the console. If the PHP file is executable and begins with

#!/path/to/php/executable

then you can run

./myfile.php

I'm no big expert in bash programming, but to obtain the list of files that match a certain pattern you can use the command

ls -l | grep "pattern_unquoted"

I suggest you go for PHP ;-)

djechelon
Thanks for the suggestion. However, I won't be running the script from command line. It will live in a cronned shell script.
Jordan
+1: If Jordan can program PHP and not Bash, then PHP is probably the way forward... particularly if Jordan is the one who is going to have to maintain the script.
Johnsyweb
@Jordan: If you can run from the command-line, you can run from Cron.
Johnsyweb
+3  A: 

A different take on the problem:

#!/bin/sh

YOUR_MAX_SEQ=3

find /path/to/files -maxdepth 1 -name 'LOG_*.csv' -print \
  | sed -e 's/\.csv$//' \
  | awk -F_ '$3 > SEQ { print }' SEQ=$YOUR_MAX_SEQ

Brief explanation:

  • Find all files in /path/to/files matching LOG_*.csv
  • Chop the .csv off the end of each line
  • Using _ as a separator, print lines where the third field is greater than $YOUR_MAX_SEQ

This will leave you with a list of the files that met your criteria. Optionally, you could pipe the output through sed to stick the .csv back on.
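
To try the filter on its own, here is a minimal sketch that pushes three made-up file names through the same sed and awk stages, with printf standing in for find. The sample names and the helper name filter_over_max are assumptions for illustration. Adding 0 to both sides forces a numeric comparison, and the field numbering only holds as long as nothing before the file name (such as the directory path) contains an underscore.

```sh
#!/bin/sh
# Sketch only: filter_over_max and the sample names are hypothetical.
YOUR_MAX_SEQ=3

# Strip ".csv", split on "_", keep lines whose third field exceeds the limit.
filter_over_max() {
    sed -e 's/\.csv$//' \
      | awk -F_ -v SEQ="$YOUR_MAX_SEQ" '$3 + 0 > SEQ + 0 { print }'
}

printf '%s\n' \
    LOG_20101031144515_001.csv \
    LOG_20101031144515_003.csv \
    LOG_20101031144515_004.csv \
  | filter_over_max
# prints LOG_20101031144515_004
```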

If you're comfortable with PHP, you'd probably be comfortable with Perl, too.

Blrfl
Hi. Thanks for this. However it does not seem to handle the part about incrementing the sequence on files where current sequence is less than 3 and then saving file name with new incremented file name
Jordan
Replace the `print` in the `awk` with `printf "%s_%s_%03d.csv\n", $1, $2, $3+1 }' SEQ=$YOUR_MAX_SEQ`.
Blrfl
Oops... The `SEQ=$YOUR_MAX_SEQ` doesn't belong in my last comment.
Blrfl
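Taking the two comments above together, the awk stage with the printf in place might look like the sketch below. It assumes the comparison is flipped to "at or below the maximum", since those are the files that get a new sequence; the helper name bump_names and the sample names are made up for the example. It only prints the new names and does not rename anything.

```sh
#!/bin/sh
# Sketch only: bump_names and the sample names are hypothetical.
YOUR_MAX_SEQ=3

# For names whose sequence is at or below the limit, print the name
# with the sequence incremented and zero-padded to three digits.
bump_names() {
    sed -e 's/\.csv$//' \
      | awk -F_ -v SEQ="$YOUR_MAX_SEQ" \
          '$3 + 0 <= SEQ + 0 { printf "%s_%s_%03d.csv\n", $1, $2, $3 + 1 }'
}

printf '%s\n' \
    LOG_20101031144515_001.csv \
    LOG_20101031144515_004.csv \
  | bump_names
# prints LOG_20101031144515_002.csv
```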
A: 

First of all, your question is tagged [bash], but your shebang is #!/bin/sh. I'm going to assume Bash.

#!/bin/bash
function check_file()
{
    # example file name "LOG_20101031144515_001.csv"
    filename=$1

    # attempt to get the sequence (ex. 001) part of file

    seq=${filename%.csv}
    seq=${seq##*_}

    # if sequence is greater than 003, then raise alert

    if (( 10#$seq > 3 ))
    then
        echo "Alert!"
    else
        # else change file name to next sequence (ex. LOG_20101031144515_002.csv)
        printf -v newseq "%03d" $((10#$seq + 1))   # 10# keeps leading zeros from being read as octal
        echo "${filename/_$seq./_$newseq.}" # or you could set a variable or do a mv
    fi
}
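
As a variation, the sketch below returns a status instead of echoing the alert, and builds the new name by cutting at the last underscore, so the result does not depend on whether the same digits also appear in the timestamp. The names next_name and MAX_SEQ are assumptions for the example and are not part of the code above. It requires Bash, not plain sh.

```bash
#!/bin/bash
# Sketch only: next_name and MAX_SEQ are hypothetical names.
MAX_SEQ=3

# Print the next file name; return 1 if the sequence exceeds MAX_SEQ,
# or 2 if the name does not end in _NNN.csv.
next_name() {
    local filename=$1 seq newseq
    seq=${filename%.csv}                    # strip the extension
    seq=${seq##*_}                          # keep what follows the last underscore
    [[ $seq =~ ^[0-9]{3}$ ]] || return 2    # not a three-digit sequence
    (( 10#$seq > MAX_SEQ )) && return 1     # over the limit: caller raises the alert
    printf -v newseq '%03d' "$(( 10#$seq + 1 ))"
    printf '%s\n' "${filename%_*}_${newseq}.csv"
}

next_name LOG_20101031144515_001.csv    # prints LOG_20101031144515_002.csv
next_name LOG_20100101000000_003.csv    # prints LOG_20100101000000_004.csv
next_name LOG_20101031144515_004.csv || echo "Alert!"
```

In a loop, a glob such as `for f in LOG_*.csv` avoids parsing the output of ls, and the printed name can be handed to mv once the output looks right.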
Dennis Williamson
@Dennis. Thank you for the feedback; it looks very close to what I am trying to do. However, I copied the code verbatim into a test file and it's throwing the following error about the function's final closing curly bracket: syntax error near unexpected token `}'
Jordan
@Dennis. Again, thank you. I got beyond curly bracket syntax issue; the script was missing "fi" to close the if block. Also, the script was missing the bit to increment the seq; I added that as "seq=$(( seq + 1 ))". Now I am stuck on the substitution in the last echo. In the line, echo "${filename%%$seq.csv}$newseq.csv", $seq is not being replaced by $newseq. Rather the "$newseq.csv" is being appended and the resulting output is LOG_20101031144515_001.csv002.csv rather than LOG_20101031144515_002.csv
Jordan
Success! After some hunting, I found the bash substitution pattern and replaced the echo line in Dennis' code with: echo "${filename/$seq/$newseq}"
Jordan
@Jordan: Sorry about the omissions and errors. I've edited my answer.
Dennis Williamson