This zgrep command outputs one particular field from each line containing the word yellow, given the giant input log files for all 24 hours of 26th Feb 1989.

zgrep 'yellow' /color_logs/1989/02/26/*/1989-02-26-00_* | cut -f3 -d'+'

1) I prefer using a Perl script. Are there any advantages to using a bash script instead?

Also, when writing this script, I would like it to create one output file for each DAY after processing that day's data (so it will look at all the hours in a day).

zgrep 'yellow' /color_logs/1989/02/*/*/1989-02-26-00_* | cut -f3 -d'+'

2) How do I determine the value of the first star (in Perl) after processing a day's worth of data, so that I can put the YYMMDD in the output file's name? I'm interested in the value of the first star in the command directly above this question.

+1  A: 

When given more than one file, grep prefixes each matching line with the name of the file it came from, but your cut command is throwing that away. You could do something like:

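use strict;
use warnings;

# zgrep prints "filename:line" for each match, so the date can be read from the path
open(my $process, '-|', "zgrep 'yellow' /color_logs/1989/02/*/*/1989-02-26_*")
    or die "Cannot run zgrep: $!";
while (my $line = <$process>) {
    chomp $line;
    if ($line =~ m!/color_logs/(\d\d\d\d)/(\d\d)/(\d\d)/[^:]+:(.+)$!) {
        my ($year, $month, $day, $data) = ($1, $2, $3, $4);
        # Do the cut -f3 -d'+' on the line from the log.
        # split takes a regex, so the '+' has to be escaped.
        my $field = (split(/\+/, $data))[2];
        next unless defined $field;    # fewer than three fields on this line
        open(my $out, '>>', "${year}${month}${day}.log")
            or die "Cannot open ${year}${month}${day}.log: $!";
        print $out $field, "\n";
        close($out);
    }
}
close($process);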

That's inefficient because it opens and closes the output file for every line. You could use an IO::File object instead and only reopen it when the date changes, but you get the idea.
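
A minimal sketch of that IO::File variant, under a few assumptions: the helper name split_by_day and its $out_dir argument are made up for illustration, and the input is assumed to arrive grouped by day (the shell sorts the expanded glob, so zgrep should read the files one day after another). It keeps a single output handle open and only reopens it when the date in the path changes:

```perl
use strict;
use warnings;
use IO::File;

# Hypothetical helper: reads "path:line" records (the format zgrep prints when
# given several files) from $in and appends the third '+'-separated field of
# each line to "$out_dir/YYYYMMDD.log".
sub split_by_day {
    my ($in, $out_dir) = @_;
    my $current = '';    # date of the output file that is currently open
    my $out;             # IO::File handle, undefined until the first match
    while (my $line = <$in>) {
        chomp $line;
        next unless $line =~ m!/color_logs/(\d{4})/(\d\d)/(\d\d)/[^:]+:(.+)$!;
        my ($date, $data) = ("$1$2$3", $4);
        my $field = (split /\+/, $data)[2];
        next unless defined $field;    # fewer than three fields
        if ($date ne $current) {
            # Date changed: close the previous day's file and open the next
            $out->close if $out;
            $out = IO::File->new("$out_dir/$date.log", '>>')
                or die "Cannot open $out_dir/$date.log: $!";
            $current = $date;
        }
        print {$out} $field, "\n";
    }
    $out->close if $out;
    return;
}

# Possible usage (paths taken from the question):
#   open(my $zgrep, '-|', "zgrep 'yellow' /color_logs/1989/02/*/*/1989-02-26_*")
#       or die "Cannot run zgrep: $!";
#   split_by_day($zgrep, '.');
#   close($zgrep);
```

If the input were not grouped by day this would still produce correct files, because they are opened in append mode; it would just reopen them more often.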

Aaron