Hi there! I have a log file that needs to be reformatted so it is readable. The text file has no fixed number of lines, no fixed primary values, and a varying number of spaces. The only reliable marker is a log file header, which can be used to pinpoint where each run of the application's logging starts and ends.

An Example of the log file:

Log File header
<text>
<text>
Log File header
<text>

After the log file has been formatted, it should look something like this:

Log File header
<text>
<text>

<space>

Log File header
<text>
<text>

Therefore I need some advice on grepping out an entire paragraph every time the Perl script detects a "Log File header".

Here is the Perl script that does the grep:

#!/usr/bin/perl
use strict;
use warnings;

#use 5.010; # must be present to import the new 5.10 functions; note
#that it is 5.010, not 5.10

my $file = "/root/Desktop/Logfiles.log";

# Three-argument open with a lexical filehandle
open( my $log, '<', $file ) or die "The file $file has the error of:\n =>  $!";
my @lines = <$log>;
close($log);

# grep keeps only the lines that contain the header text
my @array = grep { /Log File header/ } @lines;

print @array;

Can someone please give some advice on the code? Thanks.

A: 

So you just want vertical space in between your log file sections?

There are a few approaches, particularly because you know the header will be on a completely separate line. In all the following examples assume that @lines has already been populated from your input file.

First technique: print blank lines before each header:

foreach my $line ( @lines ) {
    if ( $line =~ m/Log File header/ ) {
        print( "\n\n\n" ); # or whatever you want <space> to be
    }

    print( $line );
}
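
Note that the loop above also prints the separator before the very first header. A minimal sketch of a variant that skips it, assuming the header text is literally "Log File header"; the sub name add_spacing and the sample lines are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the lines with $space inserted
# before every header except the first one.
sub add_spacing {
    my ( $space, @lines ) = @_;
    my @out;
    my $seen_header = 0;
    foreach my $line (@lines) {
        if ( $line =~ m/Log File header/ ) {
            push @out, $space if $seen_header;    # nothing before the first header
            $seen_header = 1;
        }
        push @out, $line;
    }
    return @out;
}

# Made-up sample data standing in for the contents of the log file.
my @lines = ( "Log File header\n", "a\n", "b\n", "Log File header\n", "c\n" );
print add_spacing( "\n\n\n", @lines );
```

Returning a list instead of printing inside the loop keeps the helper easy to check against a small sample before running it on the real log.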

The next technique is to use a regular expression to search/replace blocks of text:

my $space = "\n\n\n"; # or whatever you want <space> to be
my $everything = join( "", @lines );
$everything =~ s/(Log File header.*?)(?=Log File header)/$1$space/sg;
print( $everything );

Some explanation of the regexp. The (?=...) construct is a look-ahead: it must match, but it does not form part of the text being replaced. Of the /sg modifiers, s lets . match newline characters as well (by default it does not), and g makes the search-and-replace global. The .*? means match anything, but as little as possible to satisfy the expression (non-greedy), which is extremely important in this application: a greedy .* would run all the way to the last header, so only one separator would be inserted.
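
To see the pieces working together, here is a small self-contained sketch with made-up sample lines (the sub name space_sections is hypothetical). The array is joined into one string first because the pattern has to match across line boundaries, which it cannot do when applied to one array element at a time.

```perl
use strict;
use warnings;

# Hypothetical helper: joins the lines into one string and appends $space
# to every section that is followed by another header.
sub space_sections {
    my ( $space, @lines ) = @_;
    my $everything = join( "", @lines );    # one string, so a match can span lines
    $everything =~ s/(Log File header.*?)(?=Log File header)/$1$space/sg;
    return $everything;
}

# Made-up sample data standing in for the contents of the log file.
my @lines = ( "Log File header\n", "a\n", "Log File header\n", "b\n" );
print space_sections( "\n\n\n", @lines );
```

The last section gets no trailing separator, because no further header follows it and the look-ahead therefore fails there.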

update: edited first technique in which I'd failed to explicitly specify which variable to do the match upon.

PP
Thanks mate! The first technique only seems to print the entire log with no changes but the second technique works just fine! Thanks again mate!
JavaNoob
Oh my mistake, I will edit the first answer, I didn't explicitly specify which variable to match on. You should find the first one works now.
PP
The first code still doesn't work, as it just reprints the entire log again. Mind if I ask why you need the line "my $everything = join( "", @lines );" for the second, working code? How does it contribute to the regular expression? Does it have something to do with the /s? Thanks again mate!
JavaNoob
Please look at `perldoc -f join` for documentation on the join function. If you're unsure of the difference between scalars and arrays it's time to learn the basics of Perl.
PP