Dear all, I am writing a Python program that retrieves EDIFACT log messages from a .gz file. Two example log entries follow:

2009/03/02 12:13:59.642396 siamp102 mux1-30706 Trace name: MSG
Message sent [con=251575 (APEOBEinMux1), len=2106, CorrID=000182C42DE0ED]
UNB+IATB:1+1ASRPFA+1A0APE+090302:1213+0095JQOL2

2009/03/02 12:14:00.029496 siamp102 mux1-30706 Trace name: MSG
Message sent [con=737 (APIV2_1), len=22370, CorrID=000182C42DE0ED]
UNB+IATB:1+1ASIFQLFS+1ARIOFS+090302:1214+0122V11ON9

I would like to write a regular expression that matches some fields from the first line, some from the second and some from the third.

Is there any way to write a regular expression for use with grep that matches fields from consecutive lines?

Thanks in advance!

+1  A: 

With grep alone, I think this is not possible. I would suggest awk or Perl, so that you can save some context from previous lines.

In Perl this gives something like:

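#!/usr/bin/env perl
use strict;
use warnings;

my $isInLogSection = 'NO';
my $logSectionID;
while (<>) {
    if ( /siamp102/ ) {
        # Start of log section: retrieve its ID (first whitespace-separated field)
        $isInLogSection = 'YES';
        my @fields = split;
        $logSectionID = $fields[0];
    }

    if ($isInLogSection eq 'YES' && /len=(\d+)/) {
        # Retrieve value of len (captured by the parentheses above)
        my $len = $1;
        # ... process $len as needed
    }

    if ( /^$/ ) {
        # End of log section
        $isInLogSection = 'NO';
    }
}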

In awk this gives something like:

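BEGIN { isInLogSection = "NO" }
/siamp102/ { isInLogSection = "YES"; logSectionID = $1 }
/len=/ {
    if (isInLogSection == "YES" && match($0, /len=[0-9]+/)) {
        # Retrieve len value (skip the 4 characters of "len=")
        len = substr($0, RSTART + 4, RLENGTH - 4)
    }
}
/^$/ { isInLogSection = "NO" }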

I am not 100% certain of the exact syntax. This is mainly a skeleton to illustrate the principle: remember which section you are in while reading line by line.
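
Since the question mentions a Python program, the same result can probably be obtained without grep by applying one multi-line regular expression to the decompressed text. The following is only a minimal sketch: it assumes every entry consists of exactly three consecutive lines shaped like the two samples in the question, and that the log is small enough to read into memory. The names ENTRY_RE, parse_entries and parse_gz, as well as the group names, are hypothetical and chosen for illustration.

```python
import gzip
import re

# One log entry = header line, "Message sent" line, UNB line (three consecutive lines).
# The field layout is an assumption based on the two sample entries in the question.
ENTRY_RE = re.compile(
    r"^(?P<date>\d{4}/\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<host>\S+) (?P<process>\S+) Trace name: (?P<trace>\S+)[ \t]*\n"
    r"Message sent \[con=(?P<con>\d+) \((?P<con_name>[^)]*)\), "
    r"len=(?P<len>\d+), CorrID=(?P<corr_id>\w+)\][ \t]*\n"
    r"UNB\+(?P<unb>\S+)",
    re.MULTILINE,  # lets ^ match at the start of every line, not only the first
)


def parse_entries(text):
    """Return one dict of named fields per three-line log entry found in text."""
    return [match.groupdict() for match in ENTRY_RE.finditer(text)]


def parse_gz(path):
    """Read a gzip-compressed log as text and parse it (path is a placeholder)."""
    with gzip.open(path, "rt", encoding="ascii", errors="replace") as handle:
        return parse_entries(handle.read())


# The two sample entries from the question, used as test input.
SAMPLE = """\
2009/03/02 12:13:59.642396 siamp102 mux1-30706 Trace name: MSG
Message sent [con=251575 (APEOBEinMux1), len=2106, CorrID=000182C42DE0ED]
UNB+IATB:1+1ASRPFA+1A0APE+090302:1213+0095JQOL2

2009/03/02 12:14:00.029496 siamp102 mux1-30706 Trace name: MSG
Message sent [con=737 (APIV2_1), len=22370, CorrID=000182C42DE0ED]
UNB+IATB:1+1ASIFQLFS+1ARIOFS+090302:1214+0122V11ON9
"""

if __name__ == "__main__":
    for entry in parse_entries(SAMPLE):
        print(entry["date"], entry["time"], entry["con"], entry["len"], entry["unb"])
```

Each named group picks a field from one of the three lines, and the explicit \n in the pattern is what ties consecutive lines together. For very large logs, the line-by-line approach shown above in Perl and awk avoids loading the whole file at once.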

mouviciel
Could you please suggest the awk syntax for selecting some elements from the first, second and third lines? Thanks.
IceMan85
Oops, sorry: when you added your comment I was editing my answer to add a Perl example...
mouviciel
awk version added.
mouviciel