I am trying to extract conversations from a Postfix log file based on the client that initiated them. This is the awk script that extracts the matching message IDs:

awk '/client.host.name/ && !(/timeout/||/disconnect/) { sub(":","",$6);print $6}' maillog

This is using a standard Postfix maillog as input (see below for sample data). What I think I'd like to do is a multi-pass search of the file using the results of the first search, but I'm not sure if this is the right approach. Something similar to:

awk '/client.host.name/ && !(/timeout/||/disconnect/) {sub(":","",$6);msgid=$6} $0 ~ msgid {print $0}' maillog

But, naturally, this doesn't work as expected. I'm assuming I need to do one of the following things:

  1. Pipe the output from the first awk into a second awk or grep (not sure how to use piped input as a regex).
  2. Assign the first result set to an array and use the array as a search set. Something like:
    awk '/app02/ && !(/timeout/ || /connect/) { sub(":","",$6);msgid[$6]=$6; } END { for(x in msgid) { print x; } }' maillog
    I'm not sure how I'd proceed inside the for loop though. Is there a way in awk to "rewind" the file and then grab all lines that match any element within an array?
  3. Scrap the whole deal and try it using Perl.

So, for the awk gurus... is there any way to accomplish what I'm looking for using awk?

Sample data:

Jul 19 05:07:57 relay postfix/smtpd[5462]: C48F6CE83FA: client=client.dom.lcl[1.2.3.4]

Jul 19 05:07:57 relay postfix/cleanup[54]: C48F6CE83FA: message-id=<[email protected]>

Jul 19 05:07:57 relay postfix/qmgr[12345]: C48F6CE83FA: from=, size=69261, nrcpt=6 (queue active)

Jul 19 05:08:04 relay postfix/smtp[54205]: C48F6CE83FA: to=, relay=in.example.org[12.23.34.5]:25, delay=0.7, delays=0.05/0/0.13/0.51, dsn=2.0.0, status=sent (250 ok: Message 200012345 accepted)

Jul 19 05:14:08 relay postfix/qmgr[12345]: C48F6CE83FA: removed

A:

You can use an array. Something roughly like this:

awk '/client.host.name/ && !(/timeout/||/disconnect/) {sub(":","",$6);msgid[$6]=1} {if ($FIELD in msgid) print}' maillog

You'll have to replace $FIELD with the field that contains the message ID, since I don't know which one it is in your log.

Edit: Moved a left brace.

Edit2:

Here's a version specific to your sample data:

awk '/client.dom.lcl/ && !(/timeout/||/disconnect/) {sub(":","",$6); msgid[$6] = 1} {if (gensub(":", "", 1, $6) in msgid) print}' sampledata

Edit3:

Here's a simplified version:

awk '{id = gensub(":", "", 1, $6)} /client.dom.lcl/ && !(/timeout/||/disconnect/) {msgid[id] = 1} {if (id in msgid) print}' sampledata
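
Note that gensub is a gawk extension, so the two versions above may not run on other awks. As a rough sketch of the "rewind" idea from the question, you can name the file twice on the command line and let awk read it in two passes: NR == FNR is true only while the first copy is being read. The function name extract_conversations is hypothetical, and matching on the "client=" prefix is an assumption based on the sample data (it stands in for the timeout/disconnect exclusions):

```shell
# Hypothetical helper: print every log line whose queue ID belongs to a
# message submitted by the given client. Assumes the queue ID is field 6
# and that the submitting line contains "client=<hostname>", as in the
# sample data.
extract_conversations() {
    client=$1
    log=$2
    awk -v client="$client" '
        # Pass 1: NR == FNR holds only while the first copy of the file
        # is being read. Remember the queue ID of each matching line.
        NR == FNR {
            if (index($0, "client=" client)) {
                id = $6
                sub(/:$/, "", id)      # strip the trailing colon
                wanted[id] = 1
            }
            next
        }
        # Pass 2: print each line whose queue ID was remembered.
        {
            id = $6
            sub(/:$/, "", id)
            if (id in wanted) print
        }
    ' "$log" "$log"
}
```

You would call it as `extract_conversations client.dom.lcl maillog`. Because the IDs are collected before anything is printed, this also catches lines for a message that appear earlier in the file than its client= line, and index() does a plain substring match, so the dots in the hostname are not treated as regex wildcards.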
Dennis Williamson
Dennis, this throws a syntax error right at the if. Does there need to be some kind of separator between the end of the first {} block and the if?
Justin
@Justin: Oops, I forgot some braces. See my edit.
Dennis Williamson
Dennis, this works beautifully. Thank you.
Justin