I'm writing a simple .sh script to parse an Exim log file for strings matching "o'". When I view output.txt, all it contains is a 0 printed on every line (606 lines). I'm guessing my logic is wrong, since awk does not throw any errors.

Here is my code (updated for the concatenation and counter issues). Edit: I've adopted the code from dmckee's answer and am now working with it instead of my old code, for simplicity's sake.

awk '/o'\''/ {
         line = "> ";
         for(i = 20; i <= 33; i++) {
           line = line " " $i;
         }
         print line;
    }' /var/log/exim/main.log > output.txt

Any ideas?

EDIT: For clarity's sake, I'm grepping for "o'" in email addresses because ' is an illegal character in email addresses (and in our databases it appears only in o'-prefixed names).

EDIT 2: As per commentary request, here is a sanitized sample of some desired output:

[xxx.xxx.xxx.xxx] kathleen.o'[email protected] <kathleen.o'[email protected]> routing defer (-51): retry time not reached

[xxx.xxx.xxx.xxx] julie.o'[email protected] <julie.o'[email protected]> routing defer (-51): retry time not reached

[xxx.xxx.xxx.xxx] james.o'[email protected] <james.o'[email protected]> routing defer (-51): retry time not reached

[xxx.xxx.xxx.xxx] daniel_o'[email protected] <daniel_o'[email protected]> routing defer (-51): retry time not reached

I start my loop at 20 because everything before the 20th field is just standard log information that I don't need here. All I need is everything from the IP address onward (the messages for each 550 error differ for each mail server in use out there, and I'm compiling a list of the common ones).

+3  A: 

+ means numerical addition in awk. If you want to concatenate, just place the constants and/or expressions separated with spaces.

So, this

line += " " + $i

should become

line = line " " $i
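The difference is easy to see in isolation. The following is a minimal sketch (the variable names `added` and `joined` and the string "abc" are only for illustration): with `+`, awk converts both operands to numbers, and a non-numeric string converts to 0, which would explain a file full of zeros.

```shell
# "+" forces numeric context: " " and "abc" both convert to 0,
# and "> " converts to 0 as well, so the result is the number 0.
added=$(awk 'BEGIN { line = "> "; line += " " + "abc"; print line }')

# Placing expressions side by side concatenates them as strings.
joined=$(awk 'BEGIN { line = "> "; line = line " " "abc"; print line }')

echo "with +=:            [$added]"
echo "with concatenation: [$joined]"
```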

EDIT: Iff the fields in Exim log files (I am more into Postfix :) are separated by a single space, isn't the following simpler:

grep -F o\' /var/log/exim/main.log | cut -d\  -f20-33 >output.txt

?
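The single-space condition matters because cut and awk split differently. As a small illustration (the sample string is made up), cut treats every delimiter as a field boundary, while awk by default treats a run of whitespace as one separator:

```shell
# Sample line with TWO spaces between "a" and "b" (illustrative only).
sample='a  b c'

# cut sees an empty field between the two spaces, so field 2 is empty.
with_cut=$(printf '%s\n' "$sample" | cut -d' ' -f2)

# awk collapses the run of spaces, so field 2 is "b".
with_awk=$(printf '%s\n' "$sample" | awk '{ print $2 }')

echo "cut -f2: [$with_cut]"
echo "awk \$2:  [$with_awk]"
```

If the log ever pads a column with extra spaces, the field numbers passed to cut will no longer line up with awk's.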

ΤΖΩΤΖΙΟΥ
+2  A: 

There is no real need for the grep here. Let awk select the matching lines for you (with your concatenation bug fixed, as per ΤΖΩΤΖΙΟΥ):

awk '/o'\''/ {
             line = "> ";
             for(i = 20; i <= 33; i++) {
               line = line " " $i;
             }
             print line;
        }' /var/log/exim/main.log > output.txt

Of course, you end up needing some weird escaping if you do it at the prompt like above. It is cleaner in a script...
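One way to sidestep the '\'' escaping, sketched here under assumptions: pass the search string into awk with -v and match it literally with index(). The function name `extract_fields` and its three arguments (file, first field, last field) are hypothetical, not something from the question.

```shell
# Hypothetical helper: for each line of file $1 that contains o',
# print fields $2 through $3 prefixed with ">".
extract_fields() {
    awk -v pat="o'" -v first="$2" -v last="$3" '
        index($0, pat) {            # literal substring match, no regex quoting
            line = ">"
            for (i = first; i <= last; i++) {
                line = line " " $i  # concatenation, not "+"
            }
            print line
        }' "$1"
}
```

It could then be called as `extract_fields /var/log/exim/main.log 20 33 > output.txt`. Because the apostrophe lives in a double-quoted shell string, the awk program itself stays inside one plain pair of single quotes.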


Edit: On the first pass I missed the += problem...

I'm also assuming that the sample line you gave above is partial, as it has only 13 or so fields (by default, fields are whitespace-delimited).
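To find out which field numbers to loop over, it can help to print one matching line with its fields numbered. This is only a sketch; the function name `number_fields` is made up for illustration.

```shell
# Hypothetical helper: take the first line of file $1 that contains o'
# and print each of its fields as "index: value", then stop.
number_fields() {
    awk -v pat="o'" '
        index($0, pat) {
            for (i = 1; i <= NF; i++) print i ": " $i
            exit    # one line is enough to read off the positions
        }' "$1"
}
```

Running `number_fields /var/log/exim/main.log` shows at which index the bracketed IP address starts and how far the loop has to reach; using NF as the upper bound instead of a fixed number would take everything to the end of the line.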

dmckee
I tried this, and after clearing the output.txt file, I'm still just getting a bunch of 0's.
junkforce
You're correct. See my last comment on the question for why. I've updated my code accordingly.
junkforce
Yes! It works. All I had to do was widen the for loop after adopting your code. Using straight awk on the file rather than piping it through grep and other things is a lot simpler. Thank you.
junkforce
+1  A: 

"'" is not illegal in local parts. From RFC2821, section 4.1.2:

Local-part = Dot-string / Quoted-string

Dot-string = Atom *("." Atom)

Atom = 1*atext

2821 further references RFC2822 for non-locally-defined elements, so:

atext           =       ALPHA / DIGIT / ; Any character except controls,
                        "!" / "#" /     ;  SP, and specials.
                        "$" / "%" /     ;  Used for atoms
                        "&" / "'" /
                        "*" / "+" /
                        "-" / "/" /
                        "=" / "?" /
                        "^" / "_" /
                        "`" / "{" /
                        "|" / "}" /
                        "~"

In other words, "'" is a perfectly legal unquoted character to have in an email local part. Now, it may not be legal at your site, but that's not what you said.
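As a rough illustration of the grammar quoted above, the Dot-string form can be checked with an extended regular expression. This is a sketch only: it covers Dot-string but not Quoted-string, assumes ASCII input, and the names `atext` and `is_dot_string` are made up here.

```shell
# The atext characters from RFC 2822 as a bracket-expression body.
# "-" is placed last so that it is taken literally.
atext="A-Za-z0-9!#\$%&'*+/=?^_\`{|}~-"

# Hypothetical helper: succeed if $1 is one or more atoms joined by single dots.
is_dot_string() {
    printf '%s\n' "$1" | grep -Eq "^[$atext]+(\.[$atext]+)*\$"
}

if is_dot_string "julie.o'x"; then
    echo "apostrophe accepted in an unquoted local part"
fi
```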

Sorry for not staying directly on topic, but I wanted to correct your assertion.

jj33
+1  A: 

Off task, and simpler still: python.

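import fileinput
for line in fileinput.input():
    if "o'" in line:  # match the same o' substring the awk version looks for
        fields = line.split()  # split on runs of whitespace, like awk's default
        # awk's $20..$33 are 1-based; the equivalent 0-based slice is [19:33]
        print("> ", " ".join(fields[19:33]))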
S.Lott