answers:

5

Here at work we are building a newsletter system that our clients can use. As an intern, one of my jobs is to help with the smaller pieces of the puzzle. In this case I need to scan the email server's logs for bounced messages and add each address, along with the reason it bounced, to a "bad email database".

The bad emails table has two columns: 'email' and 'reason'. I use the following command to pull the information from the logs and send it to the Perl script:

grep " 550 " /var/log/exim/main.log | awk '{print $5 "|" $23 " " $24 " " $25 " " $26 " " $27 " " $28 " " $29 " " $30 " " $31 " " $32 " " $33}' | perl /devl/bademails/getbademails.pl

If you have suggestions for a more efficient awk script, I would be glad to hear those too, but my main focus is the Perl script. The awk pipes strings of the form "email|reason for bounce" to the Perl script. I want to take in these strings, split them at the | and put the two parts into their respective columns in the database. Here's what I have:

#!/usr/bin/perl

use strict;
use warnings;
use DBI;

my $dbpath = "dbi:mysql:database=system;host=localhost:3306";
my $dbh = DBI->connect($dbpath, "root", "******")
    or die "Can't open database: $DBI::errstr";

while(<STDIN>) {
    my $line = $_;                                    
    my @list = # ?  this is where i am confused
    for (my($i) = 0; $i < 1; $i++)
    {
        if (defined($list[$i]))
        {
            my @val = split('|', $list[$i]);
            print "Email: $val[0]\n";
            print "Reason: $val[1]";
            my $sth = $dbh->prepare(qq{INSERT INTO bademails VALUES('$val[0]', '$val[1]')});
            $sth->execute();
            $sth->finish();
        }
    }
}
exit 0;
+6  A: 

I'm not sure what you want to put in @list? If the awk pipes one line per entry, you'll have that in $line, and you don't need the for loop on the @list.

That said, if you're going to pipe it into Perl, why bother with the grep and AWK in the first place?

#!/usr/bin/perl -w
use strict;

while (<>) {
  next unless / 550 /;
  my @tokens = split ' ', $_;
  my $addr = $tokens[4];
  my $reason = join " ", @tokens[5..$#tokens];

  # ... DBI code
}

Side note about the DBI calls: you should really use placeholders so that a "bad email" wouldn't be able to inject SQL into your database.

zigdon
+1 ... great minds think alike zigdon ;-)
toolkit
Look at the third argument to split as a way of simplifying this.
dland
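To illustrate the third argument dland mentions, here is a minimal sketch, assuming whitespace-separated input. The helper name split_head_and_tail and the sample string are made up for illustration; only the LIMIT argument of split comes from Perl itself.

```perl
use strict;
use warnings;

# Hypothetical helper: split a line on whitespace into at most $limit
# fields. The last field keeps the unsplit remainder of the line.
sub split_head_and_tail {
    my ($line, $limit) = @_;
    chomp $line;
    return split ' ', $line, $limit;
}

# Made-up sample input, not a real log line.
my @fields = split_head_and_tail("a b c d e f\n", 3);
print join('|', @fields), "\n";    # prints a|b|c d e f
```

With a limit in place, the reason text no longer has to be re-joined from a slice; it is simply the last field returned.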
+5  A: 

Why not forgo the grep and awk and go straight to Perl?

Disclaimer: I have not checked if the following code compiles:

while (<STDIN>) {
    next unless /550/; # skips over the rest of the while loop
    my @fields = split;
    my $email = $fields[4];
    my $reason = join(' ', @fields[22..32]);
    ...
}

EDIT: See @dland's comment for a further optimisation :-)

Hope this helps?

toolkit
You could split(/ /, $_, 23) in order to stop splitting after the 22nd space; the rest of the line then stays in the last field. That avoids having to slice the @fields afterwards.
dland
+7  A: 

Something like this would work:

while(<STDIN>) {
  my $line = $_;
  chomp($line);
  my ($email,$reason) = split(/\|/, $line);
  print "Email: $email\n";
  print "Reason: $reason";
  my $sth = $dbh->prepare(qq{INSERT INTO bademails VALUES(?, ?)});
  $sth->execute($email, $reason);
  $sth->finish();
}

You might find it easier to just do the whole thing in Perl. "next unless / 550 /" could replace the grep and a regex could probably replace the awk.
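A note on the split in the loop above: the first argument to split is treated as a regex, so an unescaped '|' is an alternation of two empty patterns and matches between every character. The sketch below contrasts the two forms; the helper names and the sample string are hypothetical.

```perl
use strict;
use warnings;

# Unescaped: '|' is regex alternation, so the string is split
# into individual characters.
sub split_unescaped {
    my ($s) = @_;
    return split '|', $s;
}

# Escaped: splits on the literal pipe character.
sub split_escaped {
    my ($s) = @_;
    return split /\|/, $s;
}

print join(',', split_unescaped('ab|c')), "\n";    # prints a,b,|,c
print join(',', split_escaped('ab|c')), "\n";      # prints ab,c
```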

Glomek
+3  A: 
my(@list) = split /\|/, $line;

This will generate more than two entries in @list if you have extra pipe symbols in the tail of the line. To avoid that, use:

$line =~ m/^([^|]+)\|(.*)$/;
my(@list) = ($1, $2);

The dollar in the regex is arguably superfluous, but also documents 'end of line'.

Jonathan Leffler
I'd suggest rather using "my @list = split /\|/, $line, 2" to force splitting into two strings.
tsee
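A minimal sketch of tsee's suggestion, assuming the input format described in the question (address, pipe, reason). The sub name parse_bounce and the sample line are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: split one "email|reason" line into exactly
# two parts. The limit of 2 keeps any later pipes inside the reason.
sub parse_bounce {
    my ($line) = @_;
    chomp $line;
    my ($email, $reason) = split /\|/, $line, 2;
    return ($email, $reason);
}

# Made-up sample line with an extra pipe in the reason text.
my ($email, $reason) = parse_bounce('user@example.com|mailbox full | try later');
print "Email: $email\n";
print "Reason: $reason\n";
```

The extra pipe in the sample stays part of the reason instead of producing a third list element.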
Never use $1 unless you've checked the success of the match! If the match fails, you get a stale $1. Bad idea.
Randal Schwartz
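One way to follow this advice is to read the capture groups only inside the branch where the match succeeded. This is a sketch under the same input assumptions as the question; parse_bounce_checked is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: return (email, reason) when the line matches,
# or an empty list when it does not, so stale $1/$2 are never used.
sub parse_bounce_checked {
    my ($line) = @_;
    chomp $line;
    if ($line =~ m/^([^|]+)\|(.*)$/) {
        return ($1, $2);
    }
    return;
}

# Made-up sample lines: one well-formed, one without a pipe.
for my $line ('user@example.com|mailbox full', 'no pipe here') {
    if (my ($email, $reason) = parse_bounce_checked($line)) {
        print "Email: $email, Reason: $reason\n";
    }
    else {
        print "Skipping malformed line: $line\n";
    }
}
```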
+5  A: 

Have you considered using App::Ack? Rather than shelling out to an external program, you can do the filtering in Perl itself. Unfortunately, you'll have to read through the ack program code to really get a sense of how to do this, but you should get a more portable program as a result.

Ovid