I am looking for a regular expression (or another method, if there is one) for detecting bounced email messages. So far I have been going through our unattended mailbox and adding the strings I find to a regex. I figured someone would already have something more complete, which would save me from reinventing the wheel.

Here is an example of what I have so far:

/reason: 550|permanent fatal errors|Error 550|Action: Failed|Mailbox does not exist|Delivery to the following recipients failed/i
+1  A: 

Email servers are too varied for this to work 100% of the time, but you might have better luck looking in the headers of the message instead of its body, since the headers are meant to be machine readable and the body is not.

I'd start by looking for any headers with 'error' in them.
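
To make the header idea concrete, here is a rough Python sketch using only the standard library `email` package. The function name `looks_like_bounce` and the particular set of checks are my own assumptions, not a complete or authoritative detector: it looks for the RFC 3464 `multipart/report; report-type=delivery-status` content type, an empty `Return-Path`, Exim's `X-Failed-Recipients` header, and any header name containing "error".

```python
from email import message_from_string
from email.message import Message


def looks_like_bounce(msg: Message) -> bool:
    """Heuristic bounce check based on headers only (hypothetical helper)."""
    # RFC 3464 delivery status notifications are sent as multipart/report
    # with report-type=delivery-status.
    if msg.get_content_type() == "multipart/report":
        report_type = msg.get_param("report-type")
        if isinstance(report_type, str) and report_type.lower() == "delivery-status":
            return True

    # Bounces are normally sent with an empty envelope sender. Some
    # auto-replies do this too, so this check can produce false positives.
    if (msg.get("Return-Path") or "").strip() == "<>":
        return True

    for name in msg.keys():
        lowered = name.lower()
        # X-Failed-Recipients is added by Exim to the bounces it generates.
        if lowered == "x-failed-recipients":
            return True
        # Crude "header with 'error' in it" check. Errors-To is skipped
        # because ordinary mailing-list mail often carries it.
        if "error" in lowered and lowered != "errors-to":
            return True

    return False


if __name__ == "__main__":
    raw = (
        "Return-Path: <>\n"
        "Content-Type: multipart/report; report-type=delivery-status;"
        ' boundary="b"\n'
        "Subject: Undelivered Mail Returned to Sender\n"
        "\n"
        "--b\nContent-Type: text/plain\n\nDelivery failed.\n--b--\n"
    )
    print(looks_like_bounce(message_from_string(raw)))
```

Servers that do not follow RFC 3464 will slip past the first check, so treat this as a first filter and fall back to body matching for the rest.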

zigdon
If you've ever been the victim of mass mailing (by victim, I mean some asshat did the mass mailing with your email address as the 'from' address) then you'll see the astonishing variety of potential bounce email subject lines.
Jherico
+1  A: 

It may be overkill for your case, but the most accurate solution is probably to use a spam filtering tool: they all need to be able to handle bounces gracefully, and they will have put a lot of effort into reducing false positives.

I would suggest SpamAssassin, personally. It is packaged as a Perl module with a command-line interface, "spamassassin", that can probably be coerced into doing what you need. The bounce message rule is called (unsurprisingly) BOUNCE_MESSAGE. It is, unfortunately, not as simple as a regular expression you can copy.
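
If your mail already passes through SpamAssassin, one way to use that rule without touching Perl is to read the `X-Spam-Status` header it adds and check whether BOUNCE_MESSAGE appears in the `tests=` list. The Python sketch below assumes that header is present in its usual `tests=RULE_A,RULE_B ...` form and that the rule is enabled in your SpamAssassin configuration; the function names are hypothetical.

```python
import re
from email import message_from_string
from email.message import Message


def spamassassin_tests(msg: Message) -> set:
    """Return the rule names listed in X-Spam-Status (hypothetical helper)."""
    status = str(msg.get("X-Spam-Status", ""))
    # The tests= list may be folded across lines after a comma,
    # so drop any whitespace that follows a comma before matching.
    status = re.sub(r",\s+", ",", status)
    match = re.search(r"tests=(\S*)", status)
    if not match:
        return set()
    return {name for name in match.group(1).split(",") if name}


def flagged_as_bounce(msg: Message) -> bool:
    # Assumes the BOUNCE_MESSAGE rule is active on the scanning server.
    return "BOUNCE_MESSAGE" in spamassassin_tests(msg)


if __name__ == "__main__":
    raw = (
        "X-Spam-Status: No, score=0.1 required=5.0 tests=BAYES_00,\n"
        "\tBOUNCE_MESSAGE,HTML_MESSAGE autolearn=no\n"
        "Subject: Undelivered Mail Returned to Sender\n"
        "\n"
        "body\n"
    )
    print(flagged_as_bounce(message_from_string(raw)))
```

This only reads what SpamAssassin has already decided, so it is no more accurate than the scan that produced the header.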

dlowe
A: 

You're probably better off looking at the full headers of some bounced messages and identifying common elements in the X- headers that the server may have included. This will give you far fewer false positives than subject-line parsing.

Jherico