I have a bot that replies to users. But sometimes when my bot sends its reply, the user or their email provider will auto-respond (vacation message, bounce message, error from mailer-daemon, etc). That is then a new message from the user (so my bot thinks) that it in turn replies to. Mail loop!

I'd like my bot to only reply to real emails from real humans. I'm currently filtering out email that admits to being bulk precedence or from a mailing list or has the Auto-Submitted header equal to "auto-replied" or "auto-generated" (see code below). But I imagine there's a more comprehensive or standard way to deal with this. (I'm happy to see solutions in other languages besides Perl.)

NB: Remember to have your own bot declare that it is autoresponding! Include

Auto-Submitted: auto-replied

in the header of your bot's email.

My original code for avoiding mail loops follows. Only reply if realmail returns true.

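sub realmail {
  my ($email) = @_;
  # Look only at the header block (everything before the first blank line).
  my ($head) = split /\r?\n\r?\n/, $email, 2;
  $head = '' unless defined $head;
  # Capture in list context so a failed match gives undef rather than a
  # stale $1 left over from the previous successful match.
  my ($subject)    = $head =~ /^Subject:\s*(.*)$/mi;
  my ($precedence) = $head =~ /^Precedence:\s*(.*)$/mi;
  my ($autosub)    = $head =~ /^Auto-Submitted:\s*(.*)$/mi;
  $_ = defined $_ ? $_ : '' for $subject, $precedence, $autosub;

  return !($precedence =~ /bulk|list|junk/i ||
           $autosub =~ /auto-replied|auto-generated/i ||
           $subject =~ /^undelivered mail returned to sender\s*$/i
          );
}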

(The Subject check is surely unnecessary; I just added these checks one at a time as problems arose and the above now seems to work so I don't want to touch it unless there's something definitively better.)

+1  A: 

My answer deals only with bounces, which are more straightforward.

Checking for the DSN (Delivery Status Notification) markers will help you detect a DSN/bounce message. A DSN should go to the Return-Path address and not to Reply-To.

Here's a sample of a typical DSN message. The header information includes the message ID, and the content type has specific values (delivery-status), etc.

I can't provide any Perl code, just my two cents.

PS: Note that not all mail servers or MTAs conform to this, but I guess most do.
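
A rough sketch of the DSN check described above, in Perl to match the question. The helper name looks_like_dsn is hypothetical. It assumes the bounce either declares a multipart/report content type with report-type=delivery-status or carries the null return path that bounces are normally sent with. Non-conforming MTAs will slip past it.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the raw message looks like a DSN (bounce).
sub looks_like_dsn {
    my ($email) = @_;
    my ($head) = split /\r?\n\r?\n/, $email, 2;
    return 0 unless defined $head;
    # Unfold continuation lines so a wrapped Content-Type reads as one line.
    $head =~ s/\r?\n[ \t]+/ /g;
    # DSNs are multipart/report with report-type=delivery-status.
    return 1 if $head =~ m{^Content-Type:\s*multipart/report\b[^\n]*\breport-type="?delivery-status}mi;
    # Bounces use a null envelope sender, recorded as "Return-Path: <>".
    return 1 if $head =~ /^Return-Path:\s*<>\s*$/mi;
    return 0;
}
```

This would run before any other filtering, so a bounce is dropped without a reply.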

o.k.w
+2  A: 

That really sounds like something that's probably available as a module from CPAN, but I didn't find anything clearly relevant in five minutes of searching. Mail::Lite::Mbox::Processor looks like it might do what you want:

Mail::Lite::Message::Matcher is a framework for automated mail processing. For example you have a mail server and you have a need to process some types of incoming mail messages automatically. For example, you can extract automated notifications, invoices, alerts etc. from your mail flow and perform some tasks based on content of those messages.

but its docs are sparse enough that it isn't immediately obvious whether it provides those example functions itself or if you have to provide the code to drive them.

In any case, though, if you haven't already checked CPAN, that's where I would start if I wanted to do something like this.

Dave Sherohman
+1  A: 

There should be a standard way of dealing with this, but the problem is that you'd have to assume that systems sending auto-replies comply with that standard, when most of the time they just don't.

How do you get the address that you reply to? I hope you aren't using the From: header. Check the Reply-to: header first and if that doesn't exist, use the Return-path:.
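
One way to sketch that lookup order in Perl. The helper name reply_address is made up for illustration, and the address extraction is deliberately naive: it takes the first thing that looks like local@domain. A real parser such as a CPAN address module would be more robust.

```perl
use strict;
use warnings;

# Hypothetical helper: pick the address to reply to, preferring Reply-To
# and falling back to Return-Path. The From header is never consulted.
sub reply_address {
    my ($email) = @_;
    my ($head) = split /\r?\n\r?\n/, $email, 2;
    return undef unless defined $head;
    for my $name ('Reply-To', 'Return-Path') {
        # Naive extraction: the first token on that header line that
        # looks like local@domain.
        if ($head =~ /^\Q$name\E:[^\n]*?([^\s<>"(),;:]+\@[^\s<>"(),;:]+)/mi) {
            return $1;
        }
    }
    return undef;    # nothing usable (e.g. "Return-Path: <>"): do not reply
}
```

An undefined result is a signal to stay silent, which also covers bounces that arrive with an empty return path.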

But whatever you do, you will simply have to keep a log of what you sent to whom and throttle your bot to some sensible value of messages per time.

innaM
+7  A: 

RFC 3834 provides some guidance for what you should do, but here are some concrete guidelines:

Set your envelope sender to a different email address than your auto-responder so bounces don't feed back into the system.

I always store in a database a key of when an email response was sent from a specific address to another address. Under no circumstance will I ever respond to the same address more than once in a 10 minute period. This alone stopped all loops, but doesn't ensure nice behavior (auto-responses to mailing lists are annoying).
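
A minimal sketch of that rule, using an in-memory hash in place of the database. The names may_reply, %last_sent and $MIN_INTERVAL are made up for illustration. A real bot would persist the timestamps so the limit survives restarts.

```perl
use strict;
use warnings;

my %last_sent;               # "from\0to" => epoch seconds of the last reply
my $MIN_INTERVAL = 10 * 60;  # the 10-minute window described above

# Hypothetical helper: returns 1 and records the send if a reply from
# $from to $to is allowed at time $now, otherwise returns 0.
sub may_reply {
    my ($from, $to, $now) = @_;
    $now = time unless defined $now;
    my $key  = lc($from) . "\0" . lc($to);
    my $prev = $last_sent{$key};
    return 0 if defined $prev && $now - $prev < $MIN_INTERVAL;
    $last_sent{$key} = $now;
    return 1;
}
```

Calling may_reply just before sending, and skipping the send when it returns 0, caps any loop at one message per address pair per window.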

Make sure you add any permutation of header that other people are matching on to stop loops. Here's the list I use:


X-Loop: autoresponder
Auto-Submitted: auto-replied
Precedence: bulk (autoreply)

Here are some header regexes I use to avoid loops and to try to play nice:


 /^precedence:\s+(?:bulk|list|junk)/i
 /^X-(?:Loop|Mailing-List|BeenThere|Mailman)/i
 /^List-/i
 /^Auto-Submitted:/i
 /^Resent-/i
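
For illustration, here is one way those patterns might be applied, testing each header line in turn. The helper name has_automated_header is hypothetical, and the sketch does not unfold continuation lines.

```perl
use strict;
use warnings;

# The patterns from the list above.
my @AUTOMATED = (
    qr/^precedence:\s+(?:bulk|list|junk)/i,
    qr/^X-(?:Loop|Mailing-List|BeenThere|Mailman)/i,
    qr/^List-/i,
    qr/^Auto-Submitted:/i,
    qr/^Resent-/i,
);

# Hypothetical helper: true if any header line matches one of the patterns.
sub has_automated_header {
    my ($email) = @_;
    my ($head) = split /\r?\n\r?\n/, $email, 2;
    return 0 unless defined $head;
    for my $line (split /\r?\n/, $head) {
        for my $re (@AUTOMATED) {
            return 1 if $line =~ $re;
        }
    }
    return 0;
}
```

Note that the Auto-Submitted pattern matches any value, including "Auto-Submitted: no", which RFC 3834 defines for mail that is not automatic. That errs on the side of not replying.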

I also avoid responding if any of these are the envelope sender:


if ($sender eq ""
    || $sender =~ /^(?:request|owner|admin|bounce|bounces)-|-(?:request|owner|admin|bounce|bounces)\@|^(?:mailer-daemon|postmaster|daemon|majordomo|mailman|bounce)\@|(?:listserv|listsrv)/i) {
    # empty, list or daemon envelope sender: do not respond
}
Neil Neely