GaryW021199

How to detect bounced email address

I'm writing a mailing/discussion list application in PHP.  I have outgoing mail working, and I'm capturing bounce messages.  Now I need to figure out how to process those bounce messages, so I can remove bouncing email addresses from future mailings.

My client is currently using SmartList for the discussion list, and I notice that sometimes a bounce message will contain the entire original message, which in turn can include a multitude of quotes from earlier messages in the thread.  Thus, there can be lots of different email addresses in the body of the message and all the quotes.

How do some of those commercial packages that can do this determine things like hard bounces, soft bounces, etc.?

Or to simplify, how to simply tell what address bounced?

Is there any current PHP script or class for doing so?
ASKER CERTIFIED SOLUTION
Marcus Bointon
This solution is only available to members.
GaryW021199

ASKER

I think I can handle VERP addressing.  However, it means changing the way my messages go out.  Currently, I am sending out the mail using multiple BCC recipients, so I only need to call the PHP mail() function once for every 50 recipients.  To implement VERP, I'd have to call mail() once per recipient.  This will undoubtedly have performance implications, and this application is for 4 active discussion lists.
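
For reference, here is a rough sketch of what per-recipient VERP addressing could look like in PHP. The function names, the "list-bounces" local part, and the "lists.example.org" domain are all made up for illustration, and the "+" and "=" separators are just one common convention.

```php
<?php
// Hypothetical helper: build a VERP envelope sender for one recipient.
function verpEncode(string $recipient, string $localPart, string $domain): string
{
    $pos = strrpos($recipient, '@');
    if ($pos === false) {
        throw new InvalidArgumentException('Recipient has no @ sign');
    }
    // The recipient's "@" becomes "=" so the whole address fits in a local part.
    $encoded = substr($recipient, 0, $pos) . '=' . substr($recipient, $pos + 1);
    return $localPart . '+' . $encoded . '@' . $domain;
}

// Hypothetical helper: recover the recipient from the address a bounce arrived at.
// Returns null when the address is not one of our VERP addresses.
function verpDecode(string $bounceAddress, string $localPart, string $domain): ?string
{
    $prefix = $localPart . '+';
    $suffix = '@' . $domain;
    if (strncasecmp($bounceAddress, $prefix, strlen($prefix)) !== 0) {
        return null;
    }
    if (strcasecmp(substr($bounceAddress, -strlen($suffix)), $suffix) !== 0) {
        return null;
    }
    $encoded = substr($bounceAddress, strlen($prefix), -strlen($suffix));
    // A domain cannot contain "=", so the last "=" is the one that replaced "@".
    $pos = strrpos($encoded, '=');
    if ($pos === false) {
        return null;
    }
    return substr($encoded, 0, $pos) . '@' . substr($encoded, $pos + 1);
}
```

With mail(), the envelope sender is normally set through the fifth argument, along the lines of mail($to, $subject, $body, $headers, '-f' . $sender). Whether the host's sendmail wrapper honours -f is something to test; some shared hosts override it.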

It has been suggested that I use SMTP instead of direct mail(), but my web host, unfortunately, insists on using POP-before-SMTP authentication.

If I solve all THESE problems, then I'm thinking that I wouldn't really care about parsing the bounces if VERP can tell me where they came from.  If it came back, it's a bounce, and I wouldn't care why it bounced.  I would keep a count of the number of times each address bounced and the last date/time it bounced.  Addresses would be removed from the list if they bounce X number of times within Y days.  If a bounce is received and the previous bounce is more than Y days ago, the count is reset.
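
The counting rule described above could be sketched roughly like this. The function name and the record layout (a count plus a Unix timestamp of the last bounce) are assumptions for illustration; storage is left out entirely.

```php
<?php
// Hypothetical helper: update one address's bounce record.
// $record is null for an address with no history, otherwise
// ['count' => int, 'last' => int] where 'last' is a Unix timestamp.
function recordBounce(?array $record, int $now, int $maxBounces, int $windowDays): array
{
    $window = $windowDays * 86400; // Y days expressed in seconds

    if ($record === null || ($now - $record['last']) > $window) {
        // First bounce, or the previous bounce is more than Y days old: reset.
        $count = 1;
    } else {
        $count = $record['count'] + 1;
    }

    return [
        'count'  => $count,
        'last'   => $now,
        // Flag the address for removal once it reaches X bounces.
        'remove' => $count >= $maxBounces,
    ];
}
```

Called once per received bounce with X = 3 and Y = 30, an address is flagged on its third bounce as long as no gap between consecutive bounces exceeds 30 days.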
It's true that VERP implies sending one message at a time; however, you can reduce the inefficiency by not using it all the time. Ezmlm, the 'official' qmail mailing list manager, supports occasional VERP address checking, where it sends out, say, 1 in 10 mailings using VERP but otherwise doesn't bother. This means that it will miss some bounces, but statistically it will still detect the persistent ones.

Also beware of assuming that anything that comes back is a bounce: some dumb MTAs will send vacation notifications and delivery delay messages back to the sender address, and neither of those is a real bounce. Incidentally, if you set a 'Precedence: Bulk' header, many MTAs and MUAs will usefully abstain from sending vacation messages. It's also worth noting that QSBMF (qmail-send bounce message format) bounces are ALWAYS permanent failures - that's part of qmail policy.
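
A rough way to screen out the two non-bounce cases mentioned above might look like the following. The function name is made up, and it only catches senders that follow the standards: RFC 3834 asks auto-responders to add "Auto-Submitted: auto-replied", and RFC 3464 delivery reports use "Action: delayed" for mail that is still queued. Plenty of vacation programs set neither, so treat this as a first filter, not a complete one.

```php
<?php
// Hypothetical helper: true when a returned message looks like a vacation
// auto-reply or a delay notice rather than a delivery failure.
function looksLikeNonBounce(string $rawMessage): bool
{
    // Split the top-level headers from the body at the first blank line.
    $parts = preg_split('/\r?\n\r?\n/', $rawMessage, 2);
    $headers = $parts[0];
    $body = $parts[1] ?? '';

    // Standards-following auto-responders label their replies (RFC 3834).
    if (preg_match('/^Auto-Submitted:\s*auto-replied/mi', $headers)) {
        return true;
    }
    // A delivery report for mail that is still being retried (RFC 3464).
    if (preg_match('/^Action:\s*delayed/mi', $body)) {
        return true;
    }
    return false;
}
```

On the sending side, the header itself is just one more line in the $headers string passed to mail(), for example "Precedence: bulk".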
Wouldn't a "Precedence: Bulk" header cause spam filters to reject it (even though it's an opt-in mailing list)?
No, you'll find that most mailing lists use that header - just looking in my in-box, I see that the PHP developer and internals mailing lists do, so does Apple's 'QuickTime news', which is typical of a high-volume opt-in list.
I'm writing a mailing list program too. I'm detecting bounces by requiring users to start their subject lines with the verb "post". There are other supported verbs, such as "help" and "subscribe".

Incidentally, I found that using PHP "mail" sent bounced email to our default mailbox instead of the mailing list address. I solved this by doing what is done in other mailing list programs: using SMTP directly.

David
A better way to detect bounced email that avoids pattern recognition of 'bounciness': include in the sent email a header like this:

X-Mailer-Bounce-From: recipient@example.com

Then, in the email-handling script, search the entire received email for this header:

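// Search the whole bounce (headers and body) for the custom header.
// trim() drops the trailing \r that CRLF line endings would otherwise leave.
if (preg_match('/^X-Mailer-Bounce-From:\s*(.*?)\s*$/mi', $Email, $M))
      $BouncedFrom = trim($M[1]);
else
      $BouncedFrom = "";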

$BouncedFrom will have the recipient of the sent email if it bounced.

David
Is the X-Mailer-Bounce-From header always going to be returned by the bouncer?  Why is this better than using VERP addresses?  I still have to send out individualized emails this way, right?  (Sorry, I'm not trying to be antagonistic, just asking).
It's not better than VERP. Some bounces include the original message, some don't. VERP has the huge advantage that if you receive a bounce, then you are absolutely guaranteed to know exactly who it was sent to, even if not one single byte of the original message (or headers) is returned to you. If you keep sufficient records, and use per-message hashes (not just per-address) in VERP, you can also tell what they were sent. If they choose to ignore the return path, then you'll never receive the bounce anyway (and they should trash that crummy mail server immediately)!
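
To illustrate the per-message hash idea, here is one possible sketch. Everything named here is hypothetical: the helper functions, the 16-character token length, the in-memory $sentLog array standing in for real storage, and the example addresses. It uses PHP's built-in hash_hmac() to derive the token.

```php
<?php
// Hypothetical helper: derive a short token tied to one message and one recipient.
function verpToken(string $messageId, string $recipient, string $secret): string
{
    $data = $messageId . "\n" . strtolower($recipient);
    return substr(hash_hmac('sha256', $data, $secret), 0, 16);
}

// Hypothetical helper: pull the token back out of the address a bounce arrived at.
function tokenFromBounceAddress(string $address): ?string
{
    return preg_match('/\+([0-9a-f]{16})@/', $address, $m) === 1 ? $m[1] : null;
}

// Record what was sent under each token so a later bounce can be looked up.
$sentLog = [];
$token = verpToken('msg-2041', 'joe@example.com', 'replace-with-a-real-secret');
$sentLog[$token] = ['message' => 'msg-2041', 'recipient' => 'joe@example.com'];
$envelopeSender = 'list-bounces+' . $token . '@lists.example.org';

// When a bounce arrives at $envelopeSender, the token identifies both
// the recipient and the message that was sent.
$found = tokenFromBounceAddress($envelopeSender);
$hit = ($found !== null && isset($sentLog[$found])) ? $sentLog[$found] : null;
```

The token reveals nothing about the recipient by itself, so the sent log has to be kept for as long as bounces are expected to arrive.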

Although VERP guarantees you get to know who the bounced message was to, it doesn't tell you why it bounced (and nor can X-Mailer-Bounce), so you still have to read the message to find out and you will still need to do pattern recognition on the body of the message.
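
As a starting point for the "why did it bounce" part, bounces that follow the DSN format (RFC 3464) carry a "Status:" field whose first digit separates permanent from temporary failures. The sketch below reads only that field; the function name and the return labels are made up, and bounces in other formats (QSBMF, for instance) will come back as 'unknown' and still need pattern matching on the text.

```php
<?php
// Hypothetical helper: classify a returned message by its DSN status code.
function classifyBounce(string $rawMessage): string
{
    // Per-recipient DSN field, e.g. "Status: 5.1.1" (class.subject.detail).
    if (preg_match('/^Status:\s*([245])\.\d{1,3}\.\d{1,3}/mi', $rawMessage, $m)) {
        if ($m[1] === '5') {
            return 'hard';      // permanent failure
        }
        if ($m[1] === '4') {
            return 'soft';      // temporary failure, may succeed on retry
        }
        return 'delivered';     // 2.x.x reports success, not a bounce
    }
    return 'unknown';           // no machine-readable status found
}
```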
It is true that my mailing list program sends individual emails instead of using "bcc". It is also true that I am relying on bounces containing the original message headers. These limitations are acceptable because my lists are small and because no harm is done if bounces are not detected.

VERP is totally superior when it can be implemented.

However, I am using shared hosting, so I cannot control the local server email program and hence cannot use VERP. I only have one forwarding email address that I pipe into my PHP script.
I am also using shared hosting.  Fortunately, it is a Linux server that allows me to create my own procmail recipes, so I can easily implement VERP.