Email Bounce Processing

I'm designing the user registration system for a new website.  To get me going in the right direction, I want advice on a specific part of this system: detecting bounced confirmation emails.

This website will be running on a virtual hosting account that has a cPanel-type installation.

Here is my plan so far:

Write a PHP script that is run nightly by crontab.  This PHP script connects to the POP or IMAP account where bounces would be received, and reads all messages.  For each message, the script searches for an email address inside < > brackets near the top of the message body.  The script also looks for the confirmation code that was included in the original message.  If such a message is positively identified, the pending registration would be cleaned out of the database.
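The matching step of this plan might look roughly like the sketch below. It is only an illustration: the function name parseBounceBody, the 2000-character window, and the 32-hex-character "code=" format are all assumptions, and the real confirmation code format would have to match whatever the original mail contained.

```php
<?php
// Hypothetical helper: pull the failed recipient and the confirmation code
// out of a bounce body. Returns null unless BOTH are found, so a message
// is never acted on by address alone.
function parseBounceBody(string $body): ?array
{
    // Only inspect the top of the message for the address, where most
    // mail servers name the failed recipient.
    $head = substr($body, 0, 2000);

    // First address wrapped in angle brackets, e.g. <user@example.com>.
    if (!preg_match('/<([^<>@\s]+@[^<>@\s]+)>/', $head, $addr)) {
        return null;
    }

    // Confirmation code as it appeared in the original mail
    // (assumed format: code= followed by 32 hex characters).
    if (!preg_match('/\bcode=([0-9a-f]{32})\b/i', $body, $code)) {
        return null;
    }

    // Lower-cased so the values can be compared against the database.
    return [
        'email' => strtolower($addr[1]),
        'code'  => strtolower($code[1]),
    ];
}
```

The pending registration would then be removed only when the returned address and code both match the same database row.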

Things I am concerned about:

I am sending out the confirmation mails using PHP and the localhost SMTP server.  Can I assume that the SMTP server will accept all mails and any errors would be received as bounces?  If not, which SMTP errors should I look for?  (for example, would a malformed TO: address generate a bounce, or an SMTP error instead?)

The POP or IMAP account for bounces is probably going to receive non-bounce mails sooner or later.  What can I do to ensure that most bounces are handled, and most spam is ignored?

Do email bounces always include the original message's body?  If not, what is the best way to authenticate incoming bounces?  Obviously, I do not want to blindly delete user accounts based on incoming emails.  If this procedure gets implemented in other ways (newsletters, for example), then I need to know what risks are involved in automatic bounce processing, generally.

Also, I've seen information about forwarding mail directly to a PHP script.  This sounds cool but seems impractical in a virtual host account because I think the CGI version of PHP is running the crons.  I think.  :)

Guidance greatly appreciated.
1 Solution

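You can set the return path when calling mail() so that bounced emails go to a dedicated address. Note that a "Return-Path" header on its own is usually overwritten by the mail server; passing -f with the bounce address in mail()'s fifth parameter sets the envelope sender, which is what bounces are actually sent to.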
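A minimal sketch of that idea, assuming the host's sendmail wrapper honours the -f option (some shared hosts restrict it). The function name buildConfirmationMail, the example.com addresses, the confirm.php URL, and the X-Confirmation-Code header are all placeholders made up for illustration.

```php
<?php
// Build the five arguments for mail() so that bounces go to a dedicated
// mailbox. All addresses and the URL are placeholders.
function buildConfirmationMail(string $to, string $code, string $bounceAddr): array
{
    // Refuse anything that is not a plain address before it reaches
    // the sendmail command line.
    if (filter_var($bounceAddr, FILTER_VALIDATE_EMAIL) === false) {
        throw new InvalidArgumentException('Invalid bounce address');
    }

    // Hypothetical extra header: repeats the code so it can be found in
    // bounces that quote the original headers but not the body.
    $headers = "From: Registration <noreply@example.com>\r\n"
             . "X-Confirmation-Code: $code\r\n";

    $body = "Please confirm your registration:\r\n"
          . "http://www.example.com/confirm.php?code=$code\r\n";

    // -f sets the envelope sender (SMTP MAIL FROM). The receiving server
    // records it as Return-Path and sends bounces there.
    $params = '-f' . $bounceAddr;

    return [$to, 'Please confirm your registration', $body, $headers, $params];
}

// Usage (not executed here):
// mail(...buildConfirmationMail($email, $code, 'bounces@example.com'));
```

mail() returning true only means the local mail server accepted the message for delivery, so later failures still arrive as bounces at the -f address.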

Hi miqrogroove,

The problem you face is that incoming emails to that account will eventually be of many types:

- Bounces
- Replies
- Viruses
- Spam
- Out of Office Autoresponders and such
- etc
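As a rough sketch, several of the types above can be separated by a handful of headers before any body matching is attempted. The function name classifyMessage and its return labels are made up for illustration, and it assumes the headers have already been parsed into an array keyed by lower-cased header name. It will not catch every bounce format.

```php
<?php
// Rough classifier for a message arriving in the bounce mailbox.
// $headers maps lower-cased header names to their unfolded values.
function classifyMessage(array $headers): string
{
    $contentType   = strtolower($headers['content-type'] ?? '');
    $autoSubmitted = strtolower(trim($headers['auto-submitted'] ?? ''));
    $returnPath    = trim($headers['return-path'] ?? '');
    $from          = strtolower($headers['from'] ?? '');

    // Standard delivery status notifications (RFC 3464) are
    // multipart/report messages with report-type=delivery-status.
    if (strpos($contentType, 'multipart/report') !== false
        && strpos($contentType, 'delivery-status') !== false) {
        return 'bounce';
    }

    // Well-behaved vacation and out-of-office responders mark
    // themselves with Auto-Submitted: auto-replied (RFC 3834).
    if (strpos($autoSubmitted, 'auto-replied') === 0) {
        return 'autoresponder';
    }

    // Non-standard bounces usually still have an empty envelope sender
    // or come from a MAILER-DAEMON address.
    if ($returnPath === '<>' || strpos($from, 'mailer-daemon') !== false) {
        return 'probable-bounce';
    }

    // Replies, spam and viruses cannot be told apart by these headers.
    return 'other';
}
```

Anything classified as 'probable-bounce' or 'other' would still need body inspection or a manual look before any account is touched.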

You need a pretty strong set of regular expressions to detect the wide range of bounce formats, and even then you will likely not catch them all. You could spend agonising hours researching and testing every combination, but I recommend a much wiser option: G-Lock Email Processor (GEP), at http://www.glocksoft.com/ep/index.htm . It has a free trial and is worth every cent of the registration fee. Check out the features for yourself, but the ones of note are:

- Built in variety of regular expressions to catch the majority of email types. You can write your own rules and filters too.
- Header extraction.
- Write results directly to file or straight to your database via ODBC.
- Schedule multiple accounts to be checked.

The only con in my opinion is that it only runs on Windows, so you will need to have an Internet connected PC running the program for as long as you need to check the account. Otherwise, it's a godsend.

I usually have it set to do the following in order:

1. Delete virus, spam and autoresponders.
2. Check bounced emails and write the results to a database server via ODBC, then write to a csv file as backup.
3. Delete bounced emails on server.
4. The emails that are left are usually replies, so you can either set up an autoresponder yourself to say that your account is unmanned, or you can download them to a client yourself for handling by a real person.

Note that you can also use GEP to automagically handle new subscribes and unsubscribe requests as well, so it's a very powerful tool and definitely worth the trial.

Let me know if you need any more info.


miqrogroove (Author) commented:

  I guess you answered Part II of my question.  I need to be certain that writing this script will be time effective before I commit to it.  As a serious alternative, I will consider requiring registration confirmation within 24 hours.  This would allow me to clean the database of any old registration attempts regardless of the email validity.

  As far as this G-Lock program goes, it is not an option for me because the website is hosted from a Unix virtual server.  Having a Windows machine involved would be highly impractical.

  I am hoping experts will still comment on the other parts of my question.

-- Miqro

> Having a Windows machine involved would be highly impractical.

Note that this does not need to be a server, but I understand what you mean. We just use a crappy little Windows box in the office to run the program, and FreeBSD for our servers.

I do not believe that writing this script yourself would be time effective. That's just my opinion though. Just looking at the number of rules GEP uses to catch all the different bounce headers and types will show you that bounces are not consistent in their appearance, so you would need long periods of trial and error to come up with a similar set of rules. Of course, you could copy GEP's rules, but I don't endorse that unless you pay for a license and ask them whether you may use their rules in your own regexp app.

If the purpose of bounce handling is simply to make sure people are who they say they are, then the double opt-in method would be the best. You should probably use double opt-in anyway to make sure your system is not being abused.
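One way to sketch double opt-in, combined with the 24-hour limit mentioned earlier in the thread, is a confirmation token that carries its own timestamp and an HMAC signature. This is an illustration only: SECRET_KEY, TOKEN_TTL, makeToken and verifyToken are made-up names, and the key would need to be a long random value kept out of the web root.

```php
<?php
// Self-verifying confirmation token: "<issued-at>-<hmac>".
// SECRET_KEY and the 24-hour lifetime are assumptions for illustration.
const SECRET_KEY = 'replace-with-a-long-random-secret';
const TOKEN_TTL  = 86400; // 24 hours, in seconds

function makeToken(string $email, int $issuedAt): string
{
    // Sign the address together with the issue time so that neither
    // can be changed without invalidating the token.
    $mac = hash_hmac('sha256', strtolower($email) . '|' . $issuedAt, SECRET_KEY);
    return $issuedAt . '-' . $mac;
}

function verifyToken(string $email, string $token, int $now): bool
{
    $parts = explode('-', $token, 2);
    if (count($parts) !== 2 || !ctype_digit($parts[0])) {
        return false; // malformed token
    }

    $issuedAt = (int) $parts[0];
    if ($now - $issuedAt > TOKEN_TTL) {
        return false; // older than 24 hours
    }

    // Recompute the expected token and compare in constant time.
    return hash_equals(makeToken($email, $issuedAt), $token);
}
```

Because the signature cannot be produced without the key, the same check could be applied to a code found inside an incoming bounce, so a forged bounce naming someone else's address would not be acted on.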
miqrogroove (Author) commented:
Regarding Part I of the question:  I determined through testing that my internal mail server will generate bounces for most commercial domains (aol.com, hotmail.com, yahoo.com) and also for SMTP errors such as unrouteable domains.  This gives me new hope, since the majority of bounces will be in a predictable format with headers and body intact.
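If the local mail server produces standard delivery status notifications (RFC 3464), which is an assumption that would need checking against real samples, the machine-readable part of each bounce can be read without regular expressions for every wording. The sketch below uses made-up function names and assumes one recipient per bounce, which holds when confirmation mails are sent one at a time.

```php
<?php
// Parse the fields of a message/delivery-status part into an array
// keyed by lower-cased field name. Assumes a single recipient; with
// several recipients, later fields would overwrite earlier ones.
function parseDeliveryStatus(string $part): array
{
    $fields = [];
    // Unfold continuation lines (lines that start with a space or tab).
    $part = preg_replace('/\r?\n[ \t]+/', ' ', $part);
    foreach (preg_split('/\r?\n/', $part) as $line) {
        if (preg_match('/^([A-Za-z-]+):\s*(.*)$/', $line, $m)) {
            $fields[strtolower($m[1])] = trim($m[2]);
        }
    }
    return $fields;
}

// A permanent failure has Action "failed" and a 5.x.x status code.
function isHardBounce(array $fields): bool
{
    return strtolower($fields['action'] ?? '') === 'failed'
        && strpos($fields['status'] ?? '', '5.') === 0;
}

// Extract the address from "Final-Recipient: rfc822; user@example.com".
function finalRecipient(array $fields): ?string
{
    $value = $fields['final-recipient'] ?? '';
    $pos = strpos($value, ';');
    return $pos === false ? null : strtolower(trim(substr($value, $pos + 1)));
}
```

Temporary failures (status 4.x.x, action "delayed") return false from isHardBounce, so a registration would not be deleted just because the remote server was briefly unreachable.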
miqrogroove (Author) commented:
This answers Part IV of my question:  http://www.evolt.org/article/Incoming_Mail_and_PHP/18/27914/

I'm still looking for comments on the security issues involved.  What should I expect in the way of bounces from other servers, and in fake bounces?
miqrogroove (Author) commented:
Well, that's what I get for asking email questions in the PHP section :p
