

E-mail validation using regular expression

Posted on 2007-10-16
Medium Priority
Last Modified: 2009-12-16
I am attempting to verify that the domain ends with at least two characters and no more than 4.

I've been doing some testing and am not getting the expected results.

if ($mailcheck=~/[a-zA-Z]{2,4}?$/){
 print 1;
} else {
 print 2;
}

This should print a 1, which it does.  If I change the domain to domain.zzzzz, which is more than 4 characters long, it still prints a 1.  I'm basing this on:
 {n,m}? Must occur at least n times but no more than m times
Question by:Bob-Villa

Accepted Solution

jonathanmelnick earned 200 total points
ID: 20086002
Here is an example that you should be able to work with.  It matches practically any form of email address, I believe.

if ($mailcheck=~/^\w[-.\w]+\@[-.\w]+\.[a-zA-Z]{2,4}$/){
 print 1;
} else {
 print 2;
}

btw,  \w is the equivalent of [a-zA-Z0-9_]

Author Comment

ID: 20087354
I appreciate the reply, but we already use the RFC822 Perl module, and RFC822 doesn't limit the ending to 2, 3, or 4 characters. Your sample code does work, but I actually want to know why my code doesn't. I just want to check that the string ends in at least 2 characters but not more than 4.

if ($mailcheck=~/[a-zA-Z]{2,4}?$/){

It correctly fails when the string ends in fewer than two characters, but it still returns true when it ends in more than 4. Why?
LVL 85

Expert Comment

ID: 20087379
      How do I check a valid mail address?

       You can't, at least, not in real time.  Bummer, eh?

       Without sending mail to the address and seeing whether there's a human
       on the other end to answer you, you cannot determine whether a mail
       address is valid.  Even if you apply the mail header standard, you can
       have problems, because there are deliverable addresses that aren't
       RFC-822 (the mail header standard) compliant, and addresses that aren't
       deliverable which are compliant.

       You can use the Email::Valid or RFC::RFC822::Address which check the
       format of the address, although they cannot actually tell you if it is
       a deliverable address (i.e. that mail to the address will not bounce).
       Modules like Mail::CheckUser and Mail::EXPN try to interact with the
       domain name system or particular mail servers to learn even more, but
       their methods do not work everywhere---especially for
       security conscious administrators.

       Many are tempted to try to eliminate many frequently-invalid mail
       addresses with a simple regex, such as "/^[\w.-]+\@(?:[\w-]+\.)+\w+$/".
       It's a very bad idea.  However, this also throws out many valid ones,
       and says nothing about potential deliverability, so it is not
       suggested.  Instead, see
       http://www.cpan.org/authors/Tom_Christiansen/scripts/ckaddr.gz ,
       which actually checks against the full RFC
       spec (except for nested comments), looks for addresses you may not wish
       to accept mail to (say, Bill Clinton or your postmaster), and then
       makes sure that the hostname given can be looked up in the DNS MX
       records.  It's not fast, but it works for what it tries to do.

       Our best advice for verifying a person's mail address is to have them
       enter their address twice, just as you normally do to change a
       password.  This usually weeds out typos.  If both versions match, send mail
       to that address with a personal message that looks somewhat like:

           Dear someuser@host.com,

           Please confirm the mail address you gave us Wed May  6 09:38:41
           MDT 1998 by replying to this message.  Include the string
           "Rumpelstiltskin" in that reply, but spelled in reverse; that is,
           start with "Nik...".  Once this is done, your confirmed address will
           be entered into our records.

       If you get the message back and they've followed your directions, you
       can be reasonably assured that it's real.

       A related strategy that's less open to forgery is to give them a PIN
       (personal ID number).  Record the address and PIN (best that it be a
       random one) for later processing.  In the mail you send, ask them to
       include the PIN in their reply.  But if it bounces, or the message is
       included via a ``vacation'' script, it'll be there anyway.  So it's
       best to ask them to mail back a slight alteration of the PIN, such as
       with the characters reversed, one added or subtracted to each digit,


LVL 85

Expert Comment

ID: 20087415
 DB<8> use diagnostics; use strict; $mailcheck="user@domain.com";
Possible unintended interpolation of @domain in string at (eval
        10)[/System/Library/Perl/5.8.6/perl5db.pl:628] line 2 (#1)
    (W ambiguous) You said something like `@foo' in a double-quoted string
    but there was no array @foo in scope at the time. If you wanted a
    literal @foo, then write it as \@foo; otherwise find out what happened
    to the array you apparently lost track of.
Possible unintended interpolation of @domain in string at (eval 10)[/System/Library/Perl/5.8.6/perl5db.pl:628] line 2.

 DB<9> x $mailcheck
0  'user.com'
 DB<10> x $mailcheck=~/[a-zA-Z]{2,4}?$/
0  1
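
The session above shows that the double-quoted string lost its "@domain" part before the regex ever ran. A minimal sketch of that quoting behaviour follows. The empty @domain array is hypothetical, declared only so the unescaped form compiles under strict:

```perl
use strict;
use warnings;

# Single quotes never interpolate, so the address is stored verbatim.
my $single = 'user@domain.com';

# Inside double quotes the @ has to be escaped to get the same string.
my $escaped = "user\@domain.com";

# Hypothetical empty array, declared only so the next line compiles under strict.
my @domain = ();
my $interpolated = "user@domain.com";   # @domain expands to nothing

print "$single\n";        # user@domain.com
print "$escaped\n";       # user@domain.com
print "$interpolated\n";  # user.com
```

When testing a pattern by hand, single-quoting the test address avoids this trap.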
LVL 85

Expert Comment

ID: 20087444
 DB<12> $mailcheck = 'domain.zzzzz'

 DB<13> x  $mailcheck =~ /(?<![a-zA-Z])[a-zA-Z]{2,4}$/
  empty array
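
The negative lookbehind in the session above can be tried outside the debugger. This is only a sketch: the sub name and the sample strings are made up for illustration. The lookbehind requires that the character before the 2 to 4 letters is not itself a letter, so the letters have to be the whole trailing run:

```perl
use strict;
use warnings;

# Returns 1 when the string ends in a run of exactly 2 to 4 letters.
# (?<![a-zA-Z]) rejects any match that starts in the middle of a longer run.
sub ends_with_short_letter_run {
    my ($text) = @_;
    return $text =~ /(?<![a-zA-Z])[a-zA-Z]{2,4}$/ ? 1 : 0;
}

for my $text ('domain.com', 'domain.info', 'domain.zzzzz', 'domain.z') {
    printf "%-14s %d\n", $text, ends_with_short_letter_run($text);
}
# domain.com     1
# domain.info    1
# domain.zzzzz   0
# domain.z       0
```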

LVL 85

Assisted Solution

ozo earned 200 total points
ID: 20087465
> It matches practically any form of email adress I believe.

> if ($mailcheck=~/^\w[-.\w]+\@[-.\w]+\.[a-zA-Z]{2,4}$/){

It does not match
The Fred and Barney Comedy Team <fred&barney@stonehenge.com>
which is a valid email address.
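
For anyone who wants to check this, here is a small sketch that runs the quoted pattern against a few addresses. The sub name and the extra sample addresses are made up for illustration. It also shows that a one-character local part is rejected, because \w followed by [-.\w]+ needs at least two characters before the @:

```perl
use strict;
use warnings;

# Wraps the pattern quoted above; returns 1 on a match, 0 otherwise.
sub accepted_pattern_matches {
    my ($addr) = @_;
    return $addr =~ /^\w[-.\w]+\@[-.\w]+\.[a-zA-Z]{2,4}$/ ? 1 : 0;
}

my @samples = (
    'user@domain.com',                                               # match
    'fred&barney@stonehenge.com',                                    # no match: & is not allowed
    'The Fred and Barney Comedy Team <fred&barney@stonehenge.com>',  # no match
    'a@b.co',                                                        # no match: local part too short
);

for my $addr (@samples) {
    print accepted_pattern_matches($addr) ? 'match   ' : 'no match', "  $addr\n";
}
```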
LVL 48

Expert Comment

ID: 20088709
That's why the full RFC822 regex is probably the most complicated regex you're ever likely to come across.
LVL 39

Expert Comment

ID: 20123549
To make your regex work, you need to check for the "." character.  Without it, the last 2 to 4 letters of domain.zzzzz satisfy the pattern on their own, so it still matched.

if ($mailcheck=~/\.[a-zA-Z]{2,4}?$/){
 print 1;
} else {
 print 2;
}
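
To see which text each pattern actually grabs, here is a sketch that captures the matched tail. The sub name, the capture groups, and the sample addresses are additions for illustration and are not part of the original code. The unanchored pattern is free to start anywhere, so on "zzzzz" it simply starts at the second "z":

```perl
use strict;
use warnings;

# Returns the captured tail if the pattern matches, otherwise undef.
sub tail_match {
    my ($pattern, $text) = @_;
    return $text =~ $pattern ? $1 : undef;
}

my $unanchored = qr/([a-zA-Z]{2,4}?)$/;   # pattern from the question, with a capture group
my $dotted     = qr/(\.[a-zA-Z]{2,4})$/;  # requires a literal dot before the letters

for my $addr ('user@domain.com', 'user@domain.zzzzz', 'user@domain.z') {
    my $u = tail_match($unanchored, $addr);
    my $d = tail_match($dotted, $addr);
    printf "%-18s unanchored: %-8s dotted: %s\n",
        $addr,
        defined $u ? "'$u'" : 'no match',
        defined $d ? "'$d'" : 'no match';
}
# user@domain.com    unanchored: 'com'    dotted: '.com'
# user@domain.zzzzz  unanchored: 'zzzz'   dotted: no match
# user@domain.z      unanchored: no match dotted: no match
```

The "?" after {2,4} only makes the quantifier non-greedy. Because the pattern is pinned to the end of the string by $, it does not change whether the string matches.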



Question has a verified solution.
