Email validation using Regular Expression in Perl

Validating email addresses properly is an important check on almost any web page.
The code below is self-explanatory except for the regular expression used for pattern matching, which is explained after the listing.

I originally published this as a thread on my website, http://www.aliencoders.com/content/few-important-validations-using-regular-expression-perl, and thought I would share it here on Experts Exchange.

First of all, we should know the rules for a valid email address.
An email address could be almost anything, but here it must stay within some boundaries:
1. It must contain @ and .
2. The username can be a mixture of letters, digits and _ (usually) of any length, but it should start with a letter, i.e. a-z or A-Z (the script below accepts usernames of 7 to 15 characters; you can set your own limits).
3. The domain name should be between 2 and 63 characters long.
4. There should be no @ after the last . and the part after it should be 2 to 4 letters long, so for example
aliencoders@mailing.commonworld is wrong.

    #!C:/strawberry/perl/bin/perl.exe
    # Change the shebang above according to your path settings
    #########################################
    # Author:  Sanjeev Kumar Jaiswal        #
    # Date:    31 March 2010                #
    # Purpose: Email Validation             #
    #########################################

    use strict;
    use warnings;

    # Take input from the keyboard
    my $email = <STDIN>;

    # Remove the trailing newline
    chomp($email);

    # Pattern for email validation
    my $pattern = '^([a-zA-Z][\w\_\.]{6,15})\@([a-zA-Z0-9.-]+)\.([a-zA-Z]{2,4})$';

    # Find the length of the username, i.e. the part before @
    # (in aliencoders@aliencoders.com the username is aliencoders)
    my @firstval = split('@', $email);
    my $len = length($firstval[0]);

    # Match and display the result accordingly
    if ($len > 15 || $len < 6)
    {
        print "Invalid email id.\nLength of Username is $len which is either >15 or <6\n";
        exit;
    }
    if ($email =~ m/$pattern/)
    {
        my $domain = $2;
        if ($domain =~ /^\-|\-$/)
        {
            print "Domain name can't start or end with -\n";
            exit;
        }
        if ($domain =~ /^\d/)
        {
            print "Domain name can't start with a digit\n";
            exit;
        }
        if (length($domain) > 63 || length($domain) < 2)
        {
            print "According to the domain rule, domain length should lie between 2 and 63\n";
            exit;
        }
        print "It's a valid Email ID\n";
    }
    else
    {
        print "Invalid email format\n";
    }
    ###### End of the code ############


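The script first checks that the username length is between 6 and 15, and then checks the whole address against the pattern. For example, in jassics@aliencoders.com the username is jassics, so its length is 7. Because the pattern requires one letter followed by at least 6 more characters, 7 is also the shortest username that passes both checks.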
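
To try the same checks against several addresses without typing each one, the logic can be wrapped in a subroutine. This is only a sketch: the name check_email and the returned messages are hypothetical and not part of the original script, and it keeps this article's restrictive rules rather than the full RFC syntax.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): returns an empty
# string for an accepted address, or a short reason for rejecting it.
sub check_email {
    my ($email) = @_;
    my $pattern = '^([a-zA-Z][\w\_\.]{6,15})\@([a-zA-Z0-9.-]+)\.([a-zA-Z]{2,4})$';

    # The username is everything before the first @
    my ($username) = split /\@/, $email;
    my $len = length($username // '');
    return "username length $len is outside 6-15" if $len > 15 || $len < 6;

    return 'invalid email format' unless $email =~ /$pattern/;

    # $2 holds the text captured by the second group of the pattern
    my $domain = $2;
    return "domain can't start or end with -" if $domain =~ /^-|-$/;
    return "domain can't start with a digit"  if $domain =~ /^\d/;
    return 'domain length is outside 2-63'
        if length($domain) > 63 || length($domain) < 2;
    return '';
}

my @samples = (
    'jassics@aliencoders.com',
    'aliencoders@mailing.commonworld',
    'abc@aliencoders.com',
);
for my $address (@samples) {
    my $reason = check_email($address);
    print "$address: ", ($reason eq '' ? 'valid' : "invalid ($reason)"), "\n";
}
```

With these samples, only jassics@aliencoders.com is accepted: the second address fails on the 11-letter ending, and the third on the 3-character username.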

This is the important line to understand:
 my $pattern= '^([a-zA-Z][\w\_\.]{6,15})\@([a-zA-Z0-9.-]+)\.([a-zA-Z]{2,4})$'; 



I wrote this pattern by applying the 4 rules listed before the code.
^ means the match must begin at the start of the string
[] is a character class, so [a-zA-Z] means any one character from a-z or A-Z
\w matches one word character: a letter, a digit or _. For ASCII text it is Perl's shorthand for [a-zA-Z0-9_]
{} is a quantifier, so {6,15} means the preceding item repeats 6 to 15 times
+ means one or more of the preceding item
$ means the match must reach the end of the string
() is a group; whatever matches inside it is captured in Perl's built-in variables $1, $2, etc.
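
As a small illustration of the grouping rule, this sketch matches the article's sample address and prints what each pair of parentheses captures. Only the address and the pattern come from the article; the variable names are illustrative.

```perl
use strict;
use warnings;

my $pattern = '^([a-zA-Z][\w\_\.]{6,15})\@([a-zA-Z0-9.-]+)\.([a-zA-Z]{2,4})$';
my $email   = 'jassics@aliencoders.com';

# In list context a successful match returns the captured groups,
# the same values Perl also stores in $1, $2 and $3.
my ($username, $domain, $tld) = $email =~ /$pattern/;

print "username: $username\n";   # text captured by the first ()
print "domain:   $domain\n";     # text captured by the second ()
print "tld:      $tld\n";        # text captured by the third ()
```

For this address the three groups hold jassics, aliencoders and com.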

So ^([a-zA-Z][\w\_\.]{6,15}) means the address must start with one letter from a-z or A-Z, followed by 6 to 15 word characters, _ or . (this group therefore matches 7 to 16 characters in total, and the separate length check in the script caps the username at 15).
\@ is used because @ marks an array in Perl, so we put a backslash before it to get its literal meaning.
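
The effect of the backslash can be seen by comparing quoting styles. A minimal sketch, with illustrative variable names: single quotes never interpolate, so in a single-quoted pattern the backslash is a harmless precaution, while a double-quoted string would treat an unescaped @name as an array.

```perl
use strict;
use warnings;

# Single quotes do not interpolate, so @ is safe without a backslash here.
my $single = 'jassics@aliencoders.com';

# Double quotes interpolate arrays, so @ must be escaped as \@.
# Without the backslash this string would not compile under "use strict",
# because Perl would look for an array named @aliencoders.
my $double = "jassics\@aliencoders.com";

print "both strings are identical\n" if $single eq $double;

# Inside a regex literal, \@ matches a literal at-sign.
print "found a literal at-sign\n" if $single =~ /\@/;
```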

I think the rest of the code will be easy to understand, as it uses only the concepts above.

Comments (4)

ozo
CERTIFIED EXPERT
Most Valuable Expert 2014
Top Expert 2015

Commented:
Valid email addresses are not restricted to [\w\_\.] and do not need to start with only [a-zA-Z]
In particular <fred&barney@stonehenge.com> is a valid email address.

see
perldoc -q 'How do I check a valid mail address'
for the correct email address standard, and the correct way to do email address validation
ozo

Commented:
also, username is not restricted by $len>15 || $len<6
and by the way, there's no need for \ in front of _ or . in [\w\_\.]

Author

Commented:
Yes, read the second point:
Username can be a mixture of character, digit and _ (usually) of any length but it should start with only character i.e from a-z or A-Z (I restricted length from 8-15, you can put your own).


The username could be anything, but it's good if it starts with a-z or A-Z for your comfort. That's why I wrote it like this.
There can be many possible solutions for it.

But how many users use exceptional email ids?
I was just trying to give an idea of how you can write a regex in Perl for email validation, and everyone is free to modify it according to their use.

I use it in one of our portals where the restrictions are as mentioned above.
ex: gmail never allows special characters, so what do we say then?
The RFC is one thing, and customizing it according to our needs is a different thing, bro.

Thanks for the comment by the way :D
CERTIFIED EXPERT

Commented:
IMO, it would be better to use one of the email validation modules on CPAN, which include much better and more extensive validation.

Here's the first one that I thought of:
Email::Valid - Check validity of Internet email addresses
