Email validation using Regular Expression in Perl

Validating email addresses properly is an important check on almost any web page that accepts them.
This code is self-explanatory except for the regular expression that I used for pattern matching.

I originally published this as a thread on my website: http://www.aliencoders.com/content/few-important-validations-using-regular-expression-perl and thought I would share it here on Experts Exchange.

First of all, we should know the rules for forming an email address.
An email address could be almost anything, but it should stay within some boundaries, such as:
1. It must contain @ and .
2. The username can be a mixture of letters, digits and _ (usually) of any length, but it should start with a letter, i.e. a-z or A-Z (I restricted the length to 6-15; you can set your own).
3. The domain name should be between 2 and 63 characters long.
4. After the last . there should be no @, and that last part should be only 2-4 characters long. For example, aliencoders@mailing.commonworld is wrong.

#!C:/strawberry/perl/bin/perl.exe
# Change the shebang above according to your path settings
# Author: Sanjeev Kumar Jaiswal
# Date: 31 March 2010
# Purpose: Email Validation
use strict;
use warnings;

# Taking input from keyboard (empty string if nothing was entered)
my $email = <STDIN> // '';
# Removing the trailing new line
chomp($email);
# Pattern for email validation
my $pattern = '^([a-zA-Z][\w\_\.]{6,15})\@([a-zA-Z0-9.-]+)\.([a-zA-Z]{2,4})$';
# To find the length of the username from the email id
# (i.e. for aliencoders@aliencoders.com the value before @ is the username)
my @firstval = split('@', $email);
my $len = length($firstval[0] // '');
# Matching and displaying the result accordingly
if ($len > 15 || $len < 6) {
    print "Invalid email id.\nLength of Username is $len which is either >15 or <6\n";
}
elsif ($email =~ /$pattern/) {
    # $2 holds the domain part captured by the second group
    my $domain = $2;
    if ($domain =~ /^-|-$/) {
        print "Domain name can't start or end with -\n";
    }
    elsif ($domain =~ /^\d/) {
        print "Domain Name can't start with Digit\n";
    }
    elsif (length($domain) > 63 || length($domain) < 2) {
        print "According to domain rule Domain length should lie between 2 and 63\n";
    }
    else {
        print "It's a valid Email ID\n";
    }
}
else {
    print "invalid email format\n";
}
###### End of the code ############


The code checks the length of the username, which should be between 6 and 15, and then checks the email against the pattern (e.g. for jassics@aliencoders.com the username is jassics, so its length is 7).

This is the important line to understand:
my $pattern= '^([a-zA-Z][\w\_\.]{6,15})\@([a-zA-Z0-9.-]+)\.([a-zA-Z]{2,4})$';


I used the four rules listed before the code to write this pattern.
^ means the string should start with what follows
[] defines a character class, so [a-zA-Z] means any one character from a-z or A-Z
\w matches a single word character, i.e. a letter, a digit or _. It is Perl's shorthand for roughly [a-zA-Z0-9_]
{} defines how many times the preceding item may repeat, so {6,15} means 6 to 15 times
+ means one or more
$ means the string should end here
() means a group; whatever matches inside a group is captured in Perl's built-in variables $1, $2 etc.
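
As a small illustration of the capture groups described above, the sketch below runs the pattern from the code against the sample address and prints what lands in $1, $2 and $3. The helper name split_email is made up for this example and is not part of the original script.

```perl
use strict;
use warnings;

# Same pattern as in the script, kept in a single-quoted string
my $pattern = '^([a-zA-Z][\w\_\.]{6,15})\@([a-zA-Z0-9.-]+)\.([a-zA-Z]{2,4})$';

# Hypothetical helper: returns (username, domain, last part) on a match,
# or an empty list when the address does not match the pattern
sub split_email {
    my ($email) = @_;
    return $email =~ /$pattern/ ? ($1, $2, $3) : ();
}

my ($user, $domain, $tld) = split_email('jassics@aliencoders.com');
print "username=$user domain=$domain tld=$tld\n";
```

Each pair of parentheses in the pattern fills the next numbered variable, which is why the script can read the domain from $2 after a successful match.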

So ^([a-zA-Z][\w\_\.]{6,15}) means the username should start with a letter from a-z or A-Z, followed by 6 to 15 characters that are word characters, _ or . (so, combined with the $len check in the code, usernames of 7 to 15 characters are accepted).
\@ is used because @ marks an array variable in Perl, so we escape it with a backslash to get its literal meaning. That's why we used \@
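
To make the point about \@ concrete, here is a minimal sketch of where the backslash matters. It assumes nothing beyond core Perl; the variable names are chosen only for this example.

```perl
use strict;
use warnings;

# Single quotes: nothing is interpolated, so @ needs no backslash
my $single = 'jassics@aliencoders.com';

# Double quotes: @aliencoders would be read as an array, so escape the @
my $double = "jassics\@aliencoders.com";

print "Both strings are identical\n" if $single eq $double;

# Regex literals interpolate like double-quoted strings, hence \@ here too
my ($user) = $single =~ /^(\w+)\@/;
print "username=$user\n";
```

Under use strict, writing the address in double quotes without the backslash fails to compile, because Perl looks for an array named @aliencoders.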

I think the rest of the code is easy to follow, as it uses only the concepts above.

Expert Comment

Valid email addresses are not restricted to [\w\_\.] and do not need to start with [a-zA-Z].
In particular, <fred&barney@stonehenge.com> is a valid email address.

See
perldoc -q 'How do I check a valid mail address'
for the correct email address standard, and the correct way to do email address validation.

Expert Comment

also, username is not restricted by $len>15 || $len<6
and by the way, there's no need for \ in front of _ or . in [\w\_\.]
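
A quick way to check both points against the article's pattern is sketched below. It compares the original character class with the unescaped form [\w.] on a few addresses. The third address is made up for illustration, and the variable names are hypothetical.

```perl
use strict;
use warnings;

# The article's pattern, and the same pattern without the extra backslashes
my $article = qr/^([a-zA-Z][\w\_\.]{6,15})\@([a-zA-Z0-9.-]+)\.([a-zA-Z]{2,4})$/;
my $simpler = qr/^([a-zA-Z][\w.]{6,15})\@([a-zA-Z0-9.-]+)\.([a-zA-Z]{2,4})$/;

my @samples = (
    'jassics@aliencoders.com',        # sample address from the article
    'fred&barney@stonehenge.com',     # address from the comment above
    'sanjeev.kumar_1@example.org',    # made-up address with . and _
);

for my $addr (@samples) {
    my $with    = $addr =~ $article ? 'match' : 'no match';
    my $without = $addr =~ $simpler ? 'match' : 'no match';
    printf "%-30s escaped: %-8s unescaped: %s\n", $addr, $with, $without;
}
```

Both forms of the class behave the same on these samples, and the address containing & is rejected by either one because & is not in the class.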

Author Comment

by:Sanjeev Jaiswal
Yes, read the second point:
The username can be a mixture of letters, digits and _ (usually) of any length, but it should start with a letter, i.e. a-z or A-Z (I restricted the length to 6-15; you can set your own).

The username could be anything, but it is better if it starts with a-z or A-Z for your comfort. That's why I wrote it like this.
There can be many possible solutions for it.

But how many users use exceptional email ids?
I was just trying to give an idea of how you can write a regex in Perl for email validation, and everyone is free to modify it according to their use.

I use it in one of our portals, where the restrictions mentioned above apply.
E.g. Gmail never allows special characters. So what do we say then?
The RFC is one thing, and customizing it according to our needs is different, bro.

Thanks for the comment by the way :D

Expert Comment

IMO, it would be better to use one of the email validation modules on CPAN, which include much better and more extensive validation.

Here's the first one that I thought of:
Email::Valid - Check validity of Internet email addresses
