Email validation using Regular Expression in Perl

Validating email addresses properly is an important check on almost any web page.
The code below is largely self-explanatory, except for the regular expression I used for pattern matching.

I originally published this as a thread on my website: http://www.aliencoders.com/content/few-important-validations-using-regular-expression-perl and thought I would share it here on Experts Exchange.

First of all, we should know the rules an email address has to follow here.
An email address could be almost anything, but it should respect some boundaries:
1. It must contain @ and .
2. The username can be a mixture of letters, digits and _ (usually) of any length, but it should start with a letter, i.e. a-z or A-Z (I restricted the length to 6-15; you can choose your own).
3. The domain name should be between 2 and 63 characters long.
4. After the last . there should be no @, and that last part should be 2-4 characters long, so
aliencoders@mailing.commonworld is wrong.

#!C:/strawberry/perl/bin/perl.exe
# Change the shebang above according to your path settings
##############################
# Author:  Sanjeev Kumar Jaiswal
# Date:    31 March 2010
# Purpose: Email Validation
##############################
 
use strict; 
use warnings; 
# Taking input from the keyboard 
my $email = <STDIN>; 
 
# Removing the trailing new line 
chomp($email); 
 
# Pattern for Email validation 
my $pattern= '^([a-zA-Z][\w\_\.]{6,15})\@([a-zA-Z0-9.-]+)\.([a-zA-Z]{2,4})$'; 
 
# Find the length of the username (in aliencoders@aliencoders.com, the part before @ is the username) 
my @firstval=split('@',$email); 
# Fall back to an empty string so that empty input does not trigger warnings 
my $len=length($firstval[0] // ''); 
 
# Matching and displaying the result accordingly 
if($len>15 || $len<6) 
{ 
    print "Invalid email id.\nLength of Username is $len which is either >15 or <6"; 
    exit;     
} 
if($email=~m /$pattern/) 
{ 
    my $domain = $2; 
    if($domain=~ /^\-|\-$/) 
    { 
        print "Domain name can't start or end with -"; 
        exit; 
    } 
    if($domain=~ /^\d/) 
    { 
        print "Domain name can't start with a digit"; 
        exit; 
    } 
    if(length($domain)>63 || length($domain)<2) 
    { 
        print "According to the domain rule, domain length should lie between 2 and 63"; 
        exit; 
    } 
    print "It's a valid email ID"; 
} 
else 
{ 
    print "Invalid email format"; 
}
###### End of the code ############
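
The same checks can also be packaged as a subroutine that returns a message instead of calling exit, which makes them easier to reuse and to try against sample addresses. This is only a sketch: the name `check_email`, the return messages and the sample addresses are hypothetical and not part of the original script, while the pattern and limits are copied from the code above.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): runs the same checks,
# but returns a message instead of printing and exiting.
sub check_email {
    my ($email) = @_;
    my $pattern = '^([a-zA-Z][\w\_\.]{6,15})\@([a-zA-Z0-9.-]+)\.([a-zA-Z]{2,4})$';

    # The username is everything before the first @
    my ($username) = split /@/, $email;
    my $len = defined $username ? length $username : 0;
    return "Invalid username length: $len" if $len > 15 || $len < 6;

    return "Invalid email format" unless $email =~ /$pattern/;

    # $2 holds the domain part captured by the second group
    my $domain = $2;
    return "Domain name can't start or end with -" if $domain =~ /^\-|\-$/;
    return "Domain name can't start with a digit"   if $domain =~ /^\d/;
    return "Domain length should lie between 2 and 63"
        if length($domain) > 63 || length($domain) < 2;

    return "Valid";
}

# Sample addresses chosen for illustration only
for my $address ('jassics@aliencoders.com',
                 'aliencoders@mailing.commonworld',
                 'jassics@-aliencoders.com') {
    print "$address: ", check_email($address), "\n";
}
```

Returning a message rather than exiting lets a web form collect the result and decide how to present it to the user.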


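The script checks that the address has the proper form and that the username length is between 6 and 15 (for example, in jassics@aliencoders.com the username is jassics, so its length is 7). Note that the pattern itself requires a letter followed by at least six more characters, so a 6-character username passes the length check but is still rejected by the pattern.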

This is the important line to understand:
 my $pattern= '^([a-zA-Z][\w\_\.]{6,15})\@([a-zA-Z0-9.-]+)\.([a-zA-Z]{2,4})$'; 

I used the four rules listed before the code to write this pattern.
^ means the match must start here (beginning of the string)
[] defines a character class, so [a-zA-Z] means any single letter from a-z or A-Z
\w is Perl's shorthand for a single word character: a letter, a digit or _
{} defines how many times the preceding item may repeat, so {6,15} means 6 to 15 times
+ means one or more
$ means the match must end here (end of the string)
() forms a group; whatever matches inside it is captured in Perl's built-in variables $1, $2, etc.
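
As a small illustration of the grouping described above, the sketch below matches the article's sample address against the pattern from the script and prints what each pair of parentheses captured. The helper name `split_email` is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the three captured groups, or an empty list
# when the address does not match the pattern.
sub split_email {
    my ($email) = @_;
    my $pattern = '^([a-zA-Z][\w\_\.]{6,15})\@([a-zA-Z0-9.-]+)\.([a-zA-Z]{2,4})$';
    return $email =~ /$pattern/ ? ($1, $2, $3) : ();
}

my ($username, $domain, $tld) = split_email('jassics@aliencoders.com');
print "username: $username\n";   # group 1, the part before @
print "domain:   $domain\n";     # group 2, between @ and the last dot
print "tld:      $tld\n";        # group 3, after the last dot
```

The groups are numbered by the position of their opening parenthesis, which is why the script can read the domain from $2.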

So ^([a-zA-Z][\w\_\.]{6,15}) means the username must start with a letter from a-z or A-Z, followed by 6 to 15 characters that are word characters (letters, digits, _) or dots.
\@ is used because @ introduces an array in Perl, so we escape it with a backslash to get its literal meaning.
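
To see why the backslash matters, the sketch below compares a double-quoted string, where Perl interpolates `@aliencoders` as an array, with an escaped `\@` and a single-quoted string. The array `@aliencoders` is declared here purely for the demonstration; under `use strict`, an undeclared array in a double-quoted string would be a compile-time error instead.

```perl
use strict;
use warnings;

# Declared only for this demonstration, so interpolation has something to insert
my @aliencoders = ('OOPS');

my $interpolated = "jassics@aliencoders.com";    # @aliencoders is interpolated
my $escaped      = "jassics\@aliencoders.com";   # backslash keeps a literal @
my $single       = 'jassics@aliencoders.com';    # single quotes never interpolate

print "$interpolated\n";
print "$escaped\n";
print "$single\n";
```

In the article's script the pattern is written in single quotes, so the backslash reaches the regex engine unchanged, where \@ simply matches a literal @.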

I think the rest of the code will be easy to follow, as it uses only the concepts above.
4 Comments
 
LVL 84

Expert Comment

by:ozo
Valid email addresses are not restricted to [\w\_\.] and do not need to start with only [a-zA-Z]
In particular <fred&barney@stonehenge.com> is a valid email address.

see
perldoc -q 'How do I check a valid mail address'
for the correct email address standard, and the correct way to do email address validation
 
LVL 84

Expert Comment

by:ozo
also, username is not restricted by $len>15 || $len<6
and by the way, there's no need for \ in front of _ or . in [\w\_\.]
 
LVL 6

Author Comment

by:Sanjeev Jaiswal
Yes, read the second point:
The username can be a mixture of letters, digits and _ (usually) of any length, but it should start with a letter, i.e. a-z or A-Z (I restricted the length to 6-15; you can choose your own).


The username could be anything, but it's good if it starts with a-z or A-Z for your comfort. That's why I wrote it like this.
There can be many possible solutions for it.

But how many users use exceptional email IDs?
I was just trying to give an idea of how you can write a regex in Perl for email validation, and everyone is free to modify it according to their use.

I use it in one of our portals, where the restrictions mentioned above apply.
For example, Gmail never allows special characters, so what do we say then?
The RFC is one thing, and customizing validation according to our needs is another, bro.

Thanks for the comment, by the way :D
 
LVL 28

Expert Comment

by:FishMonger
IMO, it would be better to use one of the email validation modules on CPAN, which include much better and more extensive validation.

Here's the first one that I thought of:
Email::Valid - Check validity of Internet email addresses
