Solved

SED Email Validation

Posted on 2010-09-05
Last Modified: 2013-12-26
Hi,

I have a large list of emails and I need a script that will validate the syntax, MX and DNS records. I run Ubuntu through a virtual box and ideally need a script I can execute on the command line.

Thank you in advance
Question by:faithless1
Expert Comment

by:apresence
This would be a bit of a pain to do in SED.  Here's how to do it in perl:
perl -ne 'exit 1 if (!/^\w[\w\-\_\.]*\@\w[\w\-\_]*(\.\w[\w\-\_]*)+$/)'

Returns 0 if the e-mail address is valid, or 1 if it's not.

Testing:
root@beta:~/exex $ echo foo | perl -ne 'exit 1 if (!/^\w[\w\-\_\.]*\@\w[\w\-\_]*(\.\w[\w\-\_]*)+$/)'; echo $?
1
root@beta:~/exex $ echo foo@bar | perl -ne 'exit 1 if (!/^\w[\w\-\_\.]*\@\w[\w\-\_]*(\.\w[\w\-\_]*)+$/)'; echo $?
1
root@beta:~/exex $ echo foo@bar.com | perl -ne 'exit 1 if (!/^\w[\w\-\_\.]*\@\w[\w\-\_]*(\.\w[\w\-\_]*)+$/)'; echo $?
0
root@beta:~/exex $ echo 0oo@bar.com | perl -ne 'exit 1 if (!/^\w[\w\-\_\.]*\@\w[\w\-\_]*(\.\w[\w\-\_]*)+$/)'; echo $?
0
root@beta:~/exex $


If you want to go a step further and only accept specific suffixes such as .com/.net/.org, do this (make sure EVERY domain suffix you want to allow is listed, e.g. .biz, .tv, etc.):
perl -ne 'exit 1 if (!/^\w[\w\-\_\.]*\@\w[\w\-\_]*(\.\w[\w\-\_]*)*(\.com|\.net|\.org)$/)'

Testing:
root@beta:~/exex $ echo foo@bar.com | perl -ne 'exit 1 if (!/^\w[\w\-\_\.]*\@\w[\w\-\_]*(\.\w[\w\-\_]*)*(\.com|\.net|\.org)$/)'; echo $?
0
root@beta:~/exex $ echo foo@bar.foo | perl -ne 'exit 1 if (!/^\w[\w\-\_\.]*\@\w[\w\-\_]*(\.\w[\w\-\_]*)*(\.com|\.net|\.org)$/)'; echo $?
1
Expert Comment

by:apresence
The above just validates the format of the e-mail address, not the MX/DNS records... please ignore.
Expert Comment

by:apresence
Check this out for validating the MX/DNS records (again, perl not SED):
http://www.usenix.org/publications/perl/perl17.html

SED is really only suited for editing/validating text, not for doing things like querying domain servers.  Perl is a much better option for this.
        sub valid_address {
        	my($addr) = @_;
        	my($domain, $valid);

        	return(0) unless ($addr =~ /^[^@]+@([-\w]+\.)+[A-Za-z]{2,4}$/);

        	$domain = (split(/@/, $addr))[1];

        	$valid = 0;
        	open(DNS, "nslookup -q=any $domain |") || return(-1);
        	while (<DNS>) {
        		$valid = 1 if (/^$domain.*\s(mail exchanger|internet address)\s=/);
        	}
        	return($valid);
        }
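
The DNS part of that function boils down to scanning nslookup output for a "mail exchanger" (or "internet address") line that starts with the domain. Here is a rough shell sketch of the MX half, with the parsing split from the lookup so you can try it on saved output. The helper names are made up for illustration, and it asks for MX records with -q=mx instead of -q=any, on the assumption that your resolver may give only a minimal answer to ANY queries (see RFC 8482).

```shell
#!/bin/sh
# Hypothetical helper names; not from the original post or the linked article.

# has_mx_line DOMAIN: reads nslookup-style output on stdin and succeeds if a
# line that starts with DOMAIN contains "mail exchanger =".
has_mx_line() {
  awk -v d="$1" '
    BEGIN { d = tolower(d) }
    { line = tolower($0) }
    # index() compares the domain as a plain string, so dots are not wildcards
    index(line, d) == 1 && line ~ /mail exchanger[ \t]*=/ { found = 1 }
    END { exit(found ? 0 : 1) }
  '
}

# domain_has_mx DOMAIN: runs a live MX query (needs network access).
domain_has_mx() {
  nslookup -q=mx "$1" 2>/dev/null | has_mx_line "$1"
}

# Example: domain_has_mx microsoft.com && echo "has MX"
```

Unlike the Perl function, this ignores A records, so a domain that accepts mail without an MX record would be reported as missing.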


Author Comment

by:faithless1
Thanks. How would I run these scripts on the command line in GNOME Terminal? I have a file with 100k emails named emails.txt. Thanks
Expert Comment

by:apresence
Drop the attached code into a file called validate_emails.pl.  Make sure to run "chmod 700 validate_emails.pl" to mark it as executable.

To process your text file and see the output, run this from the directory that contains the script:
./validate_emails.pl < emails.txt

To process your text file and save the output, just use:
./validate_emails.pl < emails.txt > output.txt

Sample testing output:
root@beta:~/exex/test13 $ cat emails.txt
foo
foo@bar
foo@bar.baz
support@microsoft.com
root@beta:~/exex/test13 $ ./validate_emails.pl <emails.txt
foo invalid
foo@bar invalid
foo@bar.baz invalid
support@microsoft.com valid
root@beta:~/exex/test13 $
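#!/usr/bin/perl

# Returns 1 if the address is well-formed and its domain has an MX or A
# record, 0 otherwise.
sub valid_address {
  my($addr) = @_;
  my($domain, $valid);

  # Check the format first so only plausible domains reach nslookup
  return(0) unless ($addr =~ /^[^@]+@([-\w]+\.)+[A-Za-z]{2,4}$/);

  $domain = (split(/@/, $addr))[1];

  # Return 0 (not -1) if nslookup cannot be started: -1 is true in Perl
  # and would be reported as 'valid' by the loop below
  $valid = 0;
  open(DNS, "nslookup -q=any $domain |") || return(0);
  while (<DNS>) {
    # \Q...\E keeps the dots in the domain from acting as wildcards
    $valid = 1 if (/^\Q$domain\E.*\s(mail exchanger|internet address)\s=/);
  }
  close(DNS);
  return($valid);
}

# Read addresses from STDIN (or files named on the command line), one per line
while (<>) {
  $addy = $_;
  $addy =~ s/\s+$//;
  if (length($addy))
  {
    print "$addy " . (valid_address($addy) ? 'valid' : 'invalid') . "\n";
  }
}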
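
Once the results are saved, you will probably want the good and bad addresses in separate files. A minimal sketch, assuming every line of the saved output ends in " valid" or " invalid" as in the sample above; split_results and the file names in the example are just placeholders.

```shell
#!/bin/sh
# split_results is a hypothetical helper, not part of the original script.
# Usage: split_results RESULTS_FILE VALID_OUT INVALID_OUT
# It assumes each line of RESULTS_FILE ends in " valid" or " invalid".
split_results() {
  # Strip the trailing verdict and keep only the address
  awk '/ valid$/   { sub(/ valid$/, "");   print }' "$1" > "$2"
  awk '/ invalid$/ { sub(/ invalid$/, ""); print }' "$1" > "$3"
}

# Example: split_results output.txt valid.txt invalid.txt
```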

Accepted Solution

by:
apresence earned 500 total points
Since you are going to be checking 100k e-mails, there are almost certainly going to be duplicate lookups for the same domain.  To optimize this, I've attached a new version of the script that caches the result of each domain lookup, so every domain is only queried once.  That should make your script run faster.

Uncomment the following line to get the cache hit information:
#print "[cached_result] ";

Testing with that line uncommented:
root@beta:~/exex/test13 $ cat emails.txt
foo
foo@bar
bar@bar
foo@barific.baz
bar@barific.baz
support@microsoft.com
abuse@microsoft.com
root@beta:~/exex/test13 $ ./validate_emails.pl <emails.txt
foo invalid
foo@bar invalid
bar@bar invalid
foo@barific.baz invalid
[cached_result] bar@barific.baz invalid
support@microsoft.com valid
[cached_result] abuse@microsoft.com valid
root@beta:~/exex/test13 $
#!/usr/bin/perl

# Results of earlier lookups, keyed by domain
%lookup_cache = ();

sub valid_address {
  my($addr) = @_;
  my($domain, $valid);

  # Lower-case address
  $addr = lc($addr);

  # Validate format of address
  return(0) unless ($addr =~ /^[^@]+@([-\w]+\.)+[a-z]{2,4}$/);

  # Grab domain
  $domain = (split(/@/, $addr))[1];

  # Lookup and return cached result if it exists
  if (exists $lookup_cache{$domain})
  {
    #print "[cached_result] ";
    return $lookup_cache{$domain};
  }

  # Do domain lookup
  $valid = 0;
  if (open(DNS, "nslookup -q=any $domain |"))
  {
    while (<DNS>) {
      # \Q...\E keeps the dots in the domain from acting as wildcards
      $valid = 1 if (/^\Q$domain\E.*\s(mail exchanger|internet address)\s=/i);
    }
    close(DNS);
  }

  # Store cached result for later
  $lookup_cache{$domain} = $valid;

  return $valid;
}

# Read addresses from STDIN (or files named on the command line), one per line
while (<>) {
  $addy = $_;
  $addy =~ s/\s+$//;
  if (length($addy))
  {
    print "$addy " . (valid_address($addy) ? 'valid' : 'invalid') . "\n";
  }
}
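
Because the cache is keyed by domain, the run time depends mostly on how many distinct domains are in the list, not on how many addresses. If you want a rough idea of that number before starting, a sketch like this lists the distinct domains (unique_domains is a made-up helper name; it does not apply the format check, so it may list a few domains the script would reject without a lookup).

```shell
#!/bin/sh
# unique_domains is a hypothetical helper, not part of the original script.
# It reads addresses on stdin and prints each distinct lower-cased domain once.
unique_domains() {
  awk -F'@' 'NF == 2 {
    d = tolower($2)
    sub(/[ \t\r]+$/, "", d)    # drop trailing whitespace, as the script does
    if (d != "") print d
  }' | LC_ALL=C sort -u
}

# Example: unique_domains < emails.txt | wc -l
```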

Author Comment

by:faithless1
Superb, thanks a million! I appreciate it.