Solved

SED Email Validation

Posted on 2010-09-05
Last Modified: 2013-12-26
Hi,

I have a large list of emails and I need a script that will validate the syntax, MX and DNS records. I run Ubuntu through a virtual box and ideally need a script I can execute on the command line.

Thank you in advance
Question by:faithless1
 
LVL 6

Expert Comment

by:apresence
ID: 33609005
This would be a bit of a pain to do in SED.  Here's how to do it in perl:
perl -ne 'exit 1 if (!/^\w[\w\-\_\.]*\@\w[\w\-\_]*(\.\w[\w\-\_]*)+$/)'

The command exits with status 0 if the e-mail address is valid, or 1 if it is not.

Testing:
root@beta:~/exex $ echo foo | perl -ne 'exit 1 if (!/^\w[\w\-\_\.]*\@\w[\w\-\_]*(\.\w[\w\-\_]*)+$/)'; echo $?
1
root@beta:~/exex $ echo foo@bar | perl -ne 'exit 1 if (!/^\w[\w\-\_\.]*\@\w[\w\-\_]*(\.\w[\w\-\_]*)+$/)'; echo $?
1
root@beta:~/exex $ echo foo@bar.com | perl -ne 'exit 1 if (!/^\w[\w\-\_\.]*\@\w[\w\-\_]*(\.\w[\w\-\_]*)+$/)'; echo $?
0
root@beta:~/exex $ echo 0oo@bar.com | perl -ne 'exit 1 if (!/^\w[\w\-\_\.]*\@\w[\w\-\_]*(\.\w[\w\-\_]*)+$/)'; echo $?
0
root@beta:~/exex $


If you want to go a step further and only accept specific suffixes such as .com, .net and .org, use the version below. Make sure EVERY domain suffix you want to allow is listed (.biz, .tv, etc.), because anything not listed is rejected.
perl -ne 'exit 1 if (!/^\w[\w\-\_\.]*\@\w[\w\-\_]*(\.\w[\w\-\_]*)*(\.com|\.net|\.org)$/)'

Testing:
root@beta:~/exex $ echo foo@bar.com | perl -ne 'exit 1 if (!/^\w[\w\-\_\.]*\@\w[\w\-\_]*(\.\w[\w\-\_]*)*(\.com|\.net|\.org)$/)'; echo $?
0
root@beta:~/exex $ echo foo@bar.foo | perl -ne 'exit 1 if (!/^\w[\w\-\_\.]*\@\w[\w\-\_]*(\.\w[\w\-\_]*)*(\.com|\.net|\.org)$/)'; echo $?
1
 
LVL 6

Expert Comment

by:apresence
ID: 33609010
The above just validates the format of the e-mail address, not the MX/DNS records... please ignore.
 
LVL 6

Expert Comment

by:apresence
ID: 33609026
Check this out for validating the MX/DNS records (again, perl not SED):
http://www.usenix.org/publications/perl/perl17.html

SED is really only suited for editing/validating text, not for doing things like querying domain servers.  Perl is a much better option for this.
        sub valid_address {
        	my($addr) = @_;
        	my($domain, $valid);
        	return(0) unless ($addr =~ /^[^@]+@([-\w]+\.)+[A-Za-z]{2,4}$/);
        	$domain = (split(/@/, $addr))[1];
        	$valid = 0; open(DNS, "nslookup -q=any $domain |") || return(-1);
        	while (<DNS>) {
        		$valid = 1 if (/^$domain.*\s(mail exchanger|internet address)\s=/);
        	}
        	return($valid);
        }
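
If you would rather test a single domain straight from the shell, the same idea can be sketched with dig instead of parsing nslookup output. This assumes dig is installed (on Ubuntu it is typically in the dnsutils package); has_mail_host is an illustrative name, not part of the script above. It accepts a domain that has an MX record or, failing that, an A record, since mail delivery falls back to the address record when no MX exists:

```shell
#!/bin/bash
# Succeed if the domain has an MX record, or an A record as a fallback.
# has_mail_host is an illustrative name; it assumes dig is installed.
has_mail_host() {
  local domain=$1
  [ -n "$(dig +short MX "$domain")" ] || [ -n "$(dig +short A "$domain")" ]
}

# Usage:
#   has_mail_host example.com && echo "example.com has mail-related DNS records"
```

This only shows that DNS records exist; it does not prove that a particular mailbox exists.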


 

Author Comment

by:faithless1
ID: 33614266
Thanks. How would I run these scripts on the command line in Gnome terminal? I have a file with 100k emails named (emails.txt). Thanks
 
LVL 6

Expert Comment

by:apresence
ID: 33614345
Drop the attached code into a file called validate_emails.pl.  Make sure to run "chmod 700 validate_emails.pl" to mark it as executable.

To process your text file and see the output, just use:
./validate_emails.pl < emails.txt

To process your text file and save the output, just use:
./validate_emails.pl < emails.txt > output.txt

Sample testing output:
root@beta:~/exex/test13 $ cat emails.txt
foo
foo@bar
foo@bar.baz
support@microsoft.com
root@beta:~/exex/test13 $ ./validate_emails.pl <emails.txt
foo invalid
foo@bar invalid
foo@bar.baz invalid
support@microsoft.com valid
root@beta:~/exex/test13 $
#!/usr/bin/perl

sub valid_address {
  my($addr) = @_;
  my($domain, $valid);
  return(0) unless ($addr =~ /^[^@]+@([-\w]+\.)+[A-Za-z]{2,4}$/);
  $domain = (split(/@/, $addr))[1];
  # Treat a failed lookup as invalid; returning -1 would be truthy and print "valid"
  $valid = 0; open(DNS, "nslookup -q=any $domain |") || return(0);
  while (<DNS>) {
    $valid = 1 if (/^$domain.*\s(mail exchanger|internet address)\s=/);
  }
  return($valid);
}

while (<>) {
  $addy = $_;
  $addy =~ s/\s+$//;
  if ($addy)
  {
    print "$addy " . (valid_address($addy) ? 'valid' : 'invalid') . "\n";
  }
}
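
Once the results are saved, you will probably want the valid and invalid addresses in separate files. A small sketch with sed, assuming the "address valid" / "address invalid" line format printed by the script above (split_results and the two output file names are illustrative):

```shell
#!/bin/bash
# Split validate_emails.pl output ("address valid" / "address invalid") into two lists.
# split_results, valid.txt and invalid.txt are illustrative names.
split_results() {
  local results=$1
  # Print only the lines ending in " valid", with that suffix stripped.
  sed -n 's/ valid$//p' "$results" > valid.txt
  # Same for the lines ending in " invalid".
  sed -n 's/ invalid$//p' "$results" > invalid.txt
}

# Usage:
#   ./validate_emails.pl < emails.txt > output.txt
#   split_results output.txt
```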

 
LVL 6

Accepted Solution

by:
apresence earned 500 total points
ID: 33614506
Since you are going to be checking 100k e-mails, there are almost certainly going to be duplicate lookups for the same domain.  To cut those down, I've attached a new version of the script that caches the result of each domain lookup, so every domain is only queried once.  That should make your script run faster.

Uncomment the following line to get the cache hit information:
#print "[cached_result] ";

Testing with that line uncommented:
root@beta:~/exex/test13 $ cat emails.txt
foo
foo@bar
bar@bar
foo@barific.baz
bar@barific.baz
support@microsoft.com
abuse@microsoft.com
root@beta:~/exex/test13 $ ./validate_emails.pl <emails.txt
foo invalid
foo@bar invalid
bar@bar invalid
foo@barific.baz invalid
[cached_result] bar@barific.baz invalid
support@microsoft.com valid
[cached_result] abuse@microsoft.com valid
root@beta:~/exex/test13 $
#!/usr/bin/perl

# Cache of lookup results, keyed by lower-cased domain
%lookup_cache = ();

sub valid_address {
  my($addr) = @_;
  my($domain, $valid);

  # Lower-case address
  $addr = lc($addr);

  # Validate format of address
  return(0) unless ($addr =~ /^[^@]+@([-\w]+\.)+[a-z]{2,4}$/);

  # Grab domain
  $domain = (split(/@/, $addr))[1];

  # Look up and return the cached result if one exists
  if (exists $lookup_cache{$domain})
  {
    #print "[cached_result] ";
    return $lookup_cache{$domain};
  }

  # Do domain lookup
  $valid = 0;
  if (open(DNS, "nslookup -q=any $domain |"))
  {
    while (<DNS>) {
      $valid = 1 if (/^$domain.*\s(mail exchanger|internet address)\s=/i);
    }
  }

  # Store cached result for later
  $lookup_cache{$domain} = $valid;

  return $valid;
}

while (<>) {
  $addy = $_;
  $addy =~ s/\s+$//;
  if ($addy)
  {
    print "$addy " . (valid_address($addy) ? 'valid' : 'invalid') . "\n";
  }
}
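
To get a feel for how much the cache will help, you can count the distinct domains in the list before running the script. A rough sketch (unique_domains is an illustrative name; it takes everything after the last @ on each line and lower-cases it, much as the script does):

```shell
#!/bin/bash
# Print each distinct, lower-cased domain from a list of addresses on stdin.
# unique_domains is an illustrative name.
unique_domains() {
  grep '@' | sed 's/.*@//' | tr 'A-Z' 'a-z' | sort -u
}

# Usage: unique_domains < emails.txt | wc -l
```

The fewer distinct domains there are compared with the total number of lines, the more lookups the cache avoids.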

 

Author Comment

by:faithless1
ID: 33621650
Superb, thanks a million! I appreciate it.
