Avatar of itcdr
itcdr

MultiThreaded whois app

I created a multithreaded whois program with Perl. It is much faster with threads. The problem is that it hangs, on average, 1 out of every 4 times you run the program against a list of 100 domains.


# ***** START CODE *****
#!/usr/bin/perl
use IO::Socket;
use warnings;
use threads;
use threads::shared;

#Start timer
my $before = time();

#Variables
my $tld=$ARGV[1];
my $tcount=5;
my @out : shared;
my @err : shared;

#Open domain list file and assign it to an array
open(DOM, $ARGV[0]);
my @domains = <DOM>;
close(DOM);

#Calculate domains per thread count
my $elements = @domains;
my $num = int($elements/$tcount)+1;

#Split domains into equal parts and start a new thread for each part
print "Scanning $ARGV[0] ($elements domains)...";
for(my $i=0; $i<$tcount; $i++)
{
  @{"d$i"} = splice(@domains,0,$num);
  ${"t$i"} = threads->new(\&loop, $i);
}
for(my $j=0; $j<$tcount; $j++){ ${"t$j"}->join;}

sub loop
{
  my ($x) = @_; #Get thread ID
  my $el=@{"d$x"};
  print "\n$x - start ($el domains)\n";

  #Loop through each domain
  foreach(@{"d$x"})
  {  
    chomp;
    my $dm=lc($_);
    print "( $dm $x-1)";
    my $status="UNKNOWN";

    #Query registry whois database
    $sock=IO::Socket::INET->new(Proto=>'tcp',PeerAddr=>"whois.crsnic.net:43",Timeout=>1) or error($sock,$dm,$!,"Connect");
    $sock->print("domain $dm.$tld\n") or error($sock,$dm,$!,"Print");
    my @result=$sock->getlines or error($sock,$dm,$!,"Get");
    close($sock) or error($sock,$dm,$!,"Close");
    if(scalar(@result)<1){ error($sock,$dm,"Empty","Result"); }
 
    #Scan whois results for needed info
    foreach(@result)
    {
      if(/Status: (.*)/){ $status=$1; }
    }
   
    push(@out,"$dm\t$status\n");

NEXT:
    print "( $dm $x-2)";
  }
  print "\n$x - fin\n";
}

sub error
{
  my ($sck,$dom,$er,$type) = @_;
  print "\nERROR $type: $dom --> $er\n";
  if (defined($sck)){ close($sck); }
  push(@err,"$dom\n");
  goto NEXT;
}

print "\nCleaning up...\n";

#Output results to file
open(OUT,">> out");
print OUT (@out);
close(OUT);

#Output errors to file
open(ERR,">> $tld.err");
print ERR (@err);
close(ERR);

#Cleanup
@out = ();
for(my $i=0; $i<$tcount; $i++) {@{"d$i"}=(); }

#Output completed time
my $total = time()-$before;
print "\nCompleted in $total seconds.\n\n";

# ***** END CODE *****


It seems to always freeze after 3 batches. It prints "x - fin", where x is the thread id, then hangs. For 100 domains the output looks like this:

0 - start (21 domains)

1 - start (21 domains)

2 - start (21 domains)

3 - start (21 domains)

4 - start (16 domains)

( domain1 3-1 ) ( domain1 3-2 )( domain2 3-1 )( domain1 3-2 ) ...
3-fin
( domain1 0-1 ) ( domain1 0-2 )( domain2 0-1 )( domain1 0-2 )...
0-fin
( domain1 2-1 ) ( domain1 2-2 )( domain2 2-1 )( domain1 2-2 ) ...
2-fin
//This is where the program hangs. I tried letting it sit for a while, but still nothing. It doesn't happen every time. I tried running the program against a file with 100 domains and on average it hangs about every 4th run. Any ideas?
Avatar of ozo
ozo

If your IO::Socket is older than version 1.18, you may need to set autoflush manually.
Avatar of Tintin
Tintin

What version of Perl are you using?  Remember that threads support is still reasonably new and has had various issues with older versions of Perl.
Avatar of itcdr

ASKER

[root@gateway ~]# rpm -q perl
perl-5.8.5-9
Avatar of itcdr

ASKER

How do I find what version of IO::Socket I am using?
print $IO::Socket::VERSION;
Avatar of itcdr

ASKER

print $IO::Socket::VERSION;

1.28
Avatar of itcdr

ASKER

I don't know if this helps, but I also get this error sometimes:

  Operation now in progress


The program continues, and this doesn't seem to affect whether or not the program hangs on that run.
ASKER CERTIFIED SOLUTION
Avatar of ozo
ozo

This solution is only available to members.
Avatar of itcdr

ASKER

I checked and it's not hanging on the getlines statement. As you can see in the code above, I have 2 print statements: one at the beginning of the for loop and one at the end. Whenever the first prints, so does the second. The program always hangs after this line, which is just outside the for loop.

print "\n$x - fin\n";


The program usually hangs about 3/4 of the way through.
The above program seems to be waiting for
1-fin and 4-fin
So I'd be interested to see how far 1 and 4 had gotten.
Avatar of itcdr

ASKER

I'm not sure. That's all that is printed to screen. If 1 or 4 had started they should have printed at least 1 domain, but they didn't. Any ideas?
Output printed to a terminal device is still line buffered by default, so you probably won't see anything until it gets to the
print "\n$x - fin\n";
Avatar of itcdr

ASKER

I'm sorry. I don't know what that means. Is there a way to find out where the program is freezing by forcing a print to screen?
The simplest way to force a print to screen is to print "\n";
You could also set $|=1 before the print.
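
A minimal sketch of the tip above, applied to progress markers like the ones in the original script. The `trace` helper is hypothetical (it is not part of the posted program); it only illustrates that with autoflush on, each marker appears as soon as it is printed, so the last marker on screen shows the last statement reached.

```perl
use strict;
use warnings;
use IO::Handle;

$| = 1;                  # autoflush the currently selected handle (STDOUT by default)
STDOUT->autoflush(1);    # same effect, spelled out for a named handle

# Hypothetical helper: build and print a marker such as "( example 2-1)".
sub trace {
    my ($domain, $thread, $step) = @_;
    my $marker = "( $domain $thread-$step)";
    print $marker;       # shows up immediately because autoflush is on
    return $marker;
}

trace('example', 2, 1);
trace('example', 2, 2);
print "\n";
```

STDERR is unbuffered by default, so printing the markers there is another option.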
Avatar of itcdr

ASKER

I did as you said and this time the output was much different. That was the reason I was seeing the output in bulk. I ran it twice and both times it froze. The interesting part is that both times an odd number of domains was printed to screen. Since I am printing each domain twice, the output should contain an even number of domains when it froze. I looked at the output and found that the batch that didn't finish (group 2) had only 1 domain printed out ( domain 2-1 ). The second print statement is missing. So the program is not freezing just outside the for loop as I thought. Thanks for the tip. I'll run a few more tests and show you what I got.
Avatar of itcdr

ASKER

You were right. I tried the following as you said, but this time with $|=1 and the first statement printed but never the second. I tried it a few times and found the results were the same. I didn't see it before because I didn't use $|=1, so the first statement never printed either. Thanks again for that tip.

print "( $dm $x-1.1)\n";
my @result=$sock->getlines or error($sock,$dm,$!,"Get");
print "( $dm $x-1.2)\n";


Now that we know for sure what line the problem occurs on how do I fix it? Thank you for all your help so far.
Avatar of itcdr

ASKER

I came up with the following to try to stop the program from hanging:

eval
{
  local $SIG{ALRM} = sub{ error($sock,$dm,$!,"Get",$x) };
  alarm 5;
  my @result=$sock->getlines or error($sock,$dm,$!,"Get");
  alarm 0;
};

It seems to stop the program from hanging, but it instead exits out and prints out "Alarm Clock." How can I make it so the getlines statement will timeout without exiting the entire program? Why does the program print "Alarm Clock"?
It seems like the getlines is failing to Timeout
If it were not threaded, I might try to force it to time out by setting an alarm, but I'm not sure how threads interact with signals.
Maybe try Blocking => 0?
I was afraid of that.  I'm not sure how to determine which thread will get the ALRM signal.
You may have to set a global $SIG{ALRM} catcher.
If you're not on MSWindows, can you try it with forks instead of threads?
 
"Alarm Clock" is printed by the shell when a process exits because of an unhandled ALRM signal. Does your "error" subroutine call "die" or one of its synonyms? Perhaps if you tried it this way:

eval {
  local $SIG{ALRM} = sub { die "Timed out" };
  alarm 5;
  @result = $sock->getlines or die "$!";
  alarm 0;
};
error( $sock, $dm, $@, "Get") if $@;

A signal handler can't do a simple 'return'. That just counts as an unhandled signal.
Ozo's point about a global $SIG{ALRM} is also correct. The 'alarm' call can't ensure that the thread that calls it is the one that will be running when the signal arrives.

I seem to recall from earlier investigations that the timeouts on IO::Socket objects only apply to the setting up of the connection. The send and receive operations will not themselves time out based on the timeout supplied when the socket object was created.
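
If the Timeout option only covers the connect, as recalled above, one way to bound the read without signals is to wait on the socket with select before each read. A minimal sketch, assuming the server closes the connection once its reply is sent; `read_all_with_timeout` is a hypothetical helper name, and the timeout applies to each wait, not to the reply as a whole. sysread is used because buffered reads such as getlines do not mix well with select.

```perl
use strict;
use warnings;
use IO::Select;

# Hypothetical helper: read everything the peer sends until it closes the
# connection. Returns the data, or undef if no data arrives within $timeout
# seconds or if the read fails.
sub read_all_with_timeout {
    my ($sock, $timeout) = @_;
    my $sel  = IO::Select->new($sock);
    my $data = '';
    while (1) {
        # can_read returns an empty list when the timeout expires
        my @ready = $sel->can_read($timeout);
        return unless @ready;
        my $n = sysread($sock, $data, 4096, length $data);
        return unless defined $n;    # read error, $! says why
        last if $n == 0;             # EOF: the peer closed the connection
    }
    return $data;
}
```

In the loop from the original script, the getlines call could then be replaced by something like `my $reply = read_all_with_timeout($sock, 5);`, calling `error(...)` when the result is undefined, so only that domain is skipped.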




Avatar of itcdr

ASKER

1. I tried Blocking => 0 as
$sock=IO::Socket::INET->new(Proto=>'tcp',PeerAddr=>"$remote:43",Timeout=>1,Blocking=>0) or error($sock,$dm,$!,"Connect",$x);
There were no errors or hanging, but there was also no output at the end. The print statements showed that everything was going fine, but for some reason @result=$sock->getlines() came out empty for every domain. If I remove Blocking=>0 then it goes back to how it was, freezing every once in a while, but with output.

2. I tried the suggestion about the alarm, but the same thing happened. 5 seconds after the first error the whole program exits.

3. I am using Linux. Can you explain what you mean by using a forks instead of threads? What is the performance/stability difference between the 2 methods?


Any ideas? How can I add a timeout to the getlines without exiting the whole program? I just want to skip that domain and come back to it at the end, as I do with domains that time out on the connect statement.
Instead of using multiple threads with one connection each, how about doing multiple connects on one thread, and using select to wait for whichever socket responds first?


#!/usr/bin/perl
use IO::Socket;
use strict;
use warnings;

#Start timer
my $before = time();

#Variables
my $tld=$ARGV[1];
my $tcount=5;
my @out : shared;
my @err : shared;

#Open domain list file and assign it to an array
open(DOM, $ARGV[0]);
my @domains = <DOM>;
close(DOM);

#Calculate domains per thread count
my $elements = @domains;
my $num = int($elements/$tcount)+1;

#Split domains into equal parts and start the first query for each part
my @d;
my @t;
my @sock;
my @id;
my $rin='';
print "Scanning $ARGV[0] ($elements domains)...";
for(my $i=0; $i<$tcount; $i++){
  @{$d[$i]} = splice(@domains,0,$num);
  my $el=@{$d[$i]};
  print "\n$i - start ($el domains)\n";
  start_query($i);
}
while( $rin && (my $nfound = select my$rout=$rin,undef,undef,1) > 0 ){  
    my $x;
    my $fileno=0;
    while( !defined $x ){
      $x = $id[--$fileno] if vec($rout,$fileno++,1);
    }    
    undef $id[$fileno];
    vec($rin,$fileno,1) = 0;
    my $dm = lc($d[$x][0]);
    my $sock = $sock[$x];
    my $result='';
    #print STDERR "$dm $x-1.1\n";
    1 while sysread($sock,$result,1024,length $result);
    #print STDERR "$dm $x-1.2\n";
    eval{ close($sock) or error($sock,$dm,$!,"Close"); };
#do you really need to close the socket and re-open it before doing another query?
    eval{
      if( $result eq '' ){ error($sock,$dm,"Empty","Result"); }
       #Scan whois results for needed info
      my($status) =  $result =~ /Status: (.*)/;
      $status ||= 'UNKNOWN';
      push(@out,"$dm\t$status\n");
    };
    print "( $dm $x-2)";

    shift @{$d[$x]};
    if( @{$d[$x]} ){
        start_query($x);
    }else{
       print "\n$x - fin\n";
    }
}
sub start_query{
  my ($x) = @_; #Get thread ID
  for( $d[$x][0] ){  
    chomp;
    my $dm=lc($_);
    print "( $dm $x-1)";
    my $status="UNKNOWN";
    #Query registry whois database
    eval{
      $sock[$x]=IO::Socket::INET->new(Proto=>'tcp',PeerAddr=>"whois.crsnic.net:43",Timeout=>1) or error($sock[$x],$dm,$!,"Connect");
      $sock[$x]->print("domain $dm.$tld\n") or error($sock[$x],$dm,$!,"Print");
      $id[fileno $sock[$x]] = $x;
      vec($rin,fileno($sock[$x]),1) = 1;
   };
  }
}

sub error{
  my ($sck,$dom,$er,$type) = @_;
  print "\nERROR $type: $dom --> $er\n";
  if (defined($sck)){ close($sck); }
  push(@err,"$dom\n");
  die;
}

print "\nCleaning up...\n";

#Output results to file
open(OUT,">> out");
print OUT (@out);
close(OUT);

#Output errors to file
open(ERR,">> $tld.err");
print ERR (@err);
close(ERR);

#Cleanup
@out = ();
for(my $i=0; $i<$tcount; $i++) {@{$d[$i]}=(); }

#Output completed time
my $total = time()-$before;
print "\nCompleted in $total seconds.\n\n";  
Avatar of itcdr

ASKER

I tried the code you gave me and it froze sometimes on the following line just as my code froze on the getlines.

1 while sysread($sock,$result,1024,length $result);

Also, that way took much longer than the threaded version. Any ideas on how to add a timeout to the getlines() without exiting the entire program?
Are you trying without multithreading?
What does this print when it froze?

    print STDERR "$nfound ",pack('b*',$rout)," ",fileno($sock)," $fileno $dm $x-1.1\n";
    print STDERR length $result,"\n" while sysread($sock,$result,1024,length $result);
    print STDERR "$dm $x-1.2\n";

It may be faster to change it to keep the $sock connected instead of doing another close/new for each query
Avatar of itcdr

ASKER

Thanks ozo for your help on the other question about Non-Blocking sockets. That also solved most of my problems on this question. Now the program no longer freezes. I have just a couple more problems.

1. I receive the following error quite often. I looked it up and Microsoft has the following definition:

"Operation now in progress"
WinSock only allows a single blocking operation to be outstanding per task (or thread), and if you make any other function call (whether or not it references that or any other socket) the function fails with the WSAEINPROGRESS error. It means that there is a blocking operation outstanding. It is also possible that WinSock might return this error after an application calls connect() a second time on a non-blocking socket while the connection is pending (after the first connection failed with WSAEWOULDBLOCK).

Since I am now using non-blocking sockets it would seem the reason for the error would be the following:
"an application calls connect() a second time on a non-blocking socket while the connection is pending"

Any ideas on how to resolve this issue? I thought that my program closes the connection before moving on to the next domain.

2. You make a good point that it would be better to try to keep the connection. I don't quite understand your code. Can you explain how it works?


ozo, thank you for your awesome help so far.
Avatar of itcdr

ASKER

No, please don't close this question. The answer to the related question did indeed help this one, but there are still a couple of issues that are not solved.
whois.crsnic.net is returning a response that says

TERMS OF USE: You are not authorized to access or query our Whois
database through the use of electronic processes that are high-volume and
automated except as reasonably necessary to register domain names or
modify existing registrations;

I'm not sure we should be writing a program to flood it with requests like this.
Avatar of itcdr

ASKER

I wanted to create this program to check a domain's current status, similar to

1. DomainInspect - http://www.antssoft.com/domaininspect/
2. Domain Name Analyzer - http://www.domainpunch.com/products/dnapro/

Most registrars' Whois servers put a daily limit on queries. For example, Network Solutions (whois.networksolutions.com) allows up to 1,000 queries per day before temporarily blocking your IP. So it shouldn't be a problem to query 400-500 domains against that whois server. I am not even placing a query on the registrar. I am only querying the registry to get the status. Also, a part I left out of the program is that I randomly pick a whois server on each query. That way there shouldn't be any problems.
Avatar of itcdr

ASKER

According to the terms of use I am not allowed to query the whois database through the use of electronic processes that are high-volume. However, they do not state what is considered high-volume. I emailed the registry and asked them, but they have not responded. The registrar, Network Solutions, considers high-volume to be over 1,000 queries per day. The registry database gets many more queries than the registrar's database. For example, the command-line whois program that is shipped with linux has to first query the registry before it can figure out which registrar to query. So if the registrar considers high-volume to be over 1,000 per day, then I should have no problem performing under 1,000 queries against the registry, especially since I randomly pick from 3 different registry servers to query each time.

Does that work? If you still think I am breaking the rules, I can try to contact the registry again to find what they consider high-volume, but I'm not sure how long it would take to get an answer. What do you think?
Avatar of itcdr

ASKER

1. I read an article at http://www.whoisview.com/articles/highspeed.php, which is where I found that Network Solutions limits you to 1,000 lookups per day.

2. That same article states the following:
"Verisign/Netsol provides bulk access to the WHOIS data through a license agreement with them. For information, send an e-mail message to bulkwhois@netsol.com."
I tried emailing that address, bulkwhois@netsol.com, about a month ago and again today, but no reply.

3. I also read the FAQ's at http://www.domainpunch.com/products/dna/docs/faq.php, which states there are the following registry whois servers. These are the servers I planned to randomly choose when placing a query.

whois.crsnic.net
rs.internic.net
whois.nsiregistry.net

4. Because I did not receive a reply from bulkwhois@netsol.com I decided to look for another contact.
*Both http://crsnic.net and http://nsiregistry.net point to Verisign Naming and Directory Services.
*rs.internic.net goes to its own website, but there is no contact information. However, it does say, "Results for .com and .net are provided courtesy of Verisign Global Registry Services". Just as I thought, the registry for .com/.net, Verisign, controls all 3 whois servers. I went to the whois help section of Verisign at http://registrar.verisign-grs.com/whois/help.html. At the bottom of the page it states, "For general questions, comments and suggestions, or bug reports send email to info@verisign-grs.com."


So it seems the contact email for all 3 whois servers is info@verisign-grs.com. I just emailed them about an hour ago. You are welcome to do the same. If you do and get answer before I do, please post it. I'm sorry if I have caused any trouble.
Avatar of itcdr

ASKER

I just received a reply:

Dear Chris,
Thank you for contacting VeriSign Customer Service.

VeriSign and Network Solutions are two separate entities and owned individually.  If you would like information on the Network Solutions WHOIS tool, please contact them directly.  There are many other servers that perform WHOIS queries, but we do not recommend any particular site. You may use our WHOIS tool located at www.verisign-grs.com if you would like to query the database.  However we do not give out what the limits are.  If someone does reach the limit our Operations Team will restrict their access.

Best Regards,
Audra Johnson
Customer Service
VeriSign, Inc.
www.verisign.com
info@verisign-grs.com
Avatar of itcdr

ASKER

So I guess that we won't know the limit until it's too late.
Avatar of itcdr

ASKER

I'm sorry I don't quite understand what you mean by "You would also have the operational capacity of your program subject to unknown and untimely external influences. not optimal in any case."

I replied to the email with a few more questions about it. I'll let you know what I come up with. However, it doesn't look like we'll find out the actual daily limit, but since I have never been banned it would seem that I have not exceeded any limit. What do you think?
Avatar of itcdr

ASKER

I understand that the registry has the power to turn off my access.

Considering I have not been banned, doesn't that mean I haven't broken any rules?

Also, since the registry knows when I am placing a query and has the power to turn off my access, doesn't that mean I am not using any hacking techniques?

If both are true, then wouldn't it be all right to ask for help to create this program? Many other companies have created and sold domain checker software that queries the registry in the same manner. If they have never been accused of any wrongdoing, then shouldn't I be OK?
Avatar of itcdr

ASKER

I just received another reply from Verisign. I asked them about using the different whois servers. They replied with the following:

Dear Chris,

They are linked to the same database.  We do recommend that people connect to the server name specifically and not directly to the IP because we reserve the right to update the IP when we perform maintenance on the servers.  By breaking up your queries and hitting three servers it should help to keep you from performing too many queries to one machine.



So then that was a good idea to randomly choose the whois server. What's the final verdict?
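
The random server choice described here might look like the sketch below. The helper name `connect_to_any_server` and the callback argument are hypothetical, and trying the remaining servers when a connect fails is an added assumption rather than something the posted program does.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# The three registry whois hosts named earlier in this thread.
my @WHOIS_SERVERS = qw(whois.crsnic.net rs.internic.net whois.nsiregistry.net);

# Hypothetical helper: try the servers in random order and return the first
# socket obtained together with the host name used, or an empty list if
# every attempt fails. $connect is a caller-supplied callback that takes a
# host name and returns a socket (or undef), so this sketch opens no
# connections by itself.
sub connect_to_any_server {
    my ($connect) = @_;
    for my $server (shuffle @WHOIS_SERVERS) {
        my $sock = $connect->($server);
        return ($sock, $server) if defined $sock;
    }
    return;
}
```

The callback could wrap the connect call from the original script, for example `sub { IO::Socket::INET->new(Proto=>'tcp', PeerAddr=>"$_[0]:43", Timeout=>1) }`.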
Avatar of itcdr

ASKER

Fantastic, so then we can continue with my last couple of problems:

1. I receive the following error quite often. I looked it up and Microsoft has the following definition:

"Operation now in progress"
WinSock only allows a single blocking operation to be outstanding per task (or thread), and if you make any other function call (whether or not it references that or any other socket) the function fails with the WSAEINPROGRESS error. It means that there is a blocking operation outstanding. It is also possible that WinSock might return this error after an application calls connect() a second time on a non-blocking socket while the connection is pending (after the first connection failed with WSAEWOULDBLOCK).

Since I am now using non-blocking sockets it would seem the reason for the error would be the following:
"an application calls connect() a second time on a non-blocking socket while the connection is pending"

Any ideas on how to resolve this issue? I thought that my program closes the connection before moving on to the next domain.

2. You make a good point that it would be better to try to keep the connection. I don't quite understand your code. Can you explain how it works? How are you able to accomplish not closing the connection every time? I thought that the connection had to be closed and reopen again to perform another query. Can you explain the theory behind this?
1.  Sorry, I don't know much about WinSock, and the WinSock machine I would have tried testing it on has not recovered from a power failure.

2.  I was expecting a server designed for multiple queries, in which case it would make sense for it to keep the connection open while you send different commands, but when I tried it I found that the server closed the connection after the first response.  
That, and the text of the response made me wonder if maybe they didn't want me making all the queries I had been making.
The above code still closes and reopens just as the original code does.
Selecting a different server for each query is one of the things I would have suggested if you weren't already doing it, and it also provides a more useful error recovery option on a failed connect than just skipping that domain.
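
On Linux, which the asker is using, "Operation now in progress" is the message for errno EINPROGRESS. connect() sets it on a non-blocking socket when the connection has been started but has not completed yet, so by itself it does not mean the connect failed. A sketch of how that case might be told apart from a real failure; `is_still_pending` and `finish_connect` are hypothetical helper names, not part of the posted program.

```perl
use strict;
use warnings;
use Errno qw(EINPROGRESS EWOULDBLOCK EAGAIN);
use IO::Select;
use IO::Socket;
use Socket qw(SO_ERROR);

# Hypothetical helper: true when a numeric errno (for example $! + 0) only
# means "the non-blocking operation has not finished yet".
sub is_still_pending {
    my ($errno) = @_;
    return ($errno == EINPROGRESS || $errno == EWOULDBLOCK || $errno == EAGAIN) ? 1 : 0;
}

# Hypothetical helper: wait up to $timeout seconds for a pending connect on
# an IO::Socket object to finish. Returns 1 if the socket is connected,
# 0 on timeout or failure (on failure $! holds the reason).
sub finish_connect {
    my ($sock, $timeout) = @_;
    # A pending connect reports "writable" once it has succeeded or failed
    my @ready = IO::Select->new($sock)->can_write($timeout);
    return 0 unless @ready;                # still pending after $timeout
    my $err = $sock->sockopt(SO_ERROR);    # 0 means the connect succeeded
    return 0 unless defined $err;
    if ($err) { $! = $err; return 0; }     # real failure
    return 1;
}
```

With these, an error handler could skip the "ERROR Connect" report when `is_still_pending($! + 0)` is true and call `finish_connect` before sending the query.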
Avatar of itcdr

ASKER

Thanks for all your help.

To conclude:
1. Thanks to ozo's suggestion to use $|=1 I was able to determine where the problem was.
2. The problem was on the getlines() line like ozo thought.
3. I fixed the problem by switching to non-blocking sockets as ozo suggested.