Solved

multiple parallel remote connections with fsockopen or fopen

Posted on 2004-08-16
6
1,985 Views
Last Modified: 2008-01-09
Hello.

Below are two methods that can be used to find out if a remote file exists and is valid:

         <?php
         // Thanks, basiclife
         $fp = fsockopen("www.example.com", 80);
         if (!$fp) {
             echo "Unable to open\n";
         } else {
             stream_set_timeout($fp, 2);
             fwrite($fp, "GET / HTTP/1.0\r\nHost: www.example.com\r\n\r\n");
             $res = fread($fp, 2000);
             var_dump(stream_get_meta_data($fp));
             fclose($fp);
             echo $res;
         }
         ?>

and

         <?php
         // Thanks, hernst42
         function validate_remote($url)
         {
              if ($fp = @fopen($url, "r")) {
                   fclose($fp);
                   return "<font color=green>OK</font>";
              } else {
                   return "<font color=red>Error</font>";
              }
         }
         ?>

Both are extremely slow when validating 30 links of which about 10% are bad.

At first I thought it might be a memory issue, but the real problem seems to be that each URL is tested "single file", one after another.
So if 10 links are bad, and the script spends 30 seconds waiting for a response from each bad link, 300 seconds are wasted.

Is there a way to do this in "parallel"?  In other words, can I open up multiple connections with either fsockopen or fopen?  

This could cut the waiting time from 300 seconds to 45 seconds.  I know it can be done with Perl, but I have to use PHP here.

Thanks!
Question by:hankknight
6 Comments
 
LVL 4

Expert Comment

by:Skonen
ID: 11816169
You could always just set the connection timeout:

$timeout = 5; // 5 second timeout

if (!$fp = fsockopen("www.example.com", 80, $error_num, $error_str, $timeout)) {
  //error
}
else {
  //success
}

//Stuart Konen

 
LVL 3

Assisted Solution

by:KarveR
KarveR earned 50 total points
ID: 11816540
You need to validate against a page; the connection alone will return true as long as the server is alive.

<?php

$site = "www.google.com";
$page = "/index.html";

$timeout = 5;

$fp = fsockopen($site, 80, $errno, $errstr, $timeout);

if (!$fp) {
    echo "Unable to open\n";
} else {
    fwrite($fp, "GET $page HTTP/1.0\r\nHost: $site\r\n\r\n");
    $res = explode(' ', fread($fp, 28));
    fclose($fp);

    // pretty much anything other than 200 and your connection barfed
    switch ($res[1]) {
        case 200:
            echo "<font color=green>OK</font><br>";
            break;
        default:
            echo "<font color=red>FAILED</font><br>";
            break;
    }
}

?>

//karv
 
LVL 9

Assisted Solution

by:_GeG_
_GeG_ earned 50 total points
ID: 11818244
<?php
$refs = array('www.example1.com', 'www.example2.com', 'www.example3.com', 'www.example4.com');
$maxwait = 10; // seconds
$fp = array();
foreach ($refs as $i => $ref) {
    $fp[$i] = fsockopen($ref, 80);
    if (!$fp[$i]) {
        echo "Unable to open $ref\n";
        unset($fp[$i]);
    } else {
        stream_set_blocking($fp[$i], false);
        stream_set_timeout($fp[$i], $maxwait);
        fwrite($fp[$i], "GET / HTTP/1.0\r\nHost: $ref\r\n\r\n");
    }
} // now all servers are connected
for ($wait = 0; $wait < $maxwait && count($fp); $wait++) {
    foreach ($fp as $i => $f) {
        $result = fread($f, 2000);
        if (!empty($result)) {
            //use KarveR's method for checking
            //...
            fclose($f);
            unset($fp[$i]);
        }
    }
    sleep(1);
}
foreach ($fp as $i => $f) {
    echo "{$refs[$i]} did not respond\n";
}
?>

You will need PHP 4.3 for that; otherwise you have to use the older non-blocking functions. I didn't try to run it; that's just the way I imagine it would work ;)
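For installs older than 4.3, the equivalent calls would be the old socket_* names; an untested sketch, with the hostname and timeouts purely illustrative:

```php
<?php
// Pre-4.3 names for the stream functions used above.
$fp = fsockopen('www.example.com', 80, $errno, $errstr, 5);
if ($fp) {
    socket_set_blocking($fp, false); // renamed stream_set_blocking() in 4.3
    socket_set_timeout($fp, 10);     // renamed stream_set_timeout() in 4.3
    fwrite($fp, "GET / HTTP/1.0\r\nHost: www.example.com\r\n\r\n");
}
?>
```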

 
LVL 25

Accepted Solution

by:
Marcus Bointon earned 400 total points
ID: 11819678
The only problem with this approach is that it's not in the least bit unreasonable for a site to take more than 30 seconds to respond, so a short timeout almost guarantees bad results for your impatience! You really need to do this properly in parallel. cURL allows you to do this, and there's an example of how on this page:

http://www.php.net/manual/en/function.curl-multi-exec.php

You can use cURL to set off a whole load of connections at once, and either wait for status back from them all (as the example does) or use the callback options to get an individual callback for each connection.
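A minimal sketch of that curl_multi approach, assuming the cURL extension is compiled in; the hostnames are placeholders, not the asker's actual link list:

```php
<?php
// Check several URLs in parallel with the curl_multi interface.
$urls = array(
    'http://www.example.com/',
    'http://www.example.org/',
    'http://www.example.net/',
);

$mh = curl_multi_init();
$handles = array();
foreach ($urls as $i => $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // capture instead of printing
    curl_setopt($ch, CURLOPT_NOBODY, true);         // HEAD request: status only
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // per-handle limit in seconds
    curl_multi_add_handle($mh, $ch);
    $handles[$i] = $ch;
}

// Drive all transfers until every handle has finished.
$running = 0;
do {
    curl_multi_exec($mh, $running);
    if ($running) {
        curl_multi_select($mh, 1); // wait for activity instead of busy-looping
    }
} while ($running > 0);

foreach ($handles as $i => $ch) {
    $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    echo $urls[$i] . ': ' . ($code == 200 ? 'OK' : "Error ($code)") . "\n";
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);
?>
```

curl_multi_select() blocks until one of the transfers has activity, so the loop doesn't spin, and each handle's CURLOPT_TIMEOUT caps how long a single slow server can hold things up.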
 
LVL 9

Expert Comment

by:_GeG_
ID: 11821752
@Squinky:
In my code you can set $maxwait as high as you need. BTW, I think 30 seconds for the status response to an HTTP GET is way too long.

But I agree: if cURL is compiled into your PHP, that's the way to go.
 
LVL 25

Expert Comment

by:Marcus Bointon
ID: 11822276
Yes, I agree that 30 sec is high, but it's not beyond the realms of possibility (and sure, they should switch hosting providers!). My point is really that reducing the timeout period merely to speed up the interrogation process is asking for bad data. With the cURL technique, the total time you should have to wait for any number of connections (within reason) is the timeout period plus overhead, and a slow response from one server does not affect any other connections. If you're posting the results of this process to a database, the cURL approach lets you post the results of fast responses immediately, even if earlier slow servers have not come back yet. It also lets your scanner tune its behaviour and set a timeout period separately for every server.
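That per-server tuning could look like this; a sketch only, with the hostnames and timeout values invented for illustration:

```php
<?php
// Hypothetical per-host timeouts (seconds).
$timeouts = array(
    'http://www.example.com/' => 5,
    'http://www.example.org/' => 30, // a known-slow host gets more slack
);

$mh = curl_multi_init();
foreach ($timeouts as $url => $secs) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, true);          // status line only
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $secs); // limit the TCP connect
    curl_setopt($ch, CURLOPT_TIMEOUT, $secs);        // limit the whole transfer
    curl_multi_add_handle($mh, $ch);
}
// ... then drive the transfers with the curl_multi_exec() loop
// shown on the manual page linked above ...
?>
```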
