Solved

multiple parallel remote connections with fsockopen or fopen

Posted on 2004-08-16
Medium Priority
2,039 Views
Last Modified: 2008-01-09
Hello.

Below are two methods that can be used to check whether a remote file exists and is valid:

         <?php
         // Thanks, basiclife
         $fp = fsockopen("www.example.com", 80);
         if (!$fp) {
             echo "Unable to open\n";
         } else {
             // HTTP request lines are terminated with CRLF pairs
             fwrite($fp, "GET / HTTP/1.0\r\n\r\n");
             stream_set_timeout($fp, 2);
             $res = fread($fp, 2000);
             var_dump(stream_get_meta_data($fp));
             fclose($fp);
             echo $res;
         }
         ?>

and

         <?php
         // Thanks, hernst42
         function validate_remote($url)
         {
             if ($fp = @fopen($url, "r")) {
                 fclose($fp);
                 return "<font color=green>OK</font>";
             } else {
                 return "<font color=red>Error</font>";
             }
         }
         ?>

Both seem to be extremely slow when validating 30 links with about 10% of them bad.

At first I thought it might be a memory issue, but the problem actually seems to be that the URLs are tested single file, in series.
So, if 10 links are bad, and the script spends 30 seconds waiting for a response from each bad link, 300 seconds are wasted.

Is there a way to do this in "parallel"?  In other words, can I open up multiple connections with either fsockopen or fopen?  

This could cut the waiting time from 300 seconds to 45 seconds.  I know it can be done with Perl, but I have to use PHP here.

Thanks!
Question by:hankknight
6 Comments
 
LVL 4

Expert Comment

by:Skonen
ID: 11816169
You could always just set the connection timeout:

$timeout = 5; // 5 second timeout

if (!$fp = fsockopen("www.example.com", 80, $error_num, $error_str, $timeout)) {
  //error
}
else {
  //success
}

//Stuart Konen

 
LVL 3

Assisted Solution

by:KarveR
KarveR earned 200 total points
ID: 11816540
You need to validate against a page; the connection alone will return true as long as the server is alive.

<?php

$site = "www.google.com";
$page = "/index.html";

$timeout = 5;

$fp = fsockopen($site, 80, $errno, $errstr, $timeout);

if (!$fp) {

      echo "Unable to open\n";

} else {

      fwrite($fp, "GET $page HTTP/1.0\r\n\r\n");

      // the status line looks like "HTTP/1.0 200 OK"; 28 bytes is enough to capture it
      $res = explode(' ', fread($fp, 28));

      fclose($fp);

      // pretty much anything other than 200 and your connection barfed
      if ($res[1] == '200') {
            echo "<font color=green>OK</font><br>";
      } else {
            echo "<font color=red>FAILED</font><br>";
      }

}

?>

//karv
 
LVL 9

Assisted Solution

by:_GeG_
_GeG_ earned 200 total points
ID: 11818244
<?php
$refs = array('www.example1.com', 'www.example2.com', 'www.example3.com', 'www.example4.com');
$maxwait = 10; // seconds
$fp = array();
foreach ($refs as $i => $ref) {
    $fp[$i] = fsockopen($ref, 80);
    if (!$fp[$i]) {
        echo "Unable to open $ref\n";
        unset($fp[$i]);
    } else {
        stream_set_blocking($fp[$i], false);
        stream_set_timeout($fp[$i], $maxwait);
        fwrite($fp[$i], "GET / HTTP/1.0\r\n\r\n");
    }
} // now all servers are connected
for ($wait = 0; $wait < $maxwait && count($fp); $wait++) {
    foreach ($fp as $i => $f) {
        $result = fread($f, 2000);
        if (!empty($result)) {
            // use KarveR's method for checking
            // ...
            fclose($f);
            unset($fp[$i]);
        }
    }
    sleep(1);
}
foreach ($fp as $i => $f) {
    echo "{$refs[$i]} did not respond within $maxwait seconds\n";
}
?>

You will need PHP 4.3 for that; otherwise you need to use the older non-blocking functions. I didn't try to run it; that's just the way I imagine it would work ;)
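For pre-4.3 code, presumably the older socket_* aliases would be the drop-in replacements:

socket_set_blocking($fp[$i], false);   // older name for stream_set_blocking
socket_set_timeout($fp[$i], $maxwait); // older name for stream_set_timeout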
 
LVL 25

Accepted Solution

by:Marcus Bointon
Marcus Bointon earned 1600 total points
ID: 11819678
The only problem with this approach is that it's not in the least bit unreasonable for a site to take more than 30 seconds to respond, so a short timeout almost guarantees bad results for your impatience! You really need to do this properly in parallel. CURL allows you to do this, and there's an example of how to do it on this page:

http://www.php.net/manual/en/function.curl-multi-exec.php

You can use CURL to set off a whole load of connections at once and you can either wait for status back from them all (as the example does), or make use of callback options to get individual callbacks for each connection separately.
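For illustration, a minimal sketch of that technique (not from the thread; the hostnames are placeholders, and it assumes the cURL extension is compiled into PHP):

<?php
// check several URLs in parallel with curl_multi
$urls = array('http://www.example1.com/', 'http://www.example2.com/', 'http://www.example3.com/');

$mh = curl_multi_init();
$handles = array();
foreach ($urls as $i => $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // capture output instead of printing it
    curl_setopt($ch, CURLOPT_NOBODY, 1);         // send HEAD: only the status code is needed
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);       // per-transfer timeout, in seconds
    curl_multi_add_handle($mh, $ch);
    $handles[$i] = $ch;
}

// drive all transfers at once, as in the manual's example
// (curl_multi_select() can be used to avoid busy-waiting)
$running = 0;
do {
    curl_multi_exec($mh, $running);
} while ($running > 0);

foreach ($handles as $i => $ch) {
    $code = curl_getinfo($ch, CURLINFO_HTTP_CODE); // 0 means the connection itself failed
    echo $urls[$i] . ': ' . ($code == 200 ? 'OK' : "FAILED ($code)") . "\n";
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);
?>

A slow server only delays its own handle, so the total wall time is roughly that of the slowest single transfer rather than the sum. This sketch uses HEAD requests (CURLOPT_NOBODY); drop that option for servers that mishandle HEAD.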
 
LVL 9

Expert Comment

by:_GeG_
ID: 11821752
@squinky:
In my code you can set $maxwait as high as you need. BTW, I think 30 seconds for the status response to an HTTP GET is way too long.

But I agree: if curl is included in your PHP, that's the way to go.
 
LVL 25

Expert Comment

by:Marcus Bointon
ID: 11822276
Yes, I agree that 30 sec is high, but it's not beyond the realms of possibility (and sure, they should switch hosting providers!). My point is really that reducing the timeout period just to speed up the interrogation process is just asking for bad data. With the CURL technique, the total time you should have to wait for any number of connections (within reason) is the timeout period, plus overhead, and a slow response from one server does not affect any other connections. If you're posting the results of this process to a database, the CURL approach will let you post the results of fast responses immediately, even if earlier slow servers have not come back yet. It's also possible to allow your scanner to tune its responses, and set a timeout period separately for every server.
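As a minimal sketch of that last point (hypothetical URLs and timeout values), each handle can carry its own limits before being added to the multi handle:

// hypothetical per-server timeouts, in seconds
$servers = array('http://fast.example.com/' => 5, 'http://slow.example.com/' => 45);
foreach ($servers as $url => $seconds) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $seconds); // limit the connect phase
    curl_setopt($ch, CURLOPT_TIMEOUT, $seconds);        // limit the whole transfer
    curl_multi_add_handle($mh, $ch);                    // $mh as in the earlier sketch
}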