duncanb7

asked on

LoadHTMLFile() loading time in PHP

Dear Expert,

In the PHP code below, I automate grabbing data from 10 web pages on the same website.
Sometimes the script hangs, and I guess the reason is that the URL is busy, so loadHTMLFile() keeps waiting.
Occasionally it takes a really long time, such as 15-30 minutes, even though each page is small (about 30 KB).
Why does loadHTMLFile() have no time limit, and why does it not fall through to the next line of code once the wait
is too long?
Is there a way to put a timer on the wait in PHP, so that once the time has expired the script
goes on to the next line of code in the same PHP program?
In VBA I would use the loop below. That works because the code inside the loop is never idle and never waits forever,
but a similar while loop won't work for PHP's loadHTMLFile(). Any suggestions would be appreciated.

VBA code to set timer for waiting
=======================
a=Time()
Do until   TimeValue(Time()) - TimeValue(a) > TimeValue("00:01:00")
'code here
Loop




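<?php
// Repeat the whole batch up to 4 times if any of the 10 pages fails to load.
for ($k = 0; $k < 4; ++$k) {
    try {
        for ($c = 0; $c < 10; ++$c) {
            $url = 'http://www.othersite.com/ex.aspx?symbol=' . $c;
            $dom = new DOMDocument();
            // loadHTMLFile() returns false on failure instead of throwing,
            // so convert a failure into an exception for the catch block.
            if (!@$dom->loadHTMLFile($url)) {
                throw new RuntimeException("could not load page " . $c);
            }
            echo "start=" . $c;
            $table = $dom->getElementsByTagName('table')->item(2);
            if ($table === null) {
                throw new RuntimeException("third table not found on page " . $c);
            }
            $data = $table->nodeValue;
            echo "Success passing getElementsByTagName " . $c;
        }
        break; // No error during loadHTMLFile() for all 10 pages, so no redo is needed.
    } catch (Exception $e) {
        echo "Found error: " . $e->getMessage();
    }
}
?>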


ASKER CERTIFIED SOLUTION
Ray Paseur

duncanb7

ASKER

No never

How do I use the function, and where should I put it?
Where can I set the timeout, for example to 60 seconds only? Does $timeout=3 mean 3 minutes?
You mean use cURL to get the HTML file instead of loadHTMLFile(), right?
Yes, you would use the function to get the HTML file into a string variable in your script.  Then you could store the string on your own server and process it there.  There would be no further delay because of HTTP or remote server issues.

The script I posted above has an example of the use case (actually 2 examples).  These start on line 85.
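
As a rough illustration of that approach (not the my_curl function from the posted script), the sketch below fetches a page into a string with a time limit and then parses the string locally. It assumes the cURL extension is available; the names fetch_html and table_text, their parameters, and the 60-second default are hypothetical choices made for this example.

```php
<?php
// Hypothetical helper: download $url into a string, giving up after
// $timeout seconds. Returns null on any failure, including a timeout.
function fetch_html(string $url, int $timeout = 60): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);               // return the body instead of printing it
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, min(10, $timeout));  // seconds allowed for connecting
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);                  // seconds allowed for the whole request
    $html = curl_exec($ch);
    return is_string($html) ? $html : null;
}

// Hypothetical helper: parse an HTML string that is already in memory and
// return the text of the table at position $index, or null if it is absent.
function table_text(string $html, int $index): ?string
{
    if ($html === '') {
        return null;
    }
    $previous = libxml_use_internal_errors(true); // keep markup warnings quiet
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    $table = $dom->getElementsByTagName('table')->item($index);
    return $table === null ? null : $table->nodeValue;
}
```

Inside the loop from the question, `$html = fetch_html($url, 60);` would take the place of the loadHTMLFile() call, and a null result would mean that page can be skipped or retried. The value passed to CURLOPT_TIMEOUT is in seconds.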
Conclusion:
1- The timeout is working fine, and an error message appears once the timeout is reached.
2- The $arg array is sent, but the request always ends up on the wrong web page, probably because of a URL encoding issue.
I don't know why this happens in my case, so I need to build the URL with http_build_query() as follows:
$url = array('Symbol'=> $c);
my_curl("http://www...../test.aspx?&".http_build_query($url),5,True);

and then it works exactly as expected.
3- By my estimate, execution is about 40% faster using the cURL method than loadHTMLFile().
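
For reference, a small sketch of how http_build_query() turns an array into an encoded query string. The helper name build_url and the www.example.com address are placeholders for this illustration and do not come from the thread.

```php
<?php
// Hypothetical helper: append encoded query parameters to a base URL.
function build_url(string $base, array $params): string
{
    $query = http_build_query($params); // e.g. ['Symbol' => 3] becomes "Symbol=3"
    if ($query === '') {
        return $base;
    }
    // Use '?' for the first parameter and '&' if the URL already has a query.
    $separator = (strpos($base, '?') === false) ? '?' : '&';
    return $base . $separator . $query;
}

echo build_url('http://www.example.com/test.aspx', ['Symbol' => 3]), "\n";
// prints http://www.example.com/test.aspx?Symbol=3
```

Because http_build_query() URL-encodes each key and value, characters such as `&` or spaces inside a symbol cannot break the query string.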

Duncan


Taking out the $arg array code and function input
=====================================
function my_curl
( $url
/////////////////, $get_array=array()
, $timeout=3
, $error_report=TRUE
)

// PREPARE THE ARGUMENT STRING IF NEEDED
  //  $get_string = '';
   // foreach ($get_array as $key => $val)
   // {
     //   $get_string
       // = $get_string
        //. urlencode($key)
        //. '='
        //. urlencode($val)
        //. '&';
   // }
   // $get_string = rtrim($get_string, '&');
   // if (!empty($get_string)) $url .= '?' . $get_string;
Thanks for your reply.
The code helps a lot.