duncanb7
asked on
LoadHTMLFile() loading time in PHP
Dear Expert,
In the PHP code below I automatically grab data from 10 web pages on the same website.
Sometimes the script hangs. I guess the URL is busy, so loadHTMLFile() keeps waiting.
Occasionally it takes a really long time, such as 15-30 minutes, even though each page is small (about 30 KB).
Why does loadHTMLFile() have no time limit, and why doesn't it fall through to the next line of code once
the wait is too long?
Is there any way to put a timer on the wait in PHP, so that once the time expires the script
goes on to the next line of code in the same program?
In VBA I would use the loop below. It works there because the code inside the loop never sits idle or waits forever,
but a similar while loop won't work around PHP's loadHTMLFile(). Any suggestions, please advise.
VBA code to set a timer for waiting
=======================
a=Time()
Do until TimeValue(Time()) - TimeValue(a) > TimeValue("00:01:00")
'code here
Loop
<?php
for ($k = 0; $k < 4; ++$k) { // repeat the whole pass if any page fails to load
    try {
        for ($c = 0; $c < 10; ++$c) {
            $url = 'http://www.othersite.com/ex.aspx?symbol=' . $c;
            $dom = new DOMDocument();
            // loadHTMLFile() returns false on failure rather than throwing, so throw explicitly
            if (!$dom->loadHTMLFile($url)) {
                throw new Exception('loadHTMLFile() failed');
            }
            echo "start=" . $c;
            $table = $dom->getElementsByTagName('table')->item(2);
            if ($table === null) {
                throw new Exception('third table not found');
            }
            $data = $table->nodeValue;
            echo "Success to pass getElementsByTagName" . $c;
        }
        break; // no need to redo: all 10 pages loaded without error
    } catch (Exception $e) {
        echo "It found error at =" . $c;
    }
}
?>
ASKER
You mean use cURL to get the HTML file instead of loadHTMLFile(), right?
Yes, you would use the function to get the HTML file into a string variable in your script. Then you could store the string on your own server and process it there. There would be no further delay because of HTTP or remote server issues.
The script I posted above has an example of the use case (actually 2 examples). These start on line 85.
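As a rough illustration of the approach described in this reply (fetch the page into a string with a hard time limit, then parse the string locally), here is a minimal sketch. It is not the script from the accepted answer, which is not visible in this thread. fetch_html() and third_table_text() are hypothetical names, and the 10-second default is an arbitrary assumption. cURL's CURLOPT_CONNECTTIMEOUT and CURLOPT_TIMEOUT options are both expressed in seconds.

```php
<?php
// Sketch only: both helpers are hypothetical, not taken from the accepted answer.

// Download a page into a string, giving up after $timeoutSeconds.
function fetch_html(string $url, int $timeoutSeconds = 10): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);            // return the body instead of printing it
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeoutSeconds); // seconds allowed for connecting
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);        // seconds allowed for the whole transfer
    $body = curl_exec($ch);                                    // false on timeout or other failure
    return (is_string($body) && $body !== '') ? $body : null;
}

// Parse an HTML string that is already in memory and return the text of the third table.
function third_table_text(string $html): ?string
{
    if (trim($html) === '') {
        return null;                                 // loadHTML() rejects empty input
    }
    $previous = libxml_use_internal_errors(true);    // keep malformed-markup warnings quiet
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $table = $dom->getElementsByTagName('table')->item(2);
    return $table === null ? null : $table->nodeValue;
}

// Possible usage (URL pattern taken from the question):
// $html = fetch_html('http://www.othersite.com/ex.aspx?symbol=0', 10);
// $data = $html === null ? null : third_table_text($html);
```

Because the parsing step works on a local string, a slow remote server can only delay the fetch_html() call, and only for as long as the timeout allows.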
ASKER
Conclusion:
1- The timeout is working fine, and the error message appears once the timeout is reached.
2- The $arg array is passed, but the request always ends up on the wrong web page. It is probably caused by a URL encoding issue;
in my case I don't know why, so I need to build the query with http_build_query() as follows:
$url = array('Symbol'=> $c);
my_curl("http://www...../test.aspx?&".http_build_query($url),5, True);
and then it works fine.
3- By my estimate, execution is about 40% faster using the cURL method than loadHTMLFile().
Duncan
Taking out the $arg array code and function input
=====================================
function my_curl
( $url
/////////////////, $get_array=array()
, $timeout=3
, $error_report=TRUE
)
// PREPARE THE ARGUMENT STRING IF NEEDED
// $get_string = '';
// foreach ($get_array as $key => $val)
// {
// $get_string
// = $get_string
//. urlencode($key)
//. '='
//. urlencode($val)
//. '&';
// }
// $get_string = rtrim($get_string, '&');
// if (!empty($get_string)) $url .= '?' . $get_string;
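The http_build_query() workaround described in point 2 can be wrapped in a small helper. This is only a sketch: build_url() is a hypothetical name, and www.example.com stands in for the real host, which is elided in the post. It picks "?" or "&" depending on whether the base URL already has a query string, so the "?&" prefix is not needed.

```php
<?php
// Sketch only: build_url() is a hypothetical helper, not part of the answer's script.
function build_url(string $base, array $params): string
{
    // Pass the separator explicitly so the result does not depend on php.ini settings.
    $query = http_build_query($params, '', '&');
    if ($query === '') {
        return $base;                                   // nothing to append
    }
    $separator = (strpos($base, '?') === false) ? '?' : '&';
    return $base . $separator . $query;
}

// Possible usage with the modified three-argument my_curl() from this post:
// my_curl(build_url('http://www.example.com/test.aspx', array('Symbol' => $c)), 5, true);
```

http_build_query() URL-encodes both keys and values, which is the same job the commented-out urlencode() loop was doing by hand.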
ASKER
Thanks for your reply.
The code helps a lot.
ASKER
How do I use the function, and where should I put it?
Where can I set the timeout, for example to 60 seconds only? Does $timeout=3 mean 3 minutes?
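One way to read this question, sketched under assumptions: the body of my_curl() is not visible in this thread, but if it hands $timeout to cURL's CURLOPT_TIMEOUT option, the value is in seconds, so $timeout=3 would mean 3 seconds and passing 60 would give a 60-second limit. A function like this is typically defined near the top of the same script (or in a file pulled in with require_once) and then called from the loop. grab_pages() below is a hypothetical helper that takes the fetch function as a parameter, passes the timeout on each call, and retries only the pages that failed.

```php
<?php
// Sketch only: grab_pages() is hypothetical. $fetch is any function that accepts
// ($url, $timeoutSeconds) and returns the HTML string, or false/empty on failure.
function grab_pages(callable $fetch, array $urls, int $timeoutSeconds = 60, int $maxPasses = 4): array
{
    $pages = array();
    $pending = $urls;

    // Make up to $maxPasses passes, retrying only the URLs that have not loaded yet.
    for ($pass = 0; $pass < $maxPasses && $pending; ++$pass) {
        foreach ($pending as $index => $url) {
            $html = $fetch($url, $timeoutSeconds);
            if (is_string($html) && $html !== '') {
                $pages[$url] = $html;       // keep the downloaded page
                unset($pending[$index]);    // do not fetch it again
            }
        }
    }
    return $pages;                          // URLs that never loaded are simply absent
}

// Possible usage, assuming my_curl($url, $timeout) returns the page or false:
// $pages = grab_pages('my_curl', $urls, 60);
```

With this shape, one slow page costs at most the timeout per pass instead of blocking the whole run.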