MihaiAndrei
asked on
Calling pages and moving on without waiting for the page to complete
Hello
I created a crawler that parses huge amounts of XML files and inserts/updates data in a database. I have uploaded this crawler to multiple servers and I want to call all of them from the main project.
In the main project I have something like:
file_get_contents("http://crawler1.net");
file_get_contents("http://crawler2.net");
file_get_contents("http://crawler3.net");
It calls crawler1, waits until it finishes running, and only then calls crawler2.
What I want to achieve is to call crawler1, and then call crawler2 right away without waiting for crawler1 to finish loading.
I do not need any of the crawlers' output; they all work independently and insert/update data in the same database. All I need is to call them so they start crawling.
ASKER
Can you please explain what CURLOPT_NOBODY actually does?
CURLOPT_NOBODY excludes the body from the response, so no output is returned from the pages you open. The request is then sent as a HEAD request.
There is a detailed list of options at
http://www.php.net/manual/en/function.curl-setopt.php
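As a rough illustration of how this option might be used, here is a minimal sketch of a "trigger only" request. The helper names crawler_curl_options and start_crawler and the 5-second default are assumptions made for this example and do not come from the thread. Note that curl_exec still blocks until the server answers or the timeout expires, so CURLOPT_NOBODY on its own only avoids downloading the body.

```php
<?php
// Hypothetical helper: cURL options for a request whose output we do not need.
function crawler_curl_options(int $timeoutSeconds = 5): array
{
    return [
        CURLOPT_NOBODY         => true,            // send a HEAD request; skip the body
        CURLOPT_RETURNTRANSFER => true,            // never echo the response to our own output
        CURLOPT_TIMEOUT        => $timeoutSeconds, // upper limit for the whole request
    ];
}

// Hypothetical helper: call one crawler URL and report whether cURL succeeded.
function start_crawler(string $url, int $timeoutSeconds = 5): bool
{
    $ch = curl_init($url);
    curl_setopt_array($ch, crawler_curl_options($timeoutSeconds));

    // With CURLOPT_RETURNTRANSFER set, curl_exec returns false only on failure
    // (for example when the timeout is reached).
    return curl_exec($ch) !== false;
}
```

With these helpers, start_crawler("http://crawler1.net") would replace the corresponding file_get_contents call. A false return value may simply mean the timeout was hit, which is not necessarily an error in this scenario.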
ASKER
Well, I already tried using cURL, but without setting CURLOPT_NOBODY to true.
Even when I set the timeout to 5 seconds, the script still spent more than 5 seconds loading the page. This seemed logical, though: the timeout occurs only when the requested page does not respond for 5 seconds.
Will setting CURLOPT_NOBODY to true change this and make the script work as I intend?
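For reference, cURL has two separate timeout options: CURLOPT_CONNECTTIMEOUT limits only the connection phase, while CURLOPT_TIMEOUT limits the total time of the request. CURLOPT_NOBODY changes what is downloaded rather than how long the server takes to answer, so it is unlikely to make sequential calls return sooner by itself. One possible approach, sketched below, is to start all requests at the same time with PHP's curl_multi functions and give up after a short timeout. The helper names make_crawler_handles and start_all_crawlers and the timeout value are assumptions for illustration only.

```php
<?php
// Hypothetical helper: one cURL handle per crawler URL, all with the same options.
function make_crawler_handles(array $urls, int $timeoutSeconds = 5): array
{
    $handles = [];
    foreach ($urls as $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);     // do not echo responses
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds); // total time limit per request
        $handles[$url] = $ch;
    }
    return $handles;
}

// Hypothetical helper: send all requests concurrently instead of one after another.
function start_all_crawlers(array $urls, int $timeoutSeconds = 5): void
{
    $multi   = curl_multi_init();
    $handles = make_crawler_handles($urls, $timeoutSeconds);

    foreach ($handles as $ch) {
        curl_multi_add_handle($multi, $ch);
    }

    // Drive all transfers until each one has finished or timed out.
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running > 0) {
            curl_multi_select($multi, 1.0); // wait up to 1 second for activity
        }
    } while ($running > 0 && $status === CURLM_OK);

    foreach ($handles as $ch) {
        curl_multi_remove_handle($multi, $ch);
    }
    curl_multi_close($multi);
}
```

Calling start_all_crawlers(["http://crawler1.net", "http://crawler2.net", "http://crawler3.net"]) would then take roughly as long as the slowest request or the timeout, whichever comes first. This assumes the crawler scripts keep running after the caller disconnects, for example by calling ignore_user_abort(true) on the crawler side; whether they do is not stated in the thread.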