Solved

Calling pages and moving on without waiting for the page to complete

Posted on 2007-11-28
Last Modified: 2012-05-05
Hello

I created a crawler that parses huge numbers of XML files and inserts or updates data in a database. I have uploaded this crawler to multiple servers, and I want to call all of them from the main project.

In the main project I have something like:
file_get_contents("http://crawler1.net");
file_get_contents("http://crawler2.net");
file_get_contents("http://crawler3.net");

It calls crawler1, waits until it finishes running, and then calls crawler2.

What I want is to call crawler1 and then call crawler2 right away, without waiting for crawler1 to finish loading.

I do not need anything the crawlers output; they all work independently and insert/update data in the same database. All I need is to call them so they start crawling.
Question by:MihaiAndrei

Expert Comment

by:steelseth12
ID: 20365060
This will call each crawler in turn and give up waiting on each one after at most 5 seconds.
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://crawler1.net");
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 5);
curl_exec($ch);
curl_close($ch);

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://crawler2.net");
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 5);
curl_exec($ch);
curl_close($ch);

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://crawler3.net");
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 5);
curl_exec($ch);
curl_close($ch);
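
The blocks above still run one after another, so the caller can wait up to 5 seconds per crawler. A minimal sketch of one alternative, assuming the cURL extension and its multi interface are available: `trigger_options()` and `fire_all()` are hypothetical helper names, not something from this thread, and they start all requests together so the total wait is roughly one timeout instead of one per crawler.

```php
<?php
// Hypothetical helper: the same options as above, plus RETURNTRANSFER so nothing is echoed.
function trigger_options(int $timeout): array
{
    return [
        CURLOPT_NOBODY         => true,     // HEAD request: no response body is transferred
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_TIMEOUT        => $timeout, // give up on a transfer after this many seconds
    ];
}

// Hypothetical helper: start every URL together, then wait until each one
// answers or times out. Returns the HTTP status per URL (0 when no response
// arrived in time).
function fire_all(array $urls, int $timeout = 5): array
{
    $mh = curl_multi_init();
    $handles = [];
    foreach ($urls as $url) {
        $ch = curl_init($url);
        curl_setopt_array($ch, trigger_options($timeout));
        curl_multi_add_handle($mh, $ch);
        $handles[$url] = $ch;
    }

    // Drive all transfers at once until none is still running.
    do {
        $status = curl_multi_exec($mh, $running);
        if ($running > 0 && curl_multi_select($mh, 1.0) === -1) {
            usleep(100000); // select failed; back off briefly instead of spinning
        }
    } while ($running > 0 && $status === CURLM_OK);

    $codes = [];
    foreach ($handles as $url => $ch) {
        $codes[$url] = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        curl_multi_remove_handle($mh, $ch);
    }
    curl_multi_close($mh);
    return $codes;
}
```

A call such as `fire_all(["http://crawler1.net", "http://crawler2.net", "http://crawler3.net"]);` would then trigger all three crawlers in one pass. Whether each crawler keeps running after the caller stops waiting depends on the crawler's own server settings.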


Author Comment

by:MihaiAndrei
ID: 20372470
Can you please explain what CURLOPT_NOBODY actually does?

Expert Comment

by:steelseth12
ID: 20372658
CURLOPT_NOBODY makes cURL send a HEAD request, so the body of the page you open is never downloaded or output.

There is a detailed list with options at

http://www.php.net/manual/en/function.curl-setopt.php

Author Comment

by:MihaiAndrei
ID: 20372674
Well, I had already tried cURL, but without setting CURLOPT_NOBODY to true.

Even with the timeout set to 5 seconds, the script still spent more than 5 seconds loading the page. This seemed logical, though: as I understand it, a timeout occurs only when the requested page does not respond for 5 seconds.

Will setting CURLOPT_NOBODY to true change this and make the script work as I intend?

Accepted Solution

by:steelseth12
ID: 20372793
No, it should time out after 5 seconds whether or not the page is responding.

Here is some code I used to test.
print date("h:i:s")."============";

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://localhost/test/sleep.php");
//curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 5);
curl_exec($ch);
curl_close($ch);

print date("h:i:s");
 

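### sleep.php ####

<?php
	// Build the text over about 16 seconds, well past the caller's 5-second timeout.
	$text = "";
	sleep(4);
	$text .= "text1\n";
	sleep(4);
	$text .= "text2\n";
	sleep(4);
	$text .= "text3\n";
	sleep(4);
	$text .= date("h:i:s");

	// If this file contains all three lines and a timestamp, the script ran to completion.
	$h = fopen("test_execution.txt","w");
	fwrite($h,$text);
	fclose($h);
?>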
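
Because the caller hangs up after 5 seconds, a crawler that sends output while it works may need to be told to ignore the dropped connection. A minimal sketch of the first lines of a crawler's entry script, assuming the host allows scripts to change these settings; `run_crawler()` is a hypothetical placeholder for the real crawling code, not something from this thread.

```php
<?php
// Keep running even if the caller disconnects before the crawl finishes.
ignore_user_abort(true);
// Remove the execution time limit for long crawls (a host may forbid this).
set_time_limit(0);

// Hypothetical placeholder for the real work: parse XML, insert/update rows.
function run_crawler(): string
{
    return "done";
}

$result = run_crawler();
```

The sleep.php test above writes to a file rather than to the response, which is one reason it can be used to check whether the script finished after the caller timed out.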
