Solved

Calling pages and moving on without waiting for the page to complete

Posted on 2007-11-28
Last Modified: 2012-05-05
Hello

I created a crawler that parses huge amounts of XML files and inserts/updates data in a database. I have uploaded this crawler to multiple servers and I want to call all of them from the main project.

In the main project I have something like:
file_get_contents("http://crawler1.net");
file_get_contents("http://crawler2.net");
file_get_contents("http://crawler3.net");

It calls crawler1, waits until it finishes running, and then calls crawler2.

What I want to achieve is to call crawler1, and then call crawler2 right away without waiting for crawler1 to finish loading.

I do not need anything that a crawler outputs. The crawlers all work independently and insert/update data in the same database. All I need is to call them so they start crawling.
Question by:MihaiAndrei

Expert Comment

by:steelseth12
ID: 20365060
This will call each crawler in turn and give up on each request after 5 seconds, so the crawlers are started at roughly 5-second intervals.
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://crawler1.net");
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 5);
curl_exec($ch);
curl_close($ch);

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://crawler2.net");
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 5);
curl_exec($ch);
curl_close($ch);

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://crawler3.net");
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 5);
curl_exec($ch);
curl_close($ch);
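
Because each request above can block for up to 5 seconds before the next one starts, three crawlers may take about 15 seconds to trigger. As a rough sketch that is not part of the original answer, PHP's curl_multi functions could start all the requests at once. The function name triggerCrawlers is made up for illustration, and the cURL extension is assumed to be loaded.

```php
<?php
// Hypothetical helper (illustration only): starts one request per URL,
// all at the same time, and stops waiting after $timeoutSeconds.
function triggerCrawlers(array $urls, int $timeoutSeconds = 5): int
{
    $multi = curl_multi_init();
    $handles = [];

    foreach ($urls as $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_NOBODY, true);             // do not download the body
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds); // cap on the whole transfer
        curl_multi_add_handle($multi, $ch);
        $handles[] = $ch;
    }

    // Drive all transfers until each one has finished or timed out.
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running > 0 && curl_multi_select($multi, 1.0) === -1) {
            usleep(100000); // select failed; pause briefly to avoid a busy loop
        }
    } while ($running > 0 && $status === CURLM_OK);

    foreach ($handles as $ch) {
        curl_multi_remove_handle($multi, $ch);
    }
    curl_multi_close($multi);

    return count($handles); // number of requests that were started
}
```

Called as triggerCrawlers(["http://crawler1.net", "http://crawler2.net", "http://crawler3.net"]), this should return after roughly one timeout period rather than three, since the timeouts run in parallel.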


Author Comment

by:MihaiAndrei
ID: 20372470
Can you please explain what CURLOPT_NOBODY actually does?

Expert Comment

by:steelseth12
ID: 20372658
CURLOPT_NOBODY excludes the body from the output, so nothing is returned from the pages you open. For HTTP, the request is sent as a HEAD request.

There is a detailed list of options at

http://www.php.net/manual/en/function.curl-setopt.php

Author Comment

by:MihaiAndrei
ID: 20372674
Well, I already tried using cURL, but without setting CURLOPT_NOBODY to true.

Even though I set the timeout to 5 seconds, the script still spent more than 5 seconds loading the page. That seemed logical to me, since I assumed the timeout only occurs when the requested page does not respond for 5 seconds.

Will setting CURLOPT_NOBODY to true change this and make the script work as I intend?

Accepted Solution

by:
steelseth12 earned 500 total points
ID: 20372793
No, it should time out after 5 seconds regardless of whether the page is responding or not. CURLOPT_TIMEOUT is the maximum time the whole request is allowed to take.

Here is some code I used to test.
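// Print the time before and after the request to see how long the caller waits.
print date("h:i:s")."============";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://localhost/test/sleep.php");
//curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 5);
curl_exec($ch);
curl_close($ch);
print date("h:i:s");

### sleep.php ####

<?php
	// Runs for about 16 seconds in total, longer than the caller's 5-second timeout.
	$text = "";
	sleep(4);
	$text .= "text1\n";
	sleep(4);
	$text .= "text2\n";
	sleep(4);
	$text .= "text3\n";
	sleep(4);
	$text .= date("h:i:s");

	// If this file appears, the script kept running after the caller gave up.
	$h = fopen("test_execution.txt", "w");
	fwrite($h, $text);
	fclose($h);
?>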
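
Note that sleep.php sends no output before it finishes. PHP only notices that the caller has disconnected when it tries to send output, so a crawler that prints progress could be stopped once the caller times out. A minimal sketch of a guard, assuming it is placed at the very top of each crawler script (the crawling code itself is left out):

```php
<?php
// Assumed to be the first lines of each crawler script (illustration only).
ignore_user_abort(true); // keep running after the calling script disconnects
set_time_limit(0);       // remove the execution time limit for this request

// ... parse the XML files and insert/update the database here ...
```

Both functions are standard PHP. Whether removing the time limit is acceptable depends on the server's configuration and on how long a crawl is expected to run.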
