Calling pages and moving on without waiting for the page to complete

Posted on 2007-11-28
Last Modified: 2012-05-05
Hello

I created a crawler that parses huge amounts of XML files and inserts/updates data in a database. I have uploaded this crawler to multiple servers and I want to call all of them from the main project.

In the main project I have something like:
file_get_contents("http://crawler1.net");
file_get_contents("http://crawler2.net");
file_get_contents("http://crawler3.net");

It calls crawler1, waits until it finishes running, and then calls crawler2.

What I want to achieve is to call crawler1, and then call crawler2 right away without waiting for crawler1 to finish loading.

I do not need anything from what a crawler outputs. They all work independently and insert/update data in the same database. All I need is to call them so they start crawling.
Question by:MihaiAndrei

Expert Comment

by:steelseth12
ID: 20365060
This will call each crawler at 5-second intervals: each request is given up after 5 seconds, and then the next crawler is called.
// Call each crawler in turn, giving up on each request after 5 seconds.
$urls = array("http://crawler1.net", "http://crawler2.net", "http://crawler3.net");
foreach ($urls as $url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_NOBODY, true);  // HEAD request: do not download the body
    curl_setopt($ch, CURLOPT_TIMEOUT, 5);    // abort the request after 5 seconds
    curl_exec($ch);
    curl_close($ch);
}
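
The calls above still run one after another, so three crawlers can take up to 15 seconds to start. If the goal is to start them all at once, PHP's curl_multi functions can run the requests in parallel. The following is only a minimal sketch, assuming the cURL extension is installed; the helper name start_crawlers is hypothetical and not from this thread, and CURLOPT_RETURNTRANSFER is used here simply to keep any crawler output off the calling page.

```php
<?php
// Hypothetical helper: start all requests in parallel and give up on
// each one after $timeout seconds. Returns the number of requests started.
function start_crawlers(array $urls, int $timeout): int
{
    $mh = curl_multi_init();
    $handles = [];

    foreach ($urls as $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // do not print the response
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);    // abort after $timeout seconds
        curl_multi_add_handle($mh, $ch);
        $handles[] = $ch;
    }

    // Drive all transfers until each has finished or timed out.
    do {
        $status = curl_multi_exec($mh, $running);
        if ($running > 0) {
            curl_multi_select($mh, 1.0); // wait for activity instead of busy-looping
        }
    } while ($running > 0 && $status === CURLM_OK);

    foreach ($handles as $ch) {
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
    }
    curl_multi_close($mh);

    return count($handles);
}
```

Called as start_crawlers(array("http://crawler1.net", "http://crawler2.net", "http://crawler3.net"), 5), this should return after roughly 5 seconds in total rather than 5 seconds per crawler, because the timeouts run concurrently.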


Author Comment

by:MihaiAndrei
ID: 20372470
Can you please explain to me what CURLOPT_NOBODY actually does?

Expert Comment

by:steelseth12
ID: 20372658
CURLOPT_NOBODY tells cURL not to fetch the body of the pages you open. The request is sent as a HEAD request, so no page output is returned.

There is a detailed list of the options at

http://www.php.net/manual/en/function.curl-setopt.php

Author Comment

by:MihaiAndrei
ID: 20372674
Well, I already tried using cURL, but without setting CURLOPT_NOBODY to true.

Even though I set the timeout to 5 seconds, the script was still loading the page for more than 5 seconds. This was logical though: the timeout occurs only when the requested page does not respond for 5 seconds.

Will setting CURLOPT_NOBODY to true change this and make the script work as I intend?

Accepted Solution

by:
steelseth12 earned 500 total points
ID: 20372793
No, it should time out after 5 secs regardless of whether the page is responding or not.

Here is some code I used to test.
// Print the time before and after the request to see how long curl_exec() blocks.
print date("h:i:s")."============";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://localhost/test/sleep.php");
//curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 5); // give up after 5 seconds
curl_exec($ch);
curl_close($ch);
print date("h:i:s");
 
### sleep.php ####
 
<?php
	// Runs for about 16 seconds, then writes what it did to a file.
	$text = "";
	sleep(4);

	$text .= "text1\n";

	sleep(4);

	$text .= "text2\n";

	sleep(4);

	$text .= "text3\n";

	sleep(4);

	$text .= date("h:i:s");

	$h = fopen("test_execution.txt","w");

	fwrite($h,$text);
	fclose($h);
?>
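
The test above relies on the called script carrying on after the caller has timed out and disconnected. sleep.php sends no output, so it runs to the end. A real crawler that prints progress may be stopped by PHP once it notices the client has gone. As a hedged sketch (an assumption about how the crawlers are set up, not something confirmed in this thread), each crawler script could begin with:

```php
<?php
// Assumed to sit at the very top of each crawler script.
ignore_user_abort(true); // keep running after the calling script disconnects
set_time_limit(0);       // remove the execution time limit for a long crawl
```

Both are standard PHP functions. Whether set_time_limit() is permitted depends on the server configuration.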
