Solved

Calling pages and moving on without waiting for the page to complete

Posted on 2007-11-28
Last Modified: 2012-05-05
Hello

I created a crawler that parses huge amounts of XML files and inserts/updates data in a database. I have uploaded this crawler to multiple servers and I want to call all of them from the main project.

In the main project I have something like:
file_get_contents("http://crawler1.net");
file_get_contents("http://crawler2.net");
file_get_contents("http://crawler3.net");

It calls crawler1, waits until it finishes running, and then calls crawler2.

What I want to achieve is to call crawler1, and then call crawler2 right away without waiting for crawler1 to finish loading.

I do not need any of the output from the crawlers; they all work independently and insert/update data in the same database. All I need is to call them so they start crawling.
Question by:MihaiAndrei
LVL 20

Expert Comment

by:steelseth12
ID: 20365060
This will call each crawler in turn and give up on each request after at most 5 seconds.
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://crawler1.net");
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 5);
curl_exec($ch);
curl_close($ch);
 
 
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://crawler2.net");
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 5);
curl_exec($ch);
curl_close($ch);
 
 
 
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://crawler3.net");
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 5);
curl_exec($ch);
curl_close($ch);
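
The three blocks above still run one after another, so the caller can wait up to about 15 seconds in total. As a minimal sketch, assuming the cURL extension is available, the same requests could be started together with PHP's curl_multi functions so that the overall wait is bounded by a single timeout. The helper name fire_all is hypothetical and does not come from this thread.

```php
<?php
// Hypothetical helper: start every request at once and stop waiting
// after $timeout seconds. Returns the number of requests started.
function fire_all(array $urls, int $timeout = 5): int
{
    $mh = curl_multi_init();
    $handles = [];

    foreach ($urls as $url) {
        $ch = curl_init($url);
        // Keep any response out of the caller's own output.
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        // Maximum time allowed for the whole transfer, in seconds.
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
        curl_multi_add_handle($mh, $ch);
        $handles[] = $ch;
    }

    // Drive all transfers until each one has finished or timed out.
    do {
        $status = curl_multi_exec($mh, $running);
        if ($running > 0) {
            curl_multi_select($mh, 1.0); // wait for activity, up to 1 second
        }
    } while ($running > 0 && $status === CURLM_OK);

    foreach ($handles as $ch) {
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
    }
    curl_multi_close($mh);

    return count($handles);
}
```

Calling fire_all(array("http://crawler1.net", "http://crawler2.net", "http://crawler3.net")) would start all three requests together and return after roughly 5 seconds at most. Whether each crawler keeps running once the caller gives up depends on how the crawler script handles a dropped connection.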

 

Author Comment

by:MihaiAndrei
ID: 20372470
Can you please explain to me what CURLOPT_NOBODY actually does?
 
LVL 20

Expert Comment

by:steelseth12
ID: 20372658
CURLOPT_NOBODY excludes the response body from the output, so nothing from the pages you open is returned. For HTTP this means the request is sent as a HEAD request.

There is a detailed list with options at

http://www.php.net/manual/en/function.curl-setopt.php
 

Author Comment

by:MihaiAndrei
ID: 20372674
Well, I already tried using cURL, but without setting CURLOPT_NOBODY to true.

Even when I set the timeout to 5 seconds, the script still spent more than 5 seconds loading the page. This was logical though: the timeout occurs only when the requested page does not respond for 5 seconds.

Will setting CURLOPT_NOBODY to true change this and make the script work as I intend?
 
LVL 20

Accepted Solution

by:
steelseth12 earned 500 total points
ID: 20372793
No, it should time out after 5 seconds whether or not the page is responding.

Here is some code I used to test.
print date("h:i:s")."============";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://localhost/test/sleep.php");
//curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 5);
curl_exec($ch);
curl_close($ch);
print date("h:i:s");
 
### sleep.php ####
 
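<?php
	// Simulate a long-running page: about 16 seconds of work in total.
	$text = "";
	sleep(4);

	$text .= "text1\n";

	sleep(4);

	$text .= "text2\n";

	sleep(4);

	$text .= "text3\n";

	sleep(4);

	$text .= date("h:i:s");

	// Write the result to a file so you can check that the script ran to the end.
	$h = fopen("test_execution.txt","w");

	fwrite($h,$text);
	fclose($h);
?>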
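
The test above relies on sleep.php running to the end even though the caller stopped waiting after 5 seconds. As far as I know, PHP only notices a dropped connection when the script tries to send output, and by default it aborts the script at that point. Here is a minimal sketch of what could go at the top of each crawler so that it keeps going regardless, using the built-in ignore_user_abort() and set_time_limit(). The helper name keep_running_after_disconnect is hypothetical.

```php
<?php
// Hypothetical helper for the top of each crawler script.
function keep_running_after_disconnect(): void
{
    // Do not abort the script when the calling client disconnects.
    ignore_user_abort(true);
    // Remove the execution time limit so a long crawl is not cut short.
    set_time_limit(0);
}

keep_running_after_disconnect();

// ... the crawling code would follow here ...
```

With this in place, the caller's 5 second timeout only decides how long the main project waits, not how long the crawler is allowed to run.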
