# Check PageRank with multiple cURL handles in parallel

Posted on 2009-05-18
Hi E's, the code snippet below shows the function I use to check the PageRank of a page. Every script I have needs to check about ten URLs, and checking them one by one takes a lot of time.
I know cURL can run multiple handles in parallel with "curl_multi_init", but I don't know how to use it, so I need an expert's help.

Basically, what I want to do is:
- Read the URLs one by one from the database
- Call the function for them (in parallel)
- Save the results in the database

How do I do it?

Regards, JC

``````php
<?php

// Emulates an unsigned (zero-fill) right shift of $a by $b bits.
function _zeroFill($a, $b){
    $z = hexdec('80000000');
    if ($z & $a){
        $a = ($a>>1);
        $a &= (~$z);
        $a |= 0x40000000;
        $a = ($a>>($b-1));
    }else
        $a = ($a>>$b);
    return $a;
}

// One mixing round of the hash over the three state words.
function _mix($a,$b,$c){
    $a -= $b; $a -= $c; $a ^= (_zeroFill($c,13));
    $b -= $c; $b -= $a; $b ^= ($a<<8);
    $c -= $a; $c -= $b; $c ^= (_zeroFill($b,13));
    $a -= $b; $a -= $c; $a ^= (_zeroFill($c,12));
    $b -= $c; $b -= $a; $b ^= ($a<<16);
    $c -= $a; $c -= $b; $c ^= (_zeroFill($b,5));
    $a -= $b; $a -= $c; $a ^= (_zeroFill($c,3));
    $b -= $c; $b -= $a; $b ^= ($a<<10);
    $c -= $a; $c -= $b; $c ^= (_zeroFill($b,15));
    return array($a,$b,$c);
}

// The header of this function was missing from the posted snippet. The name
// GoogleCH and the default seed are assumptions; the body is as posted.
function GoogleCH($url, $length = null, $init = 0xE6359A60){
    if(is_null($length))
        $length = sizeof($url);
    $a = $b = 0x9E3779B9;
    $c = $init;
    $k = 0;
    $len = $length;
    while($len >= 12){
        $a += ($url[$k + 0] + ($url[$k + 1] << 8) + ($url[$k + 2] << 16) + ($url[$k + 3] << 24));
        $b += ($url[$k + 4] + ($url[$k + 5] << 8) + ($url[$k + 6] << 16) + ($url[$k + 7] << 24));
        $c += ($url[$k + 8] + ($url[$k + 9] << 8) + ($url[$k + 10] << 16) + ($url[$k + 11] << 24));
        $_mix = _mix($a,$b,$c);
        $a = $_mix[0]; $b = $_mix[1]; $c = $_mix[2];
        $k += 12;
        $len -= 12;
    }
    $c += $length;
    // Every case intentionally falls through to the next one.
    switch($len){
        case 11: $c += ($url[$k + 10] << 24);
        case 10: $c += ($url[$k + 9] << 16);
        case 9 : $c += ($url[$k + 8] << 8);
        case 8 : $b += ($url[$k + 7] << 24);
        case 7 : $b += ($url[$k + 6] << 16);
        case 6 : $b += ($url[$k + 5] << 8);
        case 5 : $b += ($url[$k + 4]);
        case 4 : $a += ($url[$k + 3] << 24);
        case 3 : $a += ($url[$k + 2] << 16);
        case 2 : $a += ($url[$k + 1] << 8);
        case 1 : $a += ($url[$k + 0]);
    }
    $_mix = _mix($a,$b,$c);
    return $_mix[2];
}

// Converts a string into an array of its byte values.
function _strord($string){
    $result = array();
    for($i = 0;$i < strlen($string);$i++)
        $result[$i] = ord($string[$i]);
    return $result;
}

function getPageRank($url){
    $pagerank = -1;
    // $ch was never set in the posted snippet; this line is an assumption
    // about how the checksum is meant to be built from the hash above.
    $ch = "6" . GoogleCH(_strord("info:" . $url));
    $fp = fsockopen("www.google.com", 80, $errno, $errstr, 30);
    if($fp){
        $out = "GET /search?client=navclient-auto&ch=" . $ch . "&features=Rank&q=info:" . $url . " HTTP/1.1\r\n";
        $out .= "Host: www.google.com\r\n"; // HTTP/1.1 requires a Host header
        $out .= "Connection: Close\r\n\r\n";
        fwrite($fp, $out);
        while (!feof($fp)){
            $data = fgets($fp, 128);
            $pos = strpos($data, "Rank_");
            if($pos !== false)
                $pagerank = substr($data, $pos + 9);
        }
        fclose($fp);
    }
    return $pagerank;
}

//////////////////////////////////////////
// CALL URLS FROM THE DATABASE (one $pagerankurl per row)
$pr = getPageRank($pagerankurl);
// SAVE $pr IN DATABASE

?>
``````
Question by: Pedro Chagas
1 Comment

Accepted Solution

BrianMM earned 1500 total points
ID: 24420552
Hi,

Recently(ish) I implemented a web scraping tool following some hints from http://www.developertutorials.com/blog/php/parallel-web-scraping-in-php-curl-multi-functions-375/, which runs cURL requests in parallel.

Check it out and see if it gives you some pointers.

If not, let me know and I'll see what can be done when I have more time to spend.
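
To give a rough idea of the shape of it, here is a minimal sketch of the general curl_multi pattern. It is not taken from the linked post, and it is not a drop-in replacement for your script. The names fetchAllParallel() and parsePageRank() are hypothetical helpers I made up for illustration. The parser assumes the response contains a line such as "Rank_1:1:5", which is what the "+ 9" offset in your getPageRank() suggests. You would build one request URL per page the same way your function builds its GET line, and the database read and write stay as you already have them.

```php
<?php
// Sketch only: fetch several URLs at once with the curl_multi_* functions.

/**
 * Fetches every URL in $urls in parallel.
 * Returns an array with the same keys as $urls; each value is the
 * response body, or null when the request did not return HTTP 200.
 */
function fetchAllParallel(array $urls, int $timeout = 30): array
{
    $multi = curl_multi_init();
    $handles = [];

    // One easy handle per URL, all attached to the same multi handle.
    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // keep the body instead of printing it
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
        curl_multi_add_handle($multi, $ch);
        $handles[$key] = $ch;
    }

    // Drive all transfers until none are still running.
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running > 0 && curl_multi_select($multi, 1.0) === -1) {
            usleep(1000); // select can fail on some systems; avoid a busy loop
        }
    } while ($running > 0 && $status === CURLM_OK);

    // Collect the bodies and release the handles.
    $bodies = [];
    foreach ($handles as $key => $ch) {
        $ok = curl_getinfo($ch, CURLINFO_HTTP_CODE) === 200;
        $bodies[$key] = $ok ? curl_multi_getcontent($ch) : null;
        curl_multi_remove_handle($multi, $ch);
        curl_close($ch);
    }
    curl_multi_close($multi);

    return $bodies;
}

/**
 * Extracts the rank from a response body, or returns -1 if no
 * "Rank_" marker is found (the same "unknown" value as in the question).
 * Assumes the format "Rank_x:y:N", i.e. the number starts 9 characters in.
 */
function parsePageRank(string $body): int
{
    $pos = strpos($body, 'Rank_');
    if ($pos === false) {
        return -1;
    }
    return (int) substr($body, $pos + 9);
}

// Hypothetical usage: $requestUrls maps each page to its lookup URL.
// foreach (fetchAllParallel($requestUrls) as $page => $body) {
//     $pr = $body === null ? -1 : parsePageRank($body);
//     // SAVE $pr FOR $page IN DATABASE
// }
```

Keeping the parsing separate from the fetching means all ten requests are in flight together, and the database is only touched once the responses are back.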

