Solved

Another regular expression challenge

Posted on 2010-09-04
Last Modified: 2012-05-10
Hi,
I have a script where part of it pulls a single piece of data off of this page

http://siteexplorer.search.yahoo.com/search;_ylt=A0oG7za5MoNMHSEBnYjbl8kF?p=http%3A%2F%2Fwww.wausaulaw.com%2F&y=Explore+URL&fr=sfp

The piece of data I am trying to pull is the    54   within    Inlinks (54)

The html around it looks like this:
>Inlinks (54)<i class="tl"></i>

The regular expression I am using is:
 preg_match('\/>Inlinks ((.*))<i class="tl">\', $data, $p);
which is on line 78 of the data.class.php script attached (which is one of 2 scripts that makes this work).

Can someone please help me with a regular expression that will pull the 54 in the example above? Right now it's pulling 157 and I have no idea where that is coming from.

To see the script in action, please go to http://prontopage.net/localsearch/bulkcheck.php and enter
wausaulaw.com to see the 157 result under Yahoo Backlinks.

Help would be greatly appreciated...
.
//backcheck.php

<?php include('data.class.php'); ?>
<html>
<head>
<title>SEO checker script</title>

<script language=Javascript>

function textCounter(field, countfield, maxlimit) {
if (field.value.length > maxlimit) // if too long...trim it!
field.value = field.value.substring(0, maxlimit);
// otherwise, update 'characters left' counter
else
countfield.value = maxlimit - field.value.length;
}



</script>
</head>
<body>
<center>

<br><br>

<form name="form1" method="post" action="<?php $_SERVER['PHP_SELF']; ?>" id="form1">
    <label><strong>Urls:</strong></label>
    &nbsp;&nbsp;&nbsp;<textarea cols="35" rows="4" name="urls" id="urls" onKeyDown="textCounter(this.form.urls,this.form.remLen,125);" onKeyUp="textCounter(this.form.urls,this.form.remLen,125);"></textarea>
    <br><br>
    .<br>
    <input type="submit" name="button" id="button" value="Submit" />
    <input type="hidden" name="submitted" value="TRUE" />
<input readonly type=hidden name=remLen size=3 maxlength=3 value="125">
</form>

<br><br><br><br><br>

<!-- Begin Results -->
<table border="1">
<tr><td>Screenshot</td><td>Website Url</td> <td>Pagerank</td> <td>Dmoz</td> <td>Yahoo Directory</td> <td> Yahoo Backlinks </td> <td>Domain Age</td></tr>
<?php
        if ($results) {
    ?>
    <?php
            foreach ($results as $result) {
    ?>




<?php
//$compete = $result['url'];
$screenshot = $result['thumb'];
$url = $result['url'];
$pagerank = $result['pagerank'];
$alexarank = $result['alexarank'];
$dmoz = ($result['dmoz']) ? 'Yes' : 'No';
$yahoodir = ($result['yahooDirectory']) ? 'Yes' : 'No';
$googlebl = $result['backlinksGoogle'];
$yahoobl = $result['backlinksYahoo'];
$bingbl = $result['backlinksBing'];
$askbl = $result['backlinksAsk'];
$altavista = $result['altavista'];
$alltheweb = $result['alltheweb'];
$estibot = $result['estibot'];
$domainage = $result['age'];

$a = array("<img src="."$screenshot","$url", "$pagerank","$dmoz","$yahoodir", "$yahoobl","$domainage");   

    foreach($a as $row){
       echo "<td> $row </td>";
    }
   ?>

<?php
            }

        }
    ?>
       </table>
    <br>



<br />
<br>
<br>
<br>
</center>
</body>
</html>

------------------------------
//data.class.php

<?php

    class pagerank {
        
        var $url;
        
        function pagerank ($url) {
            $this->url = parse_url('http://' . ereg_replace('^http://', '', $url));
            $this->url['full'] = 'http://' . ereg_replace('^http://', '', $url);
        }

        function getPage ($url) {
            if (function_exists('curl_init')) {
                $ch = curl_init($url);
                curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
                @curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
                curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);
                curl_setopt($ch, CURLOPT_REFERER, 'http://www.google.com/search?hl=en&q=google&btnG=Google+Search');
                return curl_exec($ch);
            } else {
                return file_get_contents($url);
            }
        }

        function getPagerank () {
            $url = 'info:' . $this->url['host'];
            $checksum = $this->checksum($this->strord($url));
            $url = "http://www.google.com/search?client=navclient-auto&ch=6$checksum&features=Rank&q=$url";
         $drbug_url = "http://www.google.com/search?client=navclient-auto&ch=6$checksum&features=Rank&q=$url";
            $data = $this->getPage($url);
            preg_match('#Rank_[0-9]:[0-9]:([0-9]+){1,}#si', $data, $p);
            $value = ($p[1]) ? $p[1] : 0;
            return $value;
        }

        function getAlexaRank ($url) {
        $xml = simplexml_load_file('http://data.alexa.com/data?cli=10&dat=s&url=' . $url);
        return $xml->SD->POPULARITY['TEXT'];
        }
        
        function getDmoz () {
            $url = ereg_replace('^www\.', '', $this->url['host']);
            $url = "http://search.dmoz.org/cgi-bin/search?search=$url";
            $data = $this->getPage($url);
            if (ereg('<center>No <b><a href="http://dmoz\.org/">Open Directory Project</a></b> results found</center>', $data)) {
                $value = false;
            } else {
                $value = true;
            }
            return $value;
        }
        
        function getYahooDirectory () {
            $url = ereg_replace('^www\.', '', $this->url['host']);
            $url = "http://search.yahoo.com/search/dir?p=$url";
            $data = $this->getPage($url);
            if (ereg('No Directory Search results were found\.', $data)) {
                $value = false;
            } else {
                $value = true;
            }
            return $value;
        }
        
//        function getBacklinksGoogle () {
//            $url = $this->url['host'];
//            $url = 'http://www.google.com/search?q=' . urlencode($url);
//            $data = $this->getPage($url);
//            preg_match('/of about \<b\>([0-9\,]+)\<\/b\>/si', $data, $p);
//            $value = ($p[1]) ? number_format($this->toInt($p[1])) : 0;
//            return $value;
//        }

        function getBacklinksYahoo () {
            $url = $this->url['host'];
            $url = 'https://siteexplorer.search.yahoo.com/advsearch?p=http%3A%2F%2F' . urlencode("http://$url")."&bwm=i&bwmo=d&bwmf=u";
			$data = $this->getPage($url);
            preg_match('\/>Inlinks ((.*))<i class="tl">\', $data, $p);
			$value = ($p[1]) ? number_format($this->toInt($p[1])) : 0;
            return $value;
         }
		 
		 
			 
         
//         function getBacklinksBing () {
//            $url = $this->url['host'];
//            $url = 'http://www.bing.com/search?q=' . urlencode($url);
//            $data = $this->getPage($url);
//            preg_match('#of ([0-9\,]+) results</span>#si', $data, $p);
//            $value = ($p[1]) ? number_format($this->toInt($p[1])) : 0;
//            return $value;
//         }
         
//         
//        
//        function getResultsAltaVista () {
//            $url = $this->url['host'];
//            $url = 'http://www.altavista.com/web/results?q=' . urlencode($url);
//            $data = $this->getPage($url);
//            preg_match('#AltaVista found ([0-9,]+){1,} results#si', $data, $p);
//            $value = ($p[1]) ? number_format($this->toInt($p[1])) : 0;
//            return $value;
//        }
//        
//        function getResultsAllTheWeb () {
//            $url = $this->url['host'];
//            $url = 'http://www.alltheweb.com/search?q=' . urlencode($url);
//            $data = $this->getPage($url);
//            preg_match('#<span class="ofSoMany">([0-9,]+){1,}</span>#si', $data, $p);
//            $value = ($p[1]) ? number_format($this->toInt($p[1])) : 0;
//            return $value;
//        }
//      
//              function getBacklinksAsk () {
//            $url = $this->url['host'];
//            $url = 'http://www.ask.com/web?q=' . urlencode($url);
//            $data = $this->getPage($url);
//            preg_match('#of ([0-9,]+){1,} for#si', $data, $p);
//            $value = ($p[1]) ? number_format($this->toInt($p[1])) : 0;
//            return $value;
//        }
//      
//              function getValueEstibot () {
//            $url = $this->url['host'];
//            $url = "http://estibot.com/results.php?domain=$url";
//            $data = $this->getPage($url);
//            preg_match('#<span class="bold_text">
//USD ([a-z0-9,]+)   </span>#si', $data, $p);
//            $value = ($p[1]) ? number_format($this->toInt($p[1])) : 0;
//            return $value;
//        }

        function getAge () {
            $url = ereg_replace('^www\.', '', $this->url['host']);
            $url = "http://who.is/whois/$url";
            $data = $this->getPage($url);
            preg_match('#Creation Date: ([a-z0-9-]+)#si', $data, $p);
            if ($p[1]) {
                $time = time() - strtotime($p[1]);
                $years = round($time / 31556926);
                $days = round(($time % 31556926) / 86400);
                $value = "$years years, $days days";
            } else {
                $value = 'Unknown';
            }
            return $value;
        }
        
        function toInt ($string) {
            return preg_replace('#[^0-9]#si', '', $string);
        }
        
        function to_int_32 (&$x) {
            $z = hexdec(80000000);
            $y = (int) $x;
            if($y ==- $z && $x <- $z){
                $y = (int) ((-1) * $x);
                $y = (-1) * $y;
            }
            $x = $y;
        }

        function zero_fill ($a, $b) {
            $z = hexdec(80000000);
            if ($z & $a) {
                $a = ($a >> 1);
                $a &= (~$z);
                $a |= 0x40000000;
                $a = ($a >> ($b - 1));
            } else {
                $a = ($a >> $b);
            }
            return $a;
        }

        function mix ($a, $b, $c) {
            $a -= $b; $a -= $c; $this->to_int_32($a); $a = (int)($a ^ ($this->zero_fill($c, 13)));
            $b -= $c; $b -= $a; $this->to_int_32($b); $b = (int)($b ^ ($a << 8));
            $c -= $a; $c -= $b; $this->to_int_32($c); $c = (int)($c ^ ($this->zero_fill($b, 13)));
            $a -= $b; $a -= $c; $this->to_int_32($a); $a = (int)($a ^ ($this->zero_fill($c, 12)));
            $b -= $c; $b -= $a; $this->to_int_32($b); $b = (int)($b ^ ($a << 16));
            $c -= $a; $c -= $b; $this->to_int_32($c); $c = (int)($c ^ ($this->zero_fill($b, 5)));
            $a -= $b; $a -= $c; $this->to_int_32($a); $a = (int)($a ^ ($this->zero_fill($c, 3)));
            $b -= $c; $b -= $a; $this->to_int_32($b); $b = (int)($b ^ ($a << 10));
            $c -= $a; $c -= $b; $this->to_int_32($c); $c = (int)($c ^ ($this->zero_fill($b, 15)));
            return array($a,$b,$c);
        }

        function checksum ($url, $length = null, $init = 0xE6359A60) {
            if (is_null($length)) {
                $length = sizeof($url);
            }
            $a = $b = 0x9E3779B9;
            $c = $init;
            $k = 0;
            $len = $length;
            while($len >= 12) {
                $a += ($url[$k + 0] + ($url[$k + 1] << 8) + ($url[$k + 2] << 16) + ($url[$k +3] << 24));
                $b += ($url[$k + 4] + ($url[$k + 5] << 8) + ($url[$k + 6] << 16) + ($url[$k +7] << 24));
                $c += ($url[$k + 8] + ($url[$k + 9] << 8) + ($url[$k + 10] << 16) + ($url[$k +11] << 24));
                $mix = $this->mix($a, $b, $c);
                $a = $mix[0]; $b = $mix[1]; $c = $mix[2];
                $k += 12;
                $len -= 12;
            }
            $c += $length;
            switch($len) {
                case 11: $c += ($url[$k + 10] << 24);
                case 10: $c += ($url[$k + 9] << 16);
                case 9 : $c += ($url[$k + 8] << 8);
                case 8 : $b += ($url[$k + 7] << 24);
                case 7 : $b += ($url[$k + 6] << 16);
                case 6 : $b += ($url[$k + 5] << 8);
                case 5 : $b += ($url[$k + 4]);
                case 4 : $a += ($url[$k + 3] << 24);
                case 3 : $a += ($url[$k + 2] << 16);
                case 2 : $a += ($url[$k + 1] << 8);
                case 1 : $a += ($url[$k + 0]);
            }
            $mix = $this->mix($a, $b, $c);
            return $mix[2];
        }

        function strord ($string) {
            for($i = 0; $i < strlen($string); $i++) {
                $result[$i] = ord($string{$i});
            }
            return $result;
        }

    }
$options = array(
        'pagerank' => true,
        'dmoz' => true,
        'yahooDirectory' => true,
        'backlinksYahoo' => true,
        'backlinksGoogle' => true,
      'backlinksBing' => true,
      'backlinksAsk' => true,
        'altavista' => true,
        'alltheweb' => true,
        'alexarank' => true,
        'age' => true,
      'estibot' => true,
        'thumb' => true
    );
    if ($_POST['urls']) {
        $rep=array("\r"," ");
        $_POST['urls']=str_replace($rep,'',$_POST['urls']);
        $urls = split("\n", $_POST['urls']);
        $results = array();
        foreach ($urls as $url) {
            if(!empty($url)||trim($url!='')){
                $data = new pagerank(trim($url));
                $results[] = array(
                    'url' => $data->url['host'],
                    'pagerank' => $data->getPagerank(),
                    'dmoz' => $data->getDmoz(),
                    'yahooDirectory' => $data->getYahooDirectory(),
                    'backlinksYahoo' => $data->getBacklinksYahoo(),
                    'backlinksGoogle' => $data->getBacklinksGoogle(),
               'backlinksBing' => $data->getBacklinksBing(),
               'backlinksAsk' => $data->getBacklinksAsk(),
                    'altavista' => $data->getResultsAltaVista(),
                    'alltheweb' => $data->getResultsAllTheWeb(),
                    'alexarank' => $data->getAlexaRank($url),
                    'age' => $data->getAge(),
               'estibot' => $data->getValueEstibot(),
                    'thumb' =>"http://images.websnapr.com/?size=T&key=C5VIuGtdv2Kd&url=".$url
               

                );
            }
        }
    }

?>

Question by:chrisj1963
17 Comments
 
LVL 30

Expert Comment

by:Marco Gasi
ID: 33605741
Hi chris. I don't have time right now to analyze all your code, but since you're looking for "Inlinks" followed by a number, I wouldn't use (.*), which matches any character; I would use ([0-9]*) instead:

preg_match('\/>Inlinks (([0-9]*))<i class="tl">\', $data, $p);

Bye
0
 

Author Comment

by:chrisj1963
ID: 33605755
Thanks for looking at it, Marqus. I tried that and unfortunately it gives me another bad result: 155.

        function getBacklinksYahoo () {
            $url = $this->url['host'];
            $url = 'https://siteexplorer.search.yahoo.com/advsearch?p=http%3A%2F%2F' . urlencode("http://$url")."&bwm=i&bwmo=d&bwmf=u";
                  $data = $this->getPage($url);
            //preg_match('\/>Inlinks ((.*))<i class="tl">\', $data, $p);
                  preg_match('\/>Inlinks (([0-9]*))<i class="tl">\', $data, $p);
                  $value = ($p[1]) ? number_format($this->toInt($p[1])) : 0;
            return $value;
         }

I just can't seem to get the right regular expression to pull that 54...
0
 
LVL 30

Expert Comment

by:Marco Gasi
ID: 33605779
Now I really have to go, but I tested your online example and it returns 161, not 157 *laughing*

I copied your code and tested it on localhost, and it returns nothing at all. I'll work on it later (your code is very interesting), but can you provide the exact code of the online script so I can test it on localhost and experiment?

Bye
0
 
LVL 4

Expert Comment

by:ashishgamre11
ID: 33605828
Hi chrisj1963,

I did not analyze all of your code.
I think you want to pull the integer value out of the link "http://siteexplorer.search.yahoo.com/search;_ylt=A0oG7za5MoNMHSEBnYjbl8kF?p=http%3A%2F%2Fwww.wausaulaw.com%2F&y=Explore+URL&fr=sfp".

So, I made some changes to "data.class.php".

The changes are as follows:


----------------------------------------------data.class.php-------------------------------------------------
<?php
class pagerank {      
    var $url;
       
        function getPage ($url) {
            if (function_exists('curl_init')) {
                $ch = curl_init($url);
                curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
                @curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
                curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);
                curl_setopt($ch, CURLOPT_REFERER, 'http://www.google.com/search?hl=en&q=google&btnG=Google+Search');
                return curl_exec($ch);
            } else {
                return file_get_contents($url);
            }
        }
                                 
        function getBacklinksYahoo() {
            $url = "http://siteexplorer.search.yahoo.com/search;_ylt=A0oG7za5MoNMHSEBnYjbl8kF?p=http%3A%2F%2Fwww.wausaulaw.com%2F&y=Explore+URL&fr=sfp";
            $data = $this->getPage($url);        
            preg_match('/>Inlinks ((.*))<i class=\"tl\">/', $data, $p);
            $value = ($p[1]) ? number_format($this->toInt($p[1])) : 0;
            return $value;
         }
         
         function toInt ($string) {
            return preg_replace('#[^0-9]#si', '', $string);
        }
    }
   
    $data = new pagerank("http://siteexplorer.search.yahoo.com/search;_ylt=A0oG7za5MoNMHSEBnYjbl8kF?p=http%3A%2F%2Fwww.wausaulaw.com%2F&y=Explore+URL&fr=sfp");
    echo $data->getBacklinksYahoo();
?>
--------------------------------------------------------------------------------------------------------------------

The code is running properly as per your requirement.

I changed preg_match statement to:
            preg_match('/>Inlinks ((.*))<i class=\"tl\">/', $data, $p);  
0
 

Author Comment

by:chrisj1963
ID: 33606306
@ashishgamre11 = thank you for that, but still not getting the right result

If I go to :
 http://prontopage.net/localsearch/bulkcheck.php

And Enter
wausaulaw.com
prontopage.com

I get result.jpg (see attached)
with results being
Yahoo backlinks for wausaulaw.com  at 152
Yahoo backlinks for prontopage.com at 1,240

When the results should be
54 for wausaulaw.com   see http://siteexplorer.search.yahoo.com/search;_ylt=A0oG7za5MoNMHSEBnYjbl8kF?p=http%3A%2F%2Fwww.wausaulaw.com%2F&y=Explore+URL&fr=sfp
AND
138 for prontopage.com  see http://siteexplorer.search.yahoo.com/search;_ylt=A0oG7zSWjINMGj4BEwXal8kF?p=http%3A%2F%2Fwww.prontopage.com%2F&y=Explore+URL&fr=sfp

any other thoughts?
Thanks
result.JPG
0
 

Author Comment

by:chrisj1963
ID: 33606314
Hey Marqus - the entire script is attached above. There are actually 2 separate scripts.  
bulkcheck.php is the input and result page
data.class.php  has the function with the regular expression around line 80 of that separate script.
Thanks!
0
 
LVL 30

Expert Comment

by:Marco Gasi
ID: 33606356
Yes, I've noticed that now (I missed it this morning). But, want a laugh? Now that I have the scripts separated, the result is a blank page! I'm going to study it some more to understand what is going wrong. Bye
0
 
LVL 108

Expert Comment

by:Ray Paseur
ID: 33606459
REGEX is often a confusing way to deal with simple string manipulation.  Try installing this code and see what you get.  Best, ~Ray
<?php // RAY_temp_scrape.php
error_reporting(E_ALL);




// TEST DATA FROM THE POST AT EE
$url = 'http://siteexplorer.search.yahoo.com/search;_ylt=A0oG7za5MoNMHSEBnYjbl8kF?p=http%3A%2F%2Fwww.wausaulaw.com%2F&y=Explore+URL&fr=sfp';

// RUN THE FUNCTION TO GET THE NUMBER
$cnt = scrape_inlinks($url);

// SHOW THE WORK PRODUCT
echo 'Inlinks=' . number_format($cnt);




// A FUNCTION TO SCRAPE THE "Inlinks" VALUE OUT OF A WEB PAGE
function scrape_inlinks($url)
{
    // READ THE PAGE
    $htm = file_get_contents($url);

    // ACTIVATE THIS TO SEE THE PAGE
    // echo htmlentities($htm);

    // LOCATE 'Inlinks'
    $arr = explode('Inlinks', $htm);

    // FIND CLOSING PAREN
    $arr = explode(')', $arr[1]);

    // FIND COUNT OF INLINKS
    return preg_replace('/[^0-9]/', '', $arr[0]);
}


 

Author Comment

by:chrisj1963
ID: 33606557
Hey Ray - thank you. I agree, regex is maddening.  I cannot get my mind around it...
Actually, I  had taken another script you shared and was able to pull that result:
http://www.prontopage.net/localsearch/_script2_yahooBLP.php (enter wausaulaw.com). My problem is that I have no clue how to pull such a script into my original script above. Is there a way to hook it into the above script as a function? If so, how would I do that? Or should I (can I) add that code within my original script? I just don't know how or where it would be added, and I don't understand how I would insert
echo 'Inlinks=' . number_format($cnt); within the array in the original script.

Additional input would be appreciated.
thanks very much
0
 
LVL 108

Accepted Solution

by:
Ray Paseur earned 125 total points
ID: 33606610
This book might be helpful to you as you learn how to put PHP scripts together.
http://www.sitepoint.com/books/phpmysql4/

You can usually just copy functions and add them to other scripts.  But I think script integration and function scope are really different sorts of questions from the question about isolating a substring from a longer string.  I looked at your script above and I have to admit I cannot follow its logic.  It contains hundreds of lines of code, a lot of it commented out, without consistent control structures or comments.  That makes integration something of a research project.
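
As an illustration of the "copy the function and add it" step, here is a minimal sketch. It assumes the page has already been fetched by the class's own getPage() method; the helper name extract_inlinks() is hypothetical and is only the parsing half of the scrape_inlinks() function posted earlier in the thread, not code from the original script.

```php
<?php
// Hypothetical helper: the parsing half of scrape_inlinks(), split out so it
// can be called on HTML that has already been fetched elsewhere.
function extract_inlinks($htm)
{
    // Keep everything after the first 'Inlinks' label
    $arr = explode('Inlinks', $htm, 2);
    if (count($arr) < 2) {
        return 0; // label not found on the page
    }

    // Keep only the text before the closing parenthesis
    $arr = explode(')', $arr[1], 2);

    // Strip everything except digits, so ' (1,240' becomes '1240'
    return (int) preg_replace('/[^0-9]/', '', $arr[0]);
}

// Inside class pagerank, getBacklinksYahoo() could then end with
// (assumption: $url is built earlier in the method, as in the original):
//     $data = $this->getPage($url);
//     return number_format(extract_inlinks($data));

// Stand-alone check with markup shaped like the sample in the question
echo number_format(extract_inlinks('<a>Inlinks (1,240)<i class="tl"></i>'));
```

Because the method still returns a formatted string, the 'backlinksYahoo' entry in the results array would not need to change.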
0
 
LVL 4

Assisted Solution

by:ashishgamre11
ashishgamre11 earned 125 total points
ID: 33606823
I think there is an error on line number 172

===> $url = 'https://siteexplorer.search.yahoo.com/advsearch?p=http%3A%2F%2F' . urlencode("http://$url")."&bwm=i&bwmo=d&bwmf=u";

in function getBacklinksYahoo.
            

How do you get "http://siteexplorer.search.yahoo.com/search;_ylt=A0oG7za5MoNMHSEBnYjbl8kF?p=http%3A%2F%2Fwww.wausaulaw.com%2F&y=Explore+URL&fr=sfp" using line number 172?
0
 
LVL 30

Expert Comment

by:Marco Gasi
ID: 33606840
Hi chris. I have to say I'm having a lot of difficulty testing your code: for some reason I had to comment out the curl call in the getPage function and use only file_get_contents, otherwise I got false as the result. But using file_get_contents and displaying the results in my own page, I saw that there was no Inlinks button, nor other elements of the original page.

Anyway, I found some errors, which I'll describe briefly.

First, your regexp is wrong: ( and ) are special symbols that mean 'group', so if you want to match the literal characters you have to escape them with a backslash: preg_match("/(\([0-9]+\))/", $data, $p);

But here is the second problem: supposing the result of this statement is (53), this will be the only array element! With your original regexp the array $p was formed of only one element: '>Inlinks (53)<i class="tl">'. What is surprising is that passing this to the number_format function gave you a number!

Now I'll go play with the regexp a bit to see if I can find the correct result in a test environment. If I find the correct regexp I'll let you know so you can test it with your code.

Bye
0
 
LVL 16

Assisted Solution

by:HackneyCab
HackneyCab earned 125 total points
ID: 33606847
If you're trying to grab the 54 from

>Inlinks (54)<i class="tl"></i>

then try using

'#>Inlinks\s\(([0-9]+)\)<i#'

as your regex pattern. In previous suggested patterns, the parentheses have not been escaped, which is a mistake because parentheses are special characters in PCRE.
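
A minimal sketch of that pattern in use, run against the sample markup from the question rather than the live page. The function name inlinks_count() is hypothetical, and widening the character class to [0-9,] is an added assumption so that counts such as 1,240 would also match. Two PHP details are worth keeping in mind when comparing this with the original line: PCRE does not accept a backslash as the pattern delimiter, and inside a single-quoted PHP string a trailing \' is an escaped quote, so it does not close the string.

```php
<?php
// Hypothetical helper built around the escaped-parentheses pattern.
function inlinks_count($data)
{
    // '#' is the delimiter; \s matches the space; \( and \) match literal
    // parentheses; the inner ( ) group captures the digits (and commas).
    if (preg_match('#>Inlinks\s\(([0-9,]+)\)<i#', $data, $p)) {
        return (int) str_replace(',', '', $p[1]);
    }
    return 0; // pattern not found
}

// Sample markup from the question
echo inlinks_count('<a href="x">Inlinks (54)<i class="tl"></i></a>');
```

Here $p[0] holds the whole matched text and $p[1] holds only the captured count, which is why the code reads $p[1].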
0
 
LVL 30

Expert Comment

by:Marco Gasi
ID: 33606931
Well, a right regular expression could be this one:

(\d*)(?=\)<i class="tl">)

This finds groups of digits followed by a closing parenthesis and <i class="tl">. I'm trying to work out a regexp which finds numbers preceded by Inlinks, but while waiting for a better regexp you can try this one.

Hope this helps, chris.

Bye
0
 
LVL 30

Assisted Solution

by:Marco Gasi
Marco Gasi earned 125 total points
ID: 33606946
Here is the complete regexp:

(?<=Inlinks \()(\d*)(?=\)<i class="tl">)

So you can write

preg_match('/(?<=Inlinks \()(\d*)(?=\)<i class="tl">)/', $data, $p);
$value = $p[0];
return $value;

If this doesn't work I give in.

Let me know...
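
A quick check of the lookaround version against the sample markup from the question (not the live page). The pattern is wrapped in single quotes here so that the double quotes around tl need no escaping; the variable names are just for illustration.

```php
<?php
// Sample markup from the question
$data = '>Inlinks (54)<i class="tl"></i>';

// (?<=...) is a lookbehind and (?=...) a lookahead: both must match,
// but neither is included in the matched text.
$pattern = '/(?<=Inlinks \()(\d*)(?=\)<i class="tl">)/';

if (preg_match($pattern, $data, $p)) {
    // Because the lookarounds are zero-width, the full match $p[0]
    // and the captured group $p[1] both contain only the digits.
    echo $p[0], ' ', $p[1];
}
```

With lookarounds the surrounding text is checked but not consumed, so reading $p[0] works here, whereas a pattern without them needs $p[1].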
0
 

Author Closing Comment

by:chrisj1963
ID: 33607012
Everyone, thanks for trying. I'll look at other options now.
0
 
LVL 30

Expert Comment

by:Marco Gasi
ID: 33613276
Hi chris. Excuse me for bothering you with this old question, but I wasn't satisfied at not having found what we were looking for. I'm not writing to defend my regexp, though. I only wish to say that, while working on your script to see whether the regexp worked or not, I noticed that to obtain the correct result I had to replace the URL you used in data.class.php. You used this:

$url = 'https://siteexplorer.search.yahoo.com/advsearch?p=http%3A%2F%2F' . urlencode("http://$url")."&bwm=i&bwmo=d&bwmf=u";

and in my tests this did not work.

I used this one:

$url = 'http://siteexplorer.search.yahoo.com/search?p=http%3A%2F%2F' . urlencode("www.$url")."%2F&y=Explore+URL&fr=sfp";

And Inlinks 55 appears as if by magic.

I suspect that Yahoo has changed something and that this change has broken your script, so this could be a very hard problem for your application. I don't know how to resolve it (an API?), but perhaps you will have to do something.

Cheers.

marqusG
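
To make the difference between the two URLs concrete, here is a small sketch. The helper name site_explorer_url() is hypothetical, and it assumes the host is passed in without a leading "www." (as in the wausaulaw.com example). It lets urlencode() produce the whole encoded address instead of mixing a hand-encoded prefix with an encoded remainder.

```php
<?php
// Hypothetical helper: build the Site Explorer search URL for a bare host.
function site_explorer_url($host)
{
    // urlencode() turns 'http://www.example.com/' into
    // 'http%3A%2F%2Fwww.example.com%2F', so no hand-encoded prefix is needed.
    return 'http://siteexplorer.search.yahoo.com/search?p='
        . urlencode("http://www.$host/")
        . '&y=Explore+URL&fr=sfp';
}

echo site_explorer_url('wausaulaw.com'), "\n";

// For comparison, the query value built the way the original line does it:
// a hand-encoded 'http://' followed by a second, urlencode()d 'http://'.
$original_style = 'p=http%3A%2F%2F' . urlencode('http://wausaulaw.com');
echo $original_style, "\n";
```

The first line printed matches the working address quoted in the question, apart from the session token. The second shows that the original construction encodes the scheme twice (http%3A%2F%2Fhttp%3A%2F%2F...), which may be part of why that request returned a different page.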
0
