Solved

Regex find PHP

Posted on 2012-04-12
Last Modified: 2012-04-13
OK,

I'm trying to build a page that lets visitors like our Facebook fan page and receive free goods in exchange.

I'm using PHP to extract the number of likes. $content includes the following code:
<div class="fsm fwn fcg">2,704,886 likes · 80,715 talking about this</div>


The value I'm trying to extract is 2,704,886. Any idea what regex pattern I need to achieve this?
Question by:GVNPublic123
10 Comments
 
LVL 34

Accepted Solution

by:
gr8gonzo earned 500 total points
ID: 37837175
$str = '<div class="fsm fwn fcg">2,704,886 likes · 80,715 talking about this</div>';
if(preg_match("/([0-9]+) likes/",str_replace(",","",$str),$matches))
{
   echo $matches[1];
}
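
As a variation on the snippet above, here is a minimal sketch that captures the comma-grouped number directly and strips the commas only from the captured value, rather than from the whole string. The function name extract_like_count is hypothetical (it is not part of the original answer), and the pattern assumes the count is immediately followed by the word "likes", as in the markup from the question.

```php
<?php
// Hypothetical helper, for illustration only.
function extract_like_count($html)
{
    // A digit, then any run of digits/commas, then whitespace and the word "likes".
    if (preg_match('/([0-9][0-9,]*)\s+likes\b/', $html, $matches)) {
        // Remove the thousands separators from the captured number only.
        return (int) str_replace(',', '', $matches[1]);
    }
    return null; // no like count found
}

$str = '<div class="fsm fwn fcg">2,704,886 likes · 80,715 talking about this</div>';
echo extract_like_count($str);
```

Leaving the rest of the input untouched means commas elsewhere in a larger page are not altered before matching.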
 

Author Comment

by:GVNPublic123
ID: 37837251
Actually, I just found out I can't use file_get_contents on the Facebook page, as it returns "Uncompatible Browser". I guess they have protected themselves from scrapers.

Any idea how I could access that info? Maybe an API?
 
LVL 34

Expert Comment

by:gr8gonzo
ID: 37837279
If it's a public page, you could try to use cURL to get it and pretend to be a normal browser. Use this example from the PHP man page:

<?php 

$GVNPublic123 = new cURL(); 
$html = $GVNPublic123->get('http://www.facebook.com/yourpagetoscrape'); 

// Do your scraping on $html here
// $str = '<div class="fsm fwn fcg">2,704,886 likes · 80,715 talking about this</div>';
$str = $html;
if(preg_match("/([0-9]+) likes/",str_replace(",","",$str),$matches))
{
   echo $matches[1];
}

class cURL { 
var $headers = array(); 
var $user_agent; 
var $compression; 
var $cookie_file; 
var $proxy; 
var $cookies; // declared explicitly; the constructor assigns it
// __construct replaces the PHP 4-style cURL() constructor, which PHP 8 no longer calls on "new cURL()"
function __construct($cookies=TRUE,$cookie='cookies.txt',$compression='gzip',$proxy='') { 
$this->headers[] = 'Accept: image/gif, image/x-bitmap, image/jpeg, image/pjpeg'; 
$this->headers[] = 'Connection: Keep-Alive'; 
$this->headers[] = 'Content-type: application/x-www-form-urlencoded;charset=UTF-8'; 
$this->user_agent = 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.0.3705; .NET CLR 1.1.4322; Media Center PC 4.0)'; 
$this->compression=$compression; 
$this->proxy=$proxy; 
$this->cookies=$cookies; 
if ($this->cookies == TRUE) $this->cookie($cookie); 
} 
function cookie($cookie_file) { 
// Create the cookie file if it does not exist yet
if (!file_exists($cookie_file)) { 
$handle = fopen($cookie_file,'w') or $this->error('The cookie file could not be opened. Make sure this directory has the correct permissions'); 
fclose($handle); // close the file handle, not the file name
} 
$this->cookie_file=$cookie_file; 
} 
function get($url) { 
$process = curl_init($url); 
curl_setopt($process, CURLOPT_HTTPHEADER, $this->headers); 
curl_setopt($process, CURLOPT_HEADER, 0); 
curl_setopt($process, CURLOPT_USERAGENT, $this->user_agent); 
if ($this->cookies == TRUE) curl_setopt($process, CURLOPT_COOKIEFILE, $this->cookie_file); 
if ($this->cookies == TRUE) curl_setopt($process, CURLOPT_COOKIEJAR, $this->cookie_file); 
curl_setopt($process,CURLOPT_ENCODING , $this->compression); 
curl_setopt($process, CURLOPT_TIMEOUT, 30); 
if ($this->proxy) curl_setopt($process, CURLOPT_PROXY, $this->proxy); 
curl_setopt($process, CURLOPT_RETURNTRANSFER, 1); 
curl_setopt($process, CURLOPT_FOLLOWLOCATION, 1); 
$return = curl_exec($process); 
curl_close($process); 
return $return; 
} 
function post($url,$data) { 
$process = curl_init($url); 
curl_setopt($process, CURLOPT_HTTPHEADER, $this->headers); 
curl_setopt($process, CURLOPT_HEADER, 1); 
curl_setopt($process, CURLOPT_USERAGENT, $this->user_agent); 
if ($this->cookies == TRUE) curl_setopt($process, CURLOPT_COOKIEFILE, $this->cookie_file); 
if ($this->cookies == TRUE) curl_setopt($process, CURLOPT_COOKIEJAR, $this->cookie_file); 
curl_setopt($process, CURLOPT_ENCODING , $this->compression); 
curl_setopt($process, CURLOPT_TIMEOUT, 30); 
if ($this->proxy) curl_setopt($process, CURLOPT_PROXY, $this->proxy); 
curl_setopt($process, CURLOPT_POSTFIELDS, $data); 
curl_setopt($process, CURLOPT_RETURNTRANSFER, 1); 
curl_setopt($process, CURLOPT_FOLLOWLOCATION, 1); 
curl_setopt($process, CURLOPT_POST, 1); 
$return = curl_exec($process); 
curl_close($process); 
return $return; 
} 
function error($error) { 
echo "<center><div style='width:500px;border: 3px solid #FFEEFF; padding: 3px; background-color: #FFDDFF;font-family: verdana; font-size: 10px'><b>cURL Error</b><br>$error</div></center>"; 
die; 
} 
} 
?> 

 

Author Comment

by:GVNPublic123
ID: 37837379
Yep, feeding it a user agent and other headers via cURL did the trick. However, the regex doesn't work for me.
 
LVL 34

Expert Comment

by:gr8gonzo
ID: 37837506
It is probably just an issue with the content being on multiple lines. The regex I gave you was for a single line of HTML while the cURL class is returning the whole page of HTML.

What is the URL?
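
One way to make the pattern hold up against a whole page is to anchor it to the div's class attribute, so other occurrences of "likes" elsewhere in the HTML are ignored. This is only a sketch: the sample page below is made up, and it assumes the real markup matches the div from the question exactly.

```php
<?php
// Stand-in for the full page returned by the cURL class; the surrounding markup is invented.
$html = <<<HTML
<html>
<body>
<span>12 likes on another widget</span>
<div class="fsm fwn fcg">2,704,886 likes · 80,715 talking about this</div>
</body>
</html>
HTML;

// Require the opening div tag right before the number, so the earlier "12 likes" is skipped.
$pattern = '/<div class="fsm fwn fcg">\s*([0-9][0-9,]*)\s+likes/';

$likes = null;
if (preg_match($pattern, $html, $matches)) {
    // Strip the thousands separators from the captured number.
    $likes = (int) str_replace(',', '', $matches[1]);
}
var_dump($likes);
```

If Facebook changes the class names or attribute order, the pattern stops matching and $likes stays null, so the caller should handle that case.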
 
LVL 10

Expert Comment

by:pfrancois
ID: 37837563
 
LVL 108

Expert Comment

by:Ray Paseur
ID: 37837867
Sometimes it's easier to use explode() than regex, but regex is good for tidying up the results string.
<?php // RAY_temp_gvn.php
error_reporting(E_ALL);

$doc = <<<DOC
Lots of
Random Stuff
// READ FROM THE WEB SITE WITH A CURL REQUEST...
<div class="fsm fwn fcg">2,704,886 likes · 80,715 talking about this</div>
Even more stuff
DOC;

$arr = explode('fsm fwn fcg', $doc);
$arr = explode('likes', $arr[1]);
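// Strip everything that is not a digit; an empty string replaces NULL, which PHP 8.1+ deprecates here
$num = preg_replace('/[^0-9]/', '', $arr[0]);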
var_dump($num);

http://www.laprbass.com/RAY_temp_gvn.php
 

Author Comment

by:GVNPublic123
ID: 37837963
Yeah, exploding totally made it easier, as there's only one match now.

Do you guys have any idea how I could get the profile picture (in small format) so I can make a nice widget with it? Then I detect when the follow is done on the Like button via the JS API, and that's it (I already know how to do that).

Now I just need an image to go with the like count.
 

Author Comment

by:GVNPublic123
ID: 37838189
This is the code of the image:
class="scaledImageFitWidth img" src="https://fbcdn-profile-a.akamaihd.net/hprofile-ak-snc4/41581_82061850555_1078443985_n.jpg"

 

Author Comment

by:GVNPublic123
ID: 37838622
I'd like to get the image code with a regex, but I don't know how to make a pattern.
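
A possible pattern for this, sketched under the assumption that the class attribute is always followed directly by src, as in the snippet posted above. It captures everything between the quotes of the src attribute. The variable names are illustrative only.

```php
<?php
// Sample input copied from the image snippet in this thread.
$html = 'class="scaledImageFitWidth img" src="https://fbcdn-profile-a.akamaihd.net/hprofile-ak-snc4/41581_82061850555_1078443985_n.jpg"';

$src = null;
// Match the known class attribute, then capture the src value up to the closing quote.
if (preg_match('/class="scaledImageFitWidth img"\s+src="([^"]+)"/', $html, $matches)) {
    // Decode HTML entities such as &amp; that may appear in attribute values.
    $src = html_entity_decode($matches[1]);
}
echo $src;
```

If other attributes can appear between class and src on the real page, the pattern would need to allow for them; this sketch does not.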