Want to win a PS4? Go Premium and enter to win our High-Tech Treats giveaway. Enter to Win

x
?
Solved

Regex find PHP

Posted on 2012-04-12
10
Medium Priority
?
252 Views
Last Modified: 2012-04-13
Ok,

So Im trying to make a page that would allow visitor to like our Facebook fanpage and receive free goods in exchange.

So Im using PHP to extract amount of likes. The $content includes the following code:
<div class="fsm fwn fcg">2,704,886 likes · 80,715 talking about this</div>

Open in new window


The value Im trying to extract is 2,704,886. Any ideas what regex pattern do I need to achieve this?
0
Comment
Question by:GVNPublic123
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
10 Comments
 
LVL 35

Accepted Solution

by:
gr8gonzo earned 2000 total points
ID: 37837175
$str = '<div class="fsm fwn fcg">2,704,886 likes · 80,715 talking about this</div>';
if(preg_match("/([0-9]+) likes/",str_replace(",","",$str),$matches))
{
   echo $matches[1];
}
0
 

Author Comment

by:GVNPublic123
ID: 37837251
Actually I just found out I cant file_get_contents the facebook page, as it returns Uncompatible Browser...I guess they have protected themselves from scrapers...

Any idea how could I access that info...maybe API?
0
 
LVL 35

Expert Comment

by:gr8gonzo
ID: 37837279
If it's a public page, you could try to use cURL to get it and pretend to be a normal browser. Use this example from the PHP man page:

<?php 

$GVNPublic123 = new cURL(); 
$html = $GVNPublic123->get('http://www.facebook.com/yourpagetoscrape'); 

// Do your scraping on $html here
// $str = '<div class="fsm fwn fcg">2,704,886 likes · 80,715 talking about this</div>';
$str = $html;
if(preg_match("/([0-9]+) likes/",str_replace(",","",$str),$matches))
{
   echo $matches[1];
}

class cURL { 
var $headers; 
var $user_agent; 
var $compression; 
var $cookie_file; 
var $proxy; 
function cURL($cookies=TRUE,$cookie='cookies.txt',$compression='gzip',$proxy='') { 
$this->headers[] = 'Accept: image/gif, image/x-bitmap, image/jpeg, image/pjpeg'; 
$this->headers[] = 'Connection: Keep-Alive'; 
$this->headers[] = 'Content-type: application/x-www-form-urlencoded;charset=UTF-8'; 
$this->user_agent = 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.0.3705; .NET CLR 1.1.4322; Media Center PC 4.0)'; 
$this->compression=$compression; 
$this->proxy=$proxy; 
$this->cookies=$cookies; 
if ($this->cookies == TRUE) $this->cookie($cookie); 
} 
function cookie($cookie_file) { 
if (file_exists($cookie_file)) { 
$this->cookie_file=$cookie_file; 
} else { 
fopen($cookie_file,'w') or $this->error('The cookie file could not be opened. Make sure this directory has the correct permissions'); 
$this->cookie_file=$cookie_file; 
fclose($this->cookie_file); 
} 
} 
function get($url) { 
$process = curl_init($url); 
curl_setopt($process, CURLOPT_HTTPHEADER, $this->headers); 
curl_setopt($process, CURLOPT_HEADER, 0); 
curl_setopt($process, CURLOPT_USERAGENT, $this->user_agent); 
if ($this->cookies == TRUE) curl_setopt($process, CURLOPT_COOKIEFILE, $this->cookie_file); 
if ($this->cookies == TRUE) curl_setopt($process, CURLOPT_COOKIEJAR, $this->cookie_file); 
curl_setopt($process,CURLOPT_ENCODING , $this->compression); 
curl_setopt($process, CURLOPT_TIMEOUT, 30); 
if ($this->proxy) curl_setopt($process, CURLOPT_PROXY, $this->proxy); 
curl_setopt($process, CURLOPT_RETURNTRANSFER, 1); 
curl_setopt($process, CURLOPT_FOLLOWLOCATION, 1); 
$return = curl_exec($process); 
curl_close($process); 
return $return; 
} 
function post($url,$data) { 
$process = curl_init($url); 
curl_setopt($process, CURLOPT_HTTPHEADER, $this->headers); 
curl_setopt($process, CURLOPT_HEADER, 1); 
curl_setopt($process, CURLOPT_USERAGENT, $this->user_agent); 
if ($this->cookies == TRUE) curl_setopt($process, CURLOPT_COOKIEFILE, $this->cookie_file); 
if ($this->cookies == TRUE) curl_setopt($process, CURLOPT_COOKIEJAR, $this->cookie_file); 
curl_setopt($process, CURLOPT_ENCODING , $this->compression); 
curl_setopt($process, CURLOPT_TIMEOUT, 30); 
if ($this->proxy) curl_setopt($process, CURLOPT_PROXY, $this->proxy); 
curl_setopt($process, CURLOPT_POSTFIELDS, $data); 
curl_setopt($process, CURLOPT_RETURNTRANSFER, 1); 
curl_setopt($process, CURLOPT_FOLLOWLOCATION, 1); 
curl_setopt($process, CURLOPT_POST, 1); 
$return = curl_exec($process); 
curl_close($process); 
return $return; 
} 
function error($error) { 
echo "<center><div style='width:500px;border: 3px solid #FFEEFF; padding: 3px; background-color: #FFDDFF;font-family: verdana; font-size: 10px'><b>cURL Error</b><br>$error</div></center>"; 
die; 
} 
} 
?> 

Open in new window

0
Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 

Author Comment

by:GVNPublic123
ID: 37837379
Yep, feeding it useragent and other headers via curl did the trick. However the regex doesnt work for me.
0
 
LVL 35

Expert Comment

by:gr8gonzo
ID: 37837506
It is probably just an issue with the content being on multiple lines. The regex I gave you was for a single line of HTML while the cURL class is returning the whole page of HTML.

What is the URL?
0
 
LVL 10

Expert Comment

by:pfrancois
ID: 37837563
0
 
LVL 111

Expert Comment

by:Ray Paseur
ID: 37837867
Sometimes it's easier to use explode() than regex, but regex is good for tidying up the results string.
<?php // RAY_temp_gvn.php
error_reporting(E_ALL);

$doc = <<<DOC
Lots of
Random Stuff
// READ FROM THE WEB SITE WITH A CURL REQUEST...
<div class="fsm fwn fcg">2,704,886 likes · 80,715 talking about this</div>
Even more stuff
DOC;

$arr = explode('fsm fwn fcg', $doc);
$arr = explode('likes', $arr[1]);
$num = preg_replace('/[^0-9]/', NULL, $arr[0]);
var_dump($num);

Open in new window

http://www.laprbass.com/RAY_temp_gvn.php
0
 

Author Comment

by:GVNPublic123
ID: 37837963
Yeh, exploding totally made it easier as theres only 1 match now.

Do you guys have any idea how could I get the profile picture (in small format) so I can make a nice widget with it? Than I detect when follow is done on like button via JS api and thats it (I already know how to do that)...

Now I just need an image to go with likes count.
0
 

Author Comment

by:GVNPublic123
ID: 37838189
This is code of image:
class="scaledImageFitWidth img" src="https://fbcdn-profile-a.akamaihd.net/hprofile-ak-snc4/41581_82061850555_1078443985_n.jpg"

Open in new window

0
 

Author Comment

by:GVNPublic123
ID: 37838622
Id like to get code of image with regex, but I dont know how to make a pattern.
0

Featured Post

Free Tool: Port Scanner

Check which ports are open to the outside world. Helps make sure that your firewall rules are working as intended.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Part of the Global Positioning System A geocode (https://developers.google.com/maps/documentation/geocoding/) is the major subset of a GPS coordinate (http://en.wikipedia.org/wiki/Global_Positioning_System), the other parts being the altitude and t…
This article discusses how to implement server side field validation and display customized error messages to the client.
Explain concepts important to validation of email addresses with regular expressions. Applies to most languages/tools that uses regular expressions. Consider email address RFCs: Look at HTML5 form input element (with type=email) regex pattern: T…
The viewer will learn how to count occurrences of each item in an array.
Suggested Courses

618 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question