rlb1
How do you combine cURL and Regex in PHP
Experts,
How do I correctly structure this cURL and regex code to pull some data from a website and then insert it into my database?
Thanks for your help!
<?php
mysql_connect("Host","UN","PW") or die("Unable to connect to SQL server");
mysql_select_db('DB') or die("Unable to SELECT DB");
echo "Connected to DB";
echo "<BR>";
$query = "SELECT * FROM temptable";
$result = mysql_query($query) or die('Error: '.mysql_error());
while($row = mysql_fetch_array($result))
{
$sku=mysql_real_escape_string($row['sku']);
echo $sku."<br />";
$pages = array('home' =>
'http://www.mysite.com/',
'login' =>
'https://www.mysite.com/signin.asp',
'schedule' =>
'http://www.mysite.com/product.asp?&sku=' . $sku . '');
$ch = curl_init();
//Set options for curl session
$options = array(CURLOPT_USERAGENT => 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1)',
CURLOPT_SSL_VERIFYPEER => FALSE,
CURLOPT_SSL_VERIFYHOST => 2,
CURLOPT_HEADER => TRUE,
//CURLOPT_RETURNTRANSFER => TRUE,
CURLOPT_COOKIEFILE => '/cookie.txt',
CURLOPT_COOKIEJAR => '/cookies.txt');
//Hit home page for session cookie
$options[CURLOPT_URL] = $pages['home'];
curl_setopt_array($ch, $options);
curl_exec($ch);
//Login
$options[CURLOPT_URL] = $pages['login'];
$options[CURLOPT_POST] = TRUE;
$options[CURLOPT_POSTFIELDS] = 'email=MyEmail&password=PW&submitBtn';
$options[CURLOPT_FOLLOWLOCATION] = FALSE;
curl_setopt_array($ch, $options);
curl_exec($ch);
//Hit schedule page
$options[CURLOPT_URL] = $pages['schedule'];
curl_setopt_array($ch, $options);
$schedule = curl_exec($ch);
//Output schedule
//echo $schedule;
preg_match_all('%<title>(.*?)</title>%s',$schedule,$matches0);
$title=$matches0[1];
echo $title."<br />";
mysql_query("INSERT INTO temptable2 (title) VALUES('".mysql_real_escape_string($title)."') ");
}
//Close curl session
curl_close($ch);
?>
It looks OK on inspection. What is wrong with it?
ASKER
The cURL functions are working properly and I am getting data returned from all $pages (home, login, schedule). The regex is not picking up the title, even though the same pattern works perfectly in another script, and the script is not inserting the title into the database.
I am not sure what is wrong. I have spent several hours working on it...
Change
$title=$matches0[1];
to
$title = $matches0[1][0];
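For anyone wondering why the extra index is needed: preg_match_all() fills the matches array with one sub-array per capture group, so $matches0[1] is itself an array, while preg_match() stores the first captured string directly in $matches0[1]. A minimal sketch, assuming a made-up $html string in place of the real cURL response:

```php
<?php
// Hypothetical page body standing in for the cURL response.
$html = "<html><head><title>Widget 03170</title></head><body></body></html>";

// preg_match_all() collects every match: $all[1] is an ARRAY holding
// the text captured by group 1 for each match.
preg_match_all('%<title>(.*?)</title>%s', $html, $all);
$titleFromAll = $all[1][0];   // first match of capture group 1

// preg_match() stops at the first match: $one[1] is already a STRING.
preg_match('%<title>(.*?)</title>%s', $html, $one);
$titleFromOne = $one[1];

echo $titleFromAll, "\n";
echo $titleFromOne, "\n";
```

Both variables end up holding the same string here; the only difference is how deep you have to index.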
ASKER
bportlock: Thanks. It seems that the REGEX statement may not be recognizing $schedule (which is the 3rd page in the array). The database is receiving some blank inserts with no data. I have adjusted several things in the script and cannot come up with a solution. The login is working and the pages are appearing correctly. Also, echo $title."<br />"; is not returning any data either.
preg_match('%<title>(.*?)</title>%', $schedule, $matches0);
$title=$matches0[1];
echo $title."<br />";
Thanks for your help!!
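One possible cause, judging from the first script: CURLOPT_RETURNTRANSFER is commented out there, and without it curl_exec() prints the page and returns true instead of the page body, so $schedule would not contain any HTML for the regex to search. The sketch below shows one way to guard against that. The helper names fetch_body() and extract_title() are hypothetical, not part of the original script, and the cookie file path is whatever you pass in.

```php
<?php
// Hypothetical helper: returns the response body as a string, or null on failure.
function fetch_body(string $url, string $cookieFile): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_COOKIEFILE     => $cookieFile, // read cookies from this file
        CURLOPT_COOKIEJAR      => $cookieFile, // write cookies back to the same file
    ]);
    $body = curl_exec($ch);
    // curl_exec() returns false on failure; only a string is usable here.
    // The handle is released when $ch goes out of scope.
    return is_string($body) ? $body : null;
}

// Hypothetical helper: returns the trimmed <title> text, or null when none is found.
function extract_title(?string $html): ?string
{
    if ($html === null) {
        return null;
    }
    if (preg_match('%<title>(.*?)</title>%is', $html, $m) === 1) {
        return trim($m[1]);
    }
    return null;
}
```

With helpers like these, a null result from extract_title() is the signal to skip the INSERT rather than write a blank row.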
ASKER
Eight hours of continuous work later... I started over completely, rewrote the script, and got it to work!!
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_COOKIEJAR, "../cookie.txt");
curl_setopt($ch, CURLOPT_URL,"https://www.mywebsite.com/signin.asp?");
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, "email=myemail&password=mypassword&submitBtn");
ob_start(); // prevent any output
curl_exec ($ch); // execute the curl command
ob_end_clean(); // stop preventing output
curl_close ($ch);
unset($ch);
$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
curl_setopt($ch, CURLOPT_COOKIEFILE, "../cookie.txt");
curl_setopt($ch, CURLOPT_URL,"http://www.mywebsite.com/product.asp?sku=03170");
$buf2 = curl_exec ($ch);
curl_close ($ch);
//echo "<PRE>".htmlentities($buf2);
//echo $buf2;
preg_match('%<title>(.*?)<\/title>%',$buf2,$matches0);
$title=$matches0[1];
echo $title."<br />";
?>
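The rewritten script stops at echoing the title, so the database step from the original question is still open. Note that the mysql_* functions used in the first script were removed in PHP 7.0. Below is a minimal sketch of the insert using PDO and a prepared statement instead. The helper name save_title() is hypothetical; the table and column (temptable2, title) are taken from the first script, and $pdo is assumed to be an already open connection.

```php
<?php
// Hypothetical helper: stores one scraped title using a prepared statement.
// $pdo is assumed to be an open PDO connection; temptable2(title) is the
// table named in the first script.
function save_title(PDO $pdo, string $title): void
{
    // The placeholder lets the driver handle quoting, so no manual
    // escaping of $title is needed.
    $stmt = $pdo->prepare('INSERT INTO temptable2 (title) VALUES (:title)');
    $stmt->execute([':title' => $title]);
}
```

A connection for MySQL could be opened with something like new PDO('mysql:host=Host;dbname=DB', 'UN', 'PW'), using the placeholder credentials from the first script, and save_title($pdo, $title) would then be called only when $title is a non-empty string.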
ASKER
Thanks!