XML Parse

Hello,

I am using a script to parse an RSS feed and insert the items into my database.

The script accepts Google-formatted feeds, but I am running into a problem and need some help.

            $description      = $RSSitem->description; <<< WORKS

            $ext_color       = $RSSitem->g:color;  <<<< DOES NOT WORK

Some of the element names have a g: prefix, and the code throws an error when I try to read them. Can you please let me know what is wrong?

Thank you,
Matt
movieprodw Asked:
 
Ray Paseur Commented:
Check the code near line 70 - I put some whitespace around the lines I changed.
<?php
$db_hostname="localhost";
$db_username="vv";
$db_password="vv";

$private_access_key="5";

$client_id = $_GET['client_id'];

if(isset($_GET['feed_url']))
{
	$feed_url = $_GET['feed_url'];
}
else
{
	die("Need to pass the (consistent) 'feed url'");
}


if(isset($_GET['access_key']))
{

	if($_GET['access_key']==$private_access_key)
	{
		echo "Access key correct, proceeding...<br/><br/>";
	}
	else
	{
		die("wrong access key");
	}
}
else
{
	die("Need to pass the 'access_key' URL parameter");
}

if(isset($_GET['client_id']))
{

	if($_GET['client_id'])
	{
		echo "Client ID accepted...<br/><br/>";
	}
	else
	{
		die("no client id");
	}
}
else
{
	die("Need to pass the 'client_id' URL parameter");
}

try
{
	$db = mysql_connect($db_hostname,$db_username,$db_password);
	if (!$db)
	{
		die("Could not connect: " . mysql_error());
	}
	mysql_select_db("vv", $db);

	echo "Starting to work with feed URL " . $feed_url . "<br /><br />";

	libxml_use_internal_errors(true);
	
	
	
	
	
	$rss = file_get_contents($feed_url);
	$rss = str_replace('g:', 'g_', $rss);
	$RSS_DOC = simpleXML_load_string($rss);
	
	
	
	
	
	if (!$RSS_DOC) {
		echo "Failed loading XML\n";
		foreach(libxml_get_errors() as $error) {
			echo "\t", $error->message;
		}
	}

	$rss_title = $RSS_DOC->channel->title;
	$rss_link = $RSS_DOC->channel->link;
	$rss_editor = $RSS_DOC->channel->managingEditor;
	$rss_copyright = $RSS_DOC->channel->copyright;
	$rss_date = $RSS_DOC->channel->pubDate;

	foreach($RSS_DOC->channel->item as $RSSitem)
	{
		$item_id 	= $RSSitem->guid;
		$item_title 	= $RSSitem->title;
		$ext_color 	= $RSSitem->{"g_color"};
		$description 	= $RSSitem->description;
		$make 	= $RSSitem->{"g_make"};
		$mileage 	= $RSSitem->{"g_mileage"};
		$model 	= $RSSitem->{"g_model"};
		$price 	= $RSSitem->{"g_price"};
		$price_type 	= $RSSitem->{"g_price_type"};
		$vin 	= $RSSitem->{"g_vin"};
		$year = $RSSitem->year;
		$item_date  = date("m/d/Y");
		$item_url	= $RSSitem->link;

		echo "Processing item '" , $item_id , "' on " , $item_date 	, "<br/>";
		echo $item_title, " - ";


		$item_exists_sql = "SELECT item_id FROM vehicle_data where item_id = '" . $item_id . "' AND member_id = $client_id";
		$item_exists = mysql_query($item_exists_sql, $db);
		if(mysql_num_rows($item_exists)<1)
		{
			echo "<font color=green>Inserting new item..</font><br/>";
			$item_insert_sql = "INSERT INTO vehicle_data(item_id, feed_url, member_id, ext_color, descint, make, mileage, model, price, vin, year, status, listing_plan, date_listed) VALUES ('" . $item_id . "', '" . $feed_url . "', '" . $client_id . "', '" . $ext_color . "', '" . $description . "', '" . $make . "', '" . $mileage . "', '" . $model . "', '" . $price . "', '" . $vin . "', '" . $year . "', '0', '3', '" . $item_date . "')";
			$insert_item = mysql_query($item_insert_sql, $db);
			
			$vid2 = mysql_insert_id();
			$vid = $vid2*2+55;
			
			$sql = "UPDATE vehicle_data SET vid='$vid' where id = '$vid2' ";
			
			mysql_query($sql) 
			or die(mysql_error());  
		}
		else
		{
			echo "<font color=blue>Item Exists, Updated Price..</font><br/>";
		}

		echo "<br/>";
	}

	// End of form //
} catch (Exception $e)
{
    echo 'Caught exception: ',  $e->getMessage(), "\n";
}
?>

 
leakim971 Commented:
Use:

$ext_color = $RSSitem->{"g:color"};

 
movieprodw (Author) Commented:

I tried this:

$ext_color       = $RSSitem->{"g:color"};

It did not produce any data. When I tested by hard-coding $ext_color = 'test', that value was inserted.

Here is the actual RSS output:

<g:color>Graphite metallic</g:color>
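
A note on why the brace syntax comes back empty: as far as I know, SimpleXML property access only reaches unprefixed elements, and prefixed ones are reached through SimpleXMLElement::children(). The sketch below is a minimal illustration and not code from this thread. The inline feed, the read_g_colors() helper, and the assumption that the feed declares the g prefix (here as http://base.google.com/ns/1.0) are all made up for the example.

```php
<?php
// Hypothetical sample feed; the real feed is not shown in the thread.
// Assumes the g prefix is declared on the root element.
$sample_feed = <<<'XML'
<rss version="2.0" xmlns:g="http://base.google.com/ns/1.0">
  <channel>
    <item>
      <title>2007 BMW 328i</title>
      <g:color>Graphite metallic</g:color>
    </item>
  </channel>
</rss>
XML;

// Hypothetical helper: collect the g:color value of every item.
function read_g_colors($xml_string)
{
    $doc = simplexml_load_string($xml_string);
    $colors = array();
    foreach ($doc->channel->item as $item) {
        // children('g', true) selects child elements by namespace prefix.
        $g = $item->children('g', true);
        // Cast to string so a plain value is stored, not an object.
        $colors[] = (string) $g->color;
    }
    return $colors;
}

print_r(read_g_colors($sample_feed));
```

This leaves the feed text untouched, at the cost of one extra children() call per item.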
 
Ray Paseur Commented:
Please post your test data.  Thanks.
 
movieprodw (Author) Commented:

<?php

$db_hostname="localhost";
$db_username="h";
$db_password="F";

$private_access_key="5";

$client_id = $_GET['client_id'];

// Check a few bits and pieces

if(isset($_GET['feed_url']))
{
	$feed_url = $_GET['feed_url'];
}
else
{
	die("Need to pass the (consistent) 'feed url'");
}


if(isset($_GET['access_key']))
{

	if($_GET['access_key']==$private_access_key)
	{
		echo "Access key correct, proceeding...<br/><br/>";
	}
	else
	{
		die("wrong access key");
	}
}
else
{
	die("Need to pass the 'access_key' URL parameter");
}


try
{
	/*  query the database */
	// $db = getCon();

	$db = mysql_connect($db_hostname,$db_username,$db_password);
	if (!$db)
	{
		die("Could not connect: " . mysql_error());
	}
	mysql_select_db("highline_db", $db);

	echo "Starting to work with feed URL '" . $feed_url . "'";

	/* Parse XML from  http://www.instapaper.com/starred/rss/580483/qU7TKdkHYNmcjNJQSMH1QODLc */
	//$RSS_DOC = simpleXML_load_file('http://www.instapaper.com/starred/rss/580483/qU7TKdkHYNmcjNJQSMH1QODLc');

	libxml_use_internal_errors(true);
	$RSS_DOC = simpleXML_load_file($feed_url);
	if (!$RSS_DOC) {
		echo "Failed loading XML\n";
		foreach(libxml_get_errors() as $error) {
			echo "\t", $error->message;
		}
	}


	/* Get title, link, managing editor, and copyright from the document  */
	$rss_title = $RSS_DOC->channel->title;
	$rss_link = $RSS_DOC->channel->link;
	$rss_editor = $RSS_DOC->channel->managingEditor;
	$rss_copyright = $RSS_DOC->channel->copyright;
	$rss_date = $RSS_DOC->channel->pubDate;

	//Loop through each item in the RSS document

	foreach($RSS_DOC->channel->item as $RSSitem)
	{

		$item_id 	= $RSSitem->guid;
		$ext_color 	= $RSSitem->{"g:color"};
		$description 	= $RSSitem->description;
		$make 	= $RSSitem->{"g:make"};
		$mileage 	= $RSSitem->{"g:mileage"};
		$model 	= $RSSitem->{"g:model"};
		$price 	= $RSSitem->{"g:price"};
		$price_type 	= $RSSitem->{"g:price_type"};
		$vin 	= $RSSitem->{"g:vin"};
		$year = $RSSitem->year;
		$item_date  = date("m/d/Y", strtotime($RSSitem->pubDate));
		$item_url	= $RSSitem->link;

		echo "Processing item '" , $item_id , "' on " , $fetch_date 	, "<br/>";
		echo $item_title, " - ";
		echo $item_date, "<br/>";
		echo $item_url, "<br/>";

		// Does record already exist? Only insert if new item...

		$item_exists_sql = "SELECT item_id FROM vehicle_data_copy where item_id = '" . $item_id . "' AND member_id = $client_id";
		$item_exists = mysql_query($item_exists_sql, $db);
		if(mysql_num_rows($item_exists)<1)
		{
			echo "<font color=green>Inserting new item..</font><br/>";
			$item_insert_sql = "INSERT INTO vehicle_data_copy(item_id, feed_url, member_id, ext_color, descint, make, mileage, model, price, vin, year) VALUES ('" . $item_id . "', '" . $feed_url . "', '" . $client_id . "', '" . $ext_color . "', '" . $description . "', '" . $make . "', '" . $mileage . "', '" . $model . "', '" . $price . "', '" . $vin . "', '" . $year . "')";
			$insert_item = mysql_query($item_insert_sql, $db);
		}
		else
		{
			echo "<font color=blue>Not inserting existing item..</font><br/>";
		}

		echo "<br/>";
	}

	// End of form //
} catch (Exception $e)
{
    echo 'Caught exception: ',  $e->getMessage(), "\n";
}
?>

 
Ray Paseur Commented:
Code snippet outputs this:

Warning: simplexml_load_string() [function.simplexml-load-string]: namespace error : Namespace prefix g on color is not defined in /home/websitet/public_html/RAY_temp.php on line 5
Warning: simplexml_load_string() [function.simplexml-load-string]: <g:color>Graphite metallic</g:color> in /home/websitet/public_html/RAY_temp.php on line 5
Warning: simplexml_load_string() [function.simplexml-load-string]: ^ in /home/websitet/public_html/RAY_temp.php on line 5
object(SimpleXMLElement)#1 (1) { [0]=> string(17) "Graphite metallic" }

So it appears that in spite of the namespace warnings, the object contains the data.  Once I see what you're working with, I can probably show you a way around the issues.
<?php
error_reporting(E_ALL);

$str = '<g:color>Graphite metallic</g:color>';
$obj = SimpleXML_Load_String($str);
var_dump($obj);
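
The warnings above come from libxml. Here is a small sketch of how they could be collected rather than printed, using the same libxml_use_internal_errors() call that already appears in the scripts in this thread. The helper name collect_xml_errors() is made up for illustration.

```php
<?php
// Hypothetical helper: parse a string and return libxml's messages.
function collect_xml_errors($xml_string)
{
    // Buffer libxml errors instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    simplexml_load_string($xml_string);

    $messages = array();
    foreach (libxml_get_errors() as $error) {
        $messages[] = trim($error->message);
    }

    // Restore the previous error handling state.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $messages;
}

// Same fragment as the test above: the g prefix is never declared.
foreach (collect_xml_errors('<g:color>Graphite metallic</g:color>') as $message) {
    echo $message, "\n";
}
```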

 
movieprodw (Author) Commented:
It works on everything except the values with the g: prefix.
 
movieprodw (Author) Commented:
Ray, that is interesting. I am excited to see what you come up with for a solution. I am out of ideas on this one and greatly appreciate you looking into it.
 
Ray Paseur Commented:
No namespace issues here.  Where is some test data that illustrates the issue?
<?php // RAY_temp_movieprodw.php
error_reporting(E_ALL);
echo "<pre>";

// THE URL?
$url = 'http://www.instapaper.com/starred/rss/580483/qU7TKdkHYNmcjNJQSMH1QODLc';

$str = file_get_contents($url);
$obj = SimpleXML_Load_String($str);
var_dump($obj);

 
movieprodw (Author) Commented:
Ray, I have error reporting on and no errors show up; it just does not insert into the db. I am not sure what else I can tell you that would help.
 
Ray Paseur Commented:
Please see line 6 of the code snippet I last posted, above.  Please verify that URL.  If it is not exactly right, please post the correct URL.  Thanks.
 
movieprodw (Author) Commented:
No, the correct address is:

Hi line-au tos .com / rss2.php

Remove the spaces. Sorry, but I don't want it to get indexed.
 
Ray Paseur Commented:
Don't worry - it won't get indexed.  It does not exist.
 
movieprodw (Author) Commented:
I am so sorry, Ray. It is highline, not hiline.
 
Ray Paseur Commented:
No namespace issues there.  Names changed.

outputs:
object(SimpleXMLElement)#1 (2) {
  ["@attributes"]=>
  array(1) {
    ["version"]=>
    string(3) "2.0"
  }
  ["channel"]=>
  object(SimpleXMLElement)#2 (4) {
    ["title"]=>
    string(15) "notHighline Autos."
    ["description"]=>
    string(30) "A description of your content."
    ["link"]=>
    string(29) "http://www.notHighline-autos.com"
    ["item"]=>
    object(SimpleXMLElement)#3 (4) {
      ["title"]=>
      string(13) "2007 BMW 328i"
      ["description"]=>
      string(353) "2007 BMW 328i- Graphite metallic over black leather, 6-cylinder, automatic trans, 30,966 miles, wood grain interior trim, leather wrapped multi-function steering wheel, tire pressure monitor, push button start, rain sensor windshield, power memory seats, premium sound, CD, moonroof, dual zone climate with rear vents, alloy wheels. $19,998 Stock #74002"
      ["guid"]=>
      string(4) "4439"
      ["link"]=>
      string(48) "http://www.notHighline-autos.com/search_detail/4439"
    }
  }
}

<?php // RAY_temp_movieprodw.php
error_reporting(E_ALL);
echo "<pre>";

// THE URL?
$url = 'http://notHighline-autos.com/rss2.php';

$str = file_get_contents($url);
$obj = SimpleXML_Load_String($str);
var_dump($obj);

 
movieprodw (Author) Commented:
Yes, but if you look at the actual XML feed, there is more data than that. It is ignoring all of the g: fields, which are important to me.
 
movieprodw (Author) Commented:
 
Ray Paseur Commented:
Thanks - I see the issue now.  This may look a little "hokey" but it will work.  Instead of str_replace() you might create a REGEX if you think there is a risk that you would have "g:" somewhere in the data values.
<?php // RAY_temp_movieprodw.php
error_reporting(E_ALL);
echo "<pre>";

// THE URL?
$url = 'http://NotHighline-autos.com/rss2.php';

$str = file_get_contents($url);

// MUNG THE STRING TO MAKE ALL FIELDS USABLE
$str = str_replace('g:', 'g_', $str);

$obj = SimpleXML_Load_String($str);
var_dump($obj);
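
A sketch of the regular-expression variant mentioned above, for the case where "g:" might also occur inside data values. The pattern and the rename_g_prefix() helper are one possible approach and not code from this thread. The fragment is made up, and the pattern does not handle prefixed attributes or CDATA sections.

```php
<?php
// Hypothetical helper: rename the g: prefix in tag names only.
function rename_g_prefix($xml_string)
{
    // Matches "<g:" and "</g:" only, so "g:" inside text is left alone.
    return preg_replace('#(</?)g:#', '${1}g_', $xml_string);
}

// Made-up fragment whose text value happens to contain "g:".
$fragment = '<item><g:color>Rating: good</g:color></item>';

echo str_replace('g:', 'g_', $fragment), "\n"; // also rewrites the text value
echo rename_g_prefix($fragment), "\n";         // rewrites the tag names only
```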

 
movieprodw (Author) Commented:
Ray, that makes sense!

I tried this, but it did not work:

      $RSS_DOC = simpleXML_load_file($feed_url);
      $RSS_DOC = str_replace('g:', 'g_', $RSS_DOC);
 
Ray Paseur Commented:
You have to load the string with file_get_contents() and use the str_replace() on the string.  Then load the XML object from the string.
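
The order matters because str_replace() works on strings, while simpleXML_load_file() returns an object. Below is a minimal sketch of the three steps, with an inline string standing in for file_get_contents($feed_url) so that it runs without a network connection. The sample feed and the g_colors_from_feed() helper are assumptions for illustration.

```php
<?php
// Stand-in for: $str = file_get_contents($feed_url);
// Hypothetical feed content; the real feed is not shown in the thread.
$str = '<rss version="2.0" xmlns:g="http://base.google.com/ns/1.0"><channel>'
     . '<item><guid>4439</guid><g:color>Graphite metallic</g:color></item>'
     . '</channel></rss>';

// Hypothetical helper wrapping the three steps.
function g_colors_from_feed($raw)
{
    // 1. Rewrite the prefix while the feed is still a plain string.
    $raw = str_replace('g:', 'g_', $raw);

    // 2. Only then build the SimpleXML object from the string.
    $doc = simplexml_load_string($raw);

    // 3. The renamed elements are now ordinary properties.
    $colors = array();
    foreach ($doc->channel->item as $item) {
        $colors[] = (string) $item->g_color;
    }
    return $colors;
}

print_r(g_colors_from_feed($str));
```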
 
movieprodw (Author) Commented:
Okay, I can grasp that, but this is still not working. Can you point out what I am doing wrong?

Thank you

      $str = file_get_contents($url);
      
      $str = str_replace('g:', 'g_', $str);
      
      libxml_use_internal_errors(true);
      $RSS_DOC = SimpleXML_Load_String($str);
      if (!$RSS_DOC) {
            echo "Failed loading XML\n";
            foreach(libxml_get_errors() as $error) {
                  echo "\t", $error->message;
            }
      }
 
Ray Paseur Commented:
Please post the entire code in the code snippet.  You can obscure the URL name if you want, but please leave everything else intact. I need to be able to test it exactly the way you are running it so that I can understand what is not working.  Thanks, ~Ray
 
movieprodw (Author) Commented:
Here you go Ray.
<?php
$db_hostname="localhost";
$db_username="vv";
$db_password="vv";

$private_access_key="5";

$client_id = $_GET['client_id'];

if(isset($_GET['feed_url']))
{
	$feed_url = $_GET['feed_url'];
}
else
{
	die("Need to pass the (consistent) 'feed url'");
}


if(isset($_GET['access_key']))
{

	if($_GET['access_key']==$private_access_key)
	{
		echo "Access key correct, proceeding...<br/><br/>";
	}
	else
	{
		die("wrong access key");
	}
}
else
{
	die("Need to pass the 'access_key' URL parameter");
}

if(isset($_GET['client_id']))
{

	if($_GET['client_id'])
	{
		echo "Client ID accepted...<br/><br/>";
	}
	else
	{
		die("no client id");
	}
}
else
{
	die("Need to pass the 'access_key' URL parameter");
}

try
{
	$db = mysql_connect($db_hostname,$db_username,$db_password);
	if (!$db)
	{
		die("Could not connect: " . mysql_error());
	}
	mysql_select_db("vv", $db);

	echo "Starting to work with feed URL " . $feed_url . "<br /><br />";

	libxml_use_internal_errors(true);
	$RSS_DOC = simpleXML_load_file($feed_url);
	if (!$RSS_DOC) {
		echo "Failed loading XML\n";
		foreach(libxml_get_errors() as $error) {
			echo "\t", $error->message;
		}
	}

	$rss_title = $RSS_DOC->channel->title;
	$rss_link = $RSS_DOC->channel->link;
	$rss_editor = $RSS_DOC->channel->managingEditor;
	$rss_copyright = $RSS_DOC->channel->copyright;
	$rss_date = $RSS_DOC->channel->pubDate;

	foreach($RSS_DOC->channel->item as $RSSitem)
	{
		$item_id 	= $RSSitem->guid;
		$item_title 	= $RSSitem->title;
		$ext_color 	= $RSSitem->{"g_color"};
		$description 	= $RSSitem->description;
		$make 	= $RSSitem->{"g_make"};
		$mileage 	= $RSSitem->{"g_mileage"};
		$model 	= $RSSitem->{"g_model"};
		$price 	= $RSSitem->{"g_price"};
		$price_type 	= $RSSitem->{"g_price_type"};
		$vin 	= $RSSitem->{"g_vin"};
		$year = $RSSitem->year;
		$item_date  = date("m/d/Y");
		$item_url	= $RSSitem->link;

		echo "Processing item '" , $item_id , "' on " , $item_date 	, "<br/>";
		echo $item_title, " - ";


		$item_exists_sql = "SELECT item_id FROM vehicle_data where item_id = '" . $item_id . "' AND member_id = $client_id";
		$item_exists = mysql_query($item_exists_sql, $db);
		if(mysql_num_rows($item_exists)<1)
		{
			echo "<font color=green>Inserting new item..</font><br/>";
			$item_insert_sql = "INSERT INTO vehicle_data(item_id, feed_url, member_id, ext_color, descint, make, mileage, model, price, vin, year, status, listing_plan, date_listed) VALUES ('" . $item_id . "', '" . $feed_url . "', '" . $client_id . "', '" . $ext_color . "', '" . $description . "', '" . $make . "', '" . $mileage . "', '" . $model . "', '" . $price . "', '" . $vin . "', '" . $year . "', '0', '3', '" . $item_date . "')";
			$insert_item = mysql_query($item_insert_sql, $db);
			
			$vid2 = mysql_insert_id();
			$vid = $vid2*2+55;
			
			$sql = "UPDATE vehicle_data SET vid='$vid' where id = '$vid2' ";
			
			mysql_query($sql) 
			or die(mysql_error());  
		}
		else
		{
			echo "<font color=blue>Item Exists, Updated Price..</font><br/>";
		}

		echo "<br/>";
	}

	// End of form //
} catch (Exception $e)
{
    echo 'Caught exception: ',  $e->getMessage(), "\n";
}
?>

 
Ray Paseur Commented:
I will also need the value for this variable:
$feed_url
 
movieprodw (Author) Commented:
Perfect!
 
Ray Paseur Commented:
Thanks for the points - it's a great question. ~ray
Question has a verified solution.
