[Okta Webinar] Learn how to a build a cloud-first strategyRegister Now

x
?
Solved

parse xml feed into mysql database using php

Posted on 2009-04-25
19
Medium Priority
?
1,706 Views
Last Modified: 2012-05-06
I have been trying to parse a xml product feed into mysql database using php sacript. I had encountered error messages at first but now I just get a blank screen when i execute the script and nothing writes into the database...The xml file was initially 92mb so the script was timing out after 30 seconds. I have now made it smaller but it still won't write to the database
<?php require_once ('connect.php'); 
db_connect();
 
$xmlReader = new XMLReader();
 
$filename = "getmein.xml";
$url = "http://www.icanget.co.uk/getmein.xml";
file_put_contents($filename, file_get_contents($url));
 
$xmlReader->open($filename); 
 
while ($xmlReader->read()) 
{
}
switch ($xmlReader->name)
{
case 'product':
 
$dom = new DOMDocument();
$domNode = $xmlReader->expand();
$element = $dom->appendChild(domNode);
$domstring = utf8_encode($dom->saveXML($element));
$product = new SimpleXMLElement($domString);
 
$product_code = $product->product_code;
$product_name = $product->product_name;
$level1 = $product->level1;
$level2 = $product->level2;
$description = $product->description;
$buyat_short_deeplink_url = $product->buyat_short_deeplink_url;
 
$image_url = $product->image_url;
$currency = $product->currency;
$price = $product->online_price;
$city = $product->city;
$country = $product->country;
$event_date = $product->event_date;
$event_venue = $product->event_venue;
$number_available = $product->number_available;
 
if (strlen($product_code)> 0)
{
$query = mysql_query("REPLACE INTO icanget_tickets
(product_code, product_name, level1, level2, description, buyat_short_deeplink_url, image_url, currency, price, country, event_date, event_venue, number_available)
 
VALUES ('$product_code', '$product_name', '$level1', '$level2', '$description', '$buyat_short_deeplink_url', '$image_url', '$currency', '$price', '$country', '$event_date', '$event_venue', '$number_available')");
 
echo $product_name . "has been inserted </br>";
}
break;
 
}
 
?>

Open in new window

0
Comment
Question by:rafique12
  • 10
  • 9
19 Comments
 
LVL 39

Expert Comment

by:Roger Baklund
ID: 24233945
There are a couple of bugs here:

Line 14, this } should not be here, it should be at the end of the script.

Line 21, change domNode into $domNode.

Line 23, change $domString into $domstring (or change it in line 22, variable names are case sensitive).

Add error checking at line 47:

if(!$query) die(mysql_error());
0
 

Author Comment

by:rafique12
ID: 24234985
I have made the changes to the script but i still get a blank screen and the data doesn't write into the database.
<?php require_once ('connect.php'); 
db_connect();
 
$xmlReader = new XMLReader();
 
$filename = "getmein.xml";
$url = "http://www.icanget.co.uk/getmein.xml";
file_put_contents($filename, file_get_contents($url));
 
$xmlReader->open($filename); 
 
while ($xmlReader->read()) 
{
 
switch ($xmlReader->name)
{
case 'product':
 
$dom = new DOMDocument();
$domNode = $xmlReader->expand();
$element = $dom->appendChild($domNode);
$domString = utf8_encode($dom->saveXML($element));
$product = new SimpleXMLElement($domString);
 
$product_code = $product->product_code;
$product_name = $product->product_name;
$level1 = $product->level1;
$level2 = $product->level2;
$description = $product->description;
$buyat_short_deeplink_url = $product->buyat_short_deeplink_url;
 
$image_url = $product->image_url;
$currency = $product->currency;
$price = $product->online_price;
$city = $product->city;
$country = $product->country;
$event_date = $product->event_date;
$event_venue = $product->event_venue;
$number_available = $product->number_available;
 
if (strlen($product_code)> 0)
{
$query = mysql_query("REPLACE INTO icanget_tickets
(product_code, product_name, level1, level2, description, buyat_short_deeplink_url, image_url, currency, price, country, event_date, event_venue, number_available)
 
VALUES ('$product_code', '$product_name', '$level1', '$level2', '$description', '$buyat_short_deeplink_url', '$image_url', '$currency', '$price', '$country', '$event_date', '$event_venue', '$number_available')");
if(!$query) die(mysql_error()); 
echo $product_name . "has been inserted </br>";
}
break;
 
}
 
}
 
?>

Open in new window

0
 

Author Comment

by:rafique12
ID: 24234998
I am actually getting error messages now
Warning: XMLReader::read() [xmlreader.read]: /home/icanget/public_html/getmein.xml:91: parser error : EntityRef: expecting ';' in /home/icanget/public_html/getmein.php on line 12
 
Warning: XMLReader::read() [xmlreader.read]: <td>http://ticketsales.at/rafique?CTY=9&DURL=http://www.getmein.com/ti in /home/icanget/public_html/getmein.php on line 12
 
Warning: XMLReader::read() [xmlreader.read]: ^ in /home/icanget/public_html/getmein.php on line 12
 
Warning: XMLReader::read() [xmlreader.read]: An Error Occured while reading in /home/icanget/public_html/getmein.php on line 12

Open in new window

0
Microsoft Certification Exam 74-409

Veeam® is happy to provide the Microsoft community with a study guide prepared by MVP and MCT, Orin Thomas. This guide will take you through each of the exam objectives, helping you to prepare for and pass the examination.

 
LVL 39

Expert Comment

by:Roger Baklund
ID: 24236375
It is because the XML is malformed. The & character is not allowed, it should be written as &amp; like this: "rafique?CTY=9&amp;DURL="
0
 
LVL 39

Expert Comment

by:Roger Baklund
ID: 24236384
Actually, this url: "http://www.icanget.co.uk/getmein.xml" is currently outputing a html page. It did output XML yesterday, I tested it.
0
 

Author Comment

by:rafique12
ID: 24236650
Okay this is getting more and more confusing...I have a php script and an xml page... I am trying to use my php script to send data into mysql from an xml feed I have altered the url in my xml page and i get an error message when i execute that page. I also get the above error messages when i execute the php script and above all nothing gets wtitten in to mysql. PLEAS I JUST NEED TO KNOW WHAT IS WRONG SO I CAN FIX IT :-(
error on line 91 at column 55: EntityRef: expecting ';'
there is no line 91

Open in new window

0
 
LVL 39

Expert Comment

by:Roger Baklund
ID: 24236686
Where is the XML? It is not at "http://www.icanget.co.uk/getmein.xml" anymore.

This is line 91 of the above url:

<td>http://ticketsales.at/rafique?CTY=9&DURL=http://www.getmein.com/tickets/duran-duran-tickets/edinburgh-123582.html?cid=gmibuy</td>

The error means that the & character is illegal. You can not have a & character in xml, it must be written as "&amp;". This is known as a character entity reference. The parser did not find a closing ; character after &DURL. It must be written as "&amp;DURL".

http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references
0
 

Author Comment

by:rafique12
ID: 24236734
There is the code from the xml file. I have in fact changed the & to &amp; but still getting errors
<?xml version="1.0"?>
<product>
<product_code>602361</product_code>
<product_name>2 Many DJs in London, United Kingdom</product_name>
<level1>Dance and Electronic</level1>
<level2>2 Many DJs</level2>
<description>Formed: 1995 in Ghent, Belgium Style: Electro-House / Mesh-up Check out: ?As Heard on Radio Soulwax Pt. 2? 2 Many DJ&#039;s...</description>
<buyat_short_deeplink_url>http://ticketsales.at/rafique/602361.html</buyat_short_deeplink_url>
<old_style_deeplink_url>http://ticketsales.at/rafique?CTY=9&amp;DURL=http://www.getmein.com/tickets/2-many-djs-tickets/london-112680.html?cid=gmibuy</old_style_deeplink_url>
<image_url>http://www.getmein.com/DynamicImages/affiliates/images/category_6.jpg</image_url>
<currency>GBP</currency>
<online_price>108.90</online_price>
<city>London</city>
<country>United Kingdom</country>
<event_date>2009-06-13</event_date><event_time>21:00:00</event_time>
<event_venue>O2 Academy Brixton</event_venue>
<number_available>6</number_available>
</product>

Open in new window

0
 
LVL 39

Expert Comment

by:Roger Baklund
ID: 24236794
Please show the errors you get with this file.
0
 

Author Comment

by:rafique12
ID: 24236927
Here's the error for the getmein.xml file let me know if you wish to view the error message for the getmein.php file
This page contains the following errors:
 
error on line 13 at column 15: Entity 'nbsp' not defined
error on line 25 at column 17: Entity 'copy' not defined
error on line 28 at column 13: Entity 'nbsp' not defined
Below is a rendering of the page up to the first error.
 
buy.at feeds The following problems were encountered during your request. PX is a required parameter DISPLAYFORMAT is a required parameter DISPLAYFORMAT must be one of: XMLGZIP XML CSVGZIP CSV PIPEGZIP PIPE SCSVGZIP SCSV SIMPLECSV HEAD SAMPLE 2009 Perfiliate Technologies Limited

Open in new window

0
 
LVL 39

Expert Comment

by:Roger Baklund
ID: 24237075
The errors means the xml contains "&nbsp;" and "&copy;". These are html character entity references, they are not valid in XML. These are not found in the XML you showed above. Does your code work with the xml you showed above?
0
 

Author Comment

by:rafique12
ID: 24237117
Nothjng is working. I am about to start entering the data manually. It's 92mb in size so I'll be here until the Olympics start...I have no idea what i a doing wrong
0
 
LVL 39

Expert Comment

by:Roger Baklund
ID: 24237169
:)

It works for me. I copied your script and your XML snippet from above, and produced this sql:
REPLACE INTO icanget_tickets (product_code, product_name, level1, level2, description, buyat_short_deeplink_url, image_url, currency, price, country, event_date, event_venue, number_available) VALUES ('602361', '2 Many DJs in London, United Kingdom', 'Dance and Electronic', '2 Many DJs', 'Formed: 1995 in Ghent, Belgium Style: Electro-House / Mesh-up Check out: ?As Heard on Radio Soulwax Pt. 2? 2 Many DJ's...', 'http://ticketsales.at/rafique/602361.html', 'http://www.getmein.com/DynamicImages/affiliates/images/category_6.jpg', 'GBP', '108.90', 'United Kingdom', '2009-06-13', 'O2 Academy Brixton', '6')

Open in new window

0
 
LVL 39

Expert Comment

by:Roger Baklund
ID: 24237178
This is the code I used, it is unchanged, except the xml is read from the file 'Product.xml' and the SQL is echoed instead of executed on the server:
<?php 
#require_once ('connect.php'); 
#db_connect();
 
$xmlReader = new XMLReader();
 
$filename = "Product.xml";
#$url = "http://www.icanget.co.uk/getmein.xml";
#file_put_contents($filename, file_get_contents($url));
 
$xmlReader->open($filename); 
 
while ($xmlReader->read()) 
{
 
switch ($xmlReader->name)
{
case 'product':
 
$dom = new DOMDocument();
$domNode = $xmlReader->expand();
$element = $dom->appendChild($domNode);
$domString = utf8_encode($dom->saveXML($element));
$product = new SimpleXMLElement($domString);
 
$product_code = $product->product_code;
$product_name = $product->product_name;
$level1 = $product->level1;
$level2 = $product->level2;
$description = $product->description;
$buyat_short_deeplink_url = $product->buyat_short_deeplink_url;
 
$image_url = $product->image_url;
$currency = $product->currency;
$price = $product->online_price;
$city = $product->city;
$country = $product->country;
$event_date = $product->event_date;
$event_venue = $product->event_venue;
$number_available = $product->number_available;
 
if (strlen($product_code)> 0)
{
echo "REPLACE INTO icanget_tickets
(product_code, product_name, level1, level2, description, buyat_short_deeplink_url, image_url, currency, price, country, event_date, event_venue, number_available)
 
VALUES ('$product_code', '$product_name', '$level1', '$level2', '$description', '$buyat_short_deeplink_url', '$image_url', '$currency', '$price', '$country', '$event_date', '$event_venue', '$number_available')";
#if(!$query) die(mysql_error()); 
echo $product_name . "has been inserted </br>";
}
break;
 
}
 
}
 
?>

Open in new window

0
 

Author Comment

by:rafique12
ID: 24237238
When this script is executed i need it to write the products in to mysql. I have a script which will then echo the contents onto a webpage. The problem i am having is getting this script to send the product feed to mysql. I must point out to you the the entire product feed is 92mb...The file getmein.xml is a bit sized snippet of the product feed
0
 
LVL 39

Expert Comment

by:Roger Baklund
ID: 24237747
Your php script sems to be working, at least it does produce a REPLACE INTO sql statement. Connecting to the database and executing the statement is trivial.

The xml feed seems to contain errors. This is a different issue. The xml must be correct or the php XMLReader will not be able to parse it.

Do you have control of the production of the xml feed? Can you post an url to the xml feed?
0
 

Author Comment

by:rafique12
ID: 24239655
I don't have control over the production of the xml feed i simple download it from the provider. I have attached the url for you to examine.
https://feeds.perfb.com/index.php/download?OEMAIL=danielrafique@vyceversa.co.uk&PX=d0310005b5e00cc254c6bc3c1fe075f0&DISPLAYFORMAT=HEAD&REVERSEMAPXML=yes&PRODUCTDB_ID=357

Open in new window

0
 
LVL 39

Accepted Solution

by:
Roger Baklund earned 2000 total points
ID: 24240104
This url delivers a html4 page, not an xml feed.
0
 

Author Comment

by:rafique12
ID: 24277930
Okay I am sure this is all correct now. The Xml file is Get_Me_In.xml but I know get a Error 320 (net::ERR_INVALID_RESPONSE): Unknown error. when the script is executed
0

Featured Post

Free Tool: IP Lookup

Get more info about an IP address or domain name, such as organization, abuse contacts and geolocation.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Originally, this post was published on Monitis Blog, you can check it here . In business circles, we sometimes hear that today is the “age of the customer.” And so it is. Thanks to the enormous advances over the past few years in consumer techno…
The title says it all. Writing any type of PHP Application or API code that provides high throughput, while under a heavy load, seems to be an arcane art form (Black Magic). This article aims to provide some general guidelines for producing this typ…
The viewer will learn how to dynamically set the form action using jQuery.
In this video, Percona Solution Engineer Dimitri Vanoverbeke discusses why you want to use at least three nodes in a database cluster. To discuss how Percona Consulting can help with your design and architecture needs for your database and infras…
Suggested Courses
Course of the Month20 days, 3 hours left to enroll

873 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question