XML Parsing in MySQL

Posted on 2010-08-06
Last Modified: 2012-05-10

We are planning for a database migration from MS SQL Server to MySQL. Most of our procedures in SQL Server are working with XML data, but we are issues where trying to convert these procedures to MySQL.

Below is the sample XML file. We are sending XML data (not the XML file) directly as a input to the procedure.

With the XML we want the data in the following tabular format in MySQL

Date | Attendance | Personal_Id


Attendance | Personal_Id

Can any experts help us to do this in MySQL.

I tried with ExtractValue but with no help. May be I am not using it properly.

Versions of MySQL we tried for the above: 5.1 & 5.5

Thanks in Advance.
Question by:Jagdish Devaku
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 7
  • 6
  • 6
  • +1
LVL 14

Expert Comment

ID: 33374623
LVL 14

Expert Comment

ID: 33374656
Another option is to use the perl script as follow
LVL 14

Expert Comment

ID: 33374664
Also refer the PHP scripting to do the same
<?php require_once ('connect.php'); 
$xmlReader = new XMLReader();
$filename = "getmein.xml";
$url = "";
file_put_contents($filename, file_get_contents($url));
while ($xmlReader->read()) 
switch ($xmlReader->name)
case 'product':
$dom = new DOMDocument();
$domNode = $xmlReader->expand();
$element = $dom->appendChild(domNode);
$domstring = utf8_encode($dom->saveXML($element));
$product = new SimpleXMLElement($domString);
$product_code = $product->product_code;
$product_name = $product->product_name;
$level1 = $product->level1;
$level2 = $product->level2;
$description = $product->description;
$buyat_short_deeplink_url = $product->buyat_short_deeplink_url;
$image_url = $product->image_url;
$currency = $product->currency;
$price = $product->online_price;
$city = $product->city;
$country = $product->country;
$event_date = $product->event_date;
$event_venue = $product->event_venue;
$number_available = $product->number_available;
if (strlen($product_code)> 0)
$query = mysql_query("REPLACE INTO icanget_tickets
(product_code, product_name, level1, level2, description, buyat_short_deeplink_url, image_url, currency, price, country, event_date, event_venue, number_available)
VALUES ('$product_code', '$product_name', '$level1', '$level2', '$description', '$buyat_short_deeplink_url', '$image_url', '$currency', '$price', '$country', '$event_date', '$event_venue', '$number_available')");
echo $product_name . "has been inserted </br>";

Open in new window

The Orion Papers

Are you interested in becoming an AWS Certified Solutions Architect?

Discover a new interactive way of training for the exam.

LVL 14

Author Comment

by:Jagdish Devaku
ID: 33374802
Thanks for the responses.

I am not looking for PHP or Perl scripting.

I need pure MySQL script to do this.

@kalmax: I have already gone through that link before, but i could not get any solution...

Any more helps...
LVL 14

Expert Comment

ID: 33374949
*MySQL XML import test drive*
Here is a quick test drive for the load xml command, just to show you how it

*Creating the test tables*

Start by creating the test table with the following command:

create table test.person(
person_id int not null,
fname varchar(40) null,
lname varchar(40) null,
created timestamp,
primary key(person_id));

Now we are ready for some testing!

*First test: data as attributes*

For this test we use file person.xml, with the following content:


<person person_id='1' fname="Erik" lname="Wetterberg"/>

<person person_id='2' fname="Sven" lname="Svensson"/>


To read this into MySQL, we use the following command:

mysql> load xml local infile 'person.xml' into table test.person

-> rows identified by '<person>';

Query OK, 2 rows affected (0.08 sec)

Records: 2 Deleted: 0 Skipped: 0 Warnings: 0

Check the result with the following command:

mysql> select * from test.person;


| person_id | fname | lname | created |


| 1 | Erik | Wetterberg | 2006-12-05 18:17:22 |

| 2 | Sven | Svensson | 2006-12-05 18:17:22 |


2 rows in set (0.00 sec)

LVL 14

Author Comment

by:Jagdish Devaku
ID: 33375061

Thanks Kalmax.

Actually I am not looking for Load XML... as we will be directly sending the xml string from the application and i specified in the question itself.

Anymore helps....
LVL 14

Expert Comment

ID: 33375113
not sure ...about any mysql function that would load the xml link straight to mysql database...

I suppose you would need Load function with mysql or add perl or php script to do this...


LVL 14

Author Comment

by:Jagdish Devaku
ID: 33388187
If not XML...

Is there any other way where I will be sending multiple rows of data to MySQL and its get processed in MySQL.

My application is developed in .NET.

Any helps...
LVL 45

Expert Comment

ID: 33415447
What about using the Extracvalue() function?

LVL 14

Author Comment

by:Jagdish Devaku
ID: 33416945
Hi Aikimark,

We are unable to get multiple rows using extractvalue(). So extractvalue() is not an option for my requirement. I already mentioned this in my question....

LVL 45

Expert Comment

ID: 33418917
1. part of the problem might be that you have specified UTF-16 encoding, but it lacks a unicode byte order mark (BOM).

>>We are unable to get multiple rows...
2. If you look at the linked page, you will see examples of iterating through the XML.
LVL 14

Author Comment

by:Jagdish Devaku
ID: 33419055
I know this can be done through LoadXML - but as per our scenario we can't use loadxml as we will lot of simultaneous transactions...

The solution I am looking for is not there in the linked page.

LVL 45

Expert Comment

ID: 33419095
the linked page is about using the Extractvalue() function, not LoadXML.
LVL 45

Expert Comment

ID: 33419245
Oh.  I see, your 'data' are actually attributes.
LVL 45

Accepted Solution

aikimark earned 200 total points
ID: 33419546
I have not tested the following on my own database, so you might encounter a syntax error.
SET At_Date = ExtractValue(xml_parm, '//Attendance_Info/@Date')

SET nrows = ExtractValue(xml_parm, 

SET rownum = 1

While rownum <= nrows
  SET At_Attendance = ExtractValue(xml_parm, '//Attendance_Info/Attendance[$rownum]/@Attendance')
  SET At_PersonalId = ExtractValue(xml_parm, '//Attendance_Info/Attendance[$rownum]/@PersonalId')
  SET rownum = rownum + 1
End While

Open in new window

LVL 60

Expert Comment

by:Kevin Cross
ID: 33438140
Aside from a minor type-o, 'count('count(//Attendance_Info/Attendance)', I believe the approach above by aikimark should work perfectly fine. I now have my MySQL instance reinstalled so I can test this through for you ... and mock up how it would look in stored procedure.
LVL 60

Expert Comment

by:Kevin Cross
ID: 33438160
Although, I think ExtractValue needs to have a static XPath expression, so potentially passing the $rownum will not work but will work that all out when I test.
LVL 60

Assisted Solution

by:Kevin Cross
Kevin Cross earned 300 total points
ID: 33438440

As I originally thought, aikimark's approach is a good one. The limitation I had previously encountered was with me trying to feed in a column value to use as the index for the XPath expression and you will get an error indicating 'only constant expressions are allowed' but you can get around that with user defined variables and as such this works well in stored procedures using $ like shown by aikimark.

To test, I created the following structure:

drop table if exists attendance;
create table `attendance` (
  `att_date` datetime,
  `att_attend_id` int(11),
  `att_person_id` int(11),
  `att_serial_no` int(11),
  `att_id` bigint(20) not null auto_increment,
  primary key (`att_id`)
) engine=InnoDB auto_increment=1 default charset=latin1;

Therefore, you will have to adjust to fit your own table / column information and as such may need to change some of my defaulting assumptions -- but think you will want to add some traps like that for when the value is either not present in the XML or is an empty string.

As you can see this uses aikimark's approach by grabbing the date first since that is common to all Attendance nodes. You could do the same for Flag.

As you can see with all the fields, I am using nullif(..., '') to turn empty string to null. You can insert in the table as null or use coalesce to default as shown.

Next step in Mark's approach is to figure out the total count of rows and set a variable nrows.

Because I wanted to be different I showed by decrementing nrows so you didn't have to create another variable to use as the index. However, note that the records will be imported backwards in this fashion and so if you have an auto_increment field like I showed in my example create table DDL above, you will notice that the last row has the lower auto_increment id. If this is a problem, then you should use the second variable like rowno and (rowno <= nrows) as aikimark had above. The point though is you can easily use the nrows in a while loop to in your case insert the rows to the table.

note: size of xml parameter needs to be set appropriate to your maximum xml string to be received.

To test:
set @xml = '<?xml version="1.0" encoding="UTF-16"?>';

call InsertAttendanceXmlProc( @xml );

Results in:
'2010-07-28 00:00:00', '2', '34170', '16', '1'
'2010-07-28 00:00:00', '2', '34086', '15', '2'
'2010-07-28 00:00:00', '2', '33734', '14', '3'
'2010-07-28 00:00:00', '2', '33652', '13', '4'
'2010-07-28 00:00:00', '2', '33651', '12', '5'
'2010-07-28 00:00:00', '2', '33377', '11', '6'
'2010-07-28 00:00:00', '2', '33314', '10', '7'
'2010-07-28 00:00:00', '2', '33313', '9', '8'
'2010-07-28 00:00:00', '2', '33312', '8', '9'
'2010-07-28 00:00:00', '2', '33311', '7', '10'
'2010-07-28 00:00:00', '2', '33310', '6', '11'
'2010-07-28 00:00:00', '2', '33309', '5', '12'
'2010-07-28 00:00:00', '2', '33308', '4', '13'
'2010-07-28 00:00:00', '2', '33307', '3', '14'
'2010-07-28 00:00:00', '0', '33306', '2', '15'
'2010-07-28 00:00:00', '2', '33305', '1', '16'
'2010-07-28 00:00:00', '2', '33304', '0', '17'

Please credit aikimark on the idea. I was just intrigued and wanted to help in the testing.

Hope that helps.


delimiter //

drop procedure if exists InsertAttendanceXmlProc;
create procedure InsertAttendanceXmlProc(in xml varchar(8000))
   declare dt datetime;
   declare nrows int;
   -- to be safe, explicitly convert date from xml to datetime - expects mm/dd/yyyy hh:mi:ss 
   set dt = str_to_date(coalesce(nullif(extractvalue(xml, '//Attendance_Info/@Date'), ''), now()), '%m/%d/%Y %H:%i:%s');

   set nrows = extractvalue(xml, 'count(//Attendance_Info/Attendance)');
   -- decrement nrows to save on creating another variable
   while nrows > 0 do
      insert into attendance(att_date, att_attend_id, att_person_id, att_serial_no)
      select dt
           , coalesce(nullif(extractvalue(xml, '//Attendance_Info/Attendance[$nrows]/@Attendance'), ''),0)
           , coalesce(nullif(extractvalue(xml, '//Attendance_Info/Attendance[$nrows]/@PersonalId'), ''),0)
           , coalesce(nullif(extractvalue(xml, '//Attendance_Info/Attendance[$nrows]/@SerialNumber'), ''),0);
      set nrows = nrows - 1;
   end while;


delimiter ;

Open in new window

LVL 60

Assisted Solution

by:Kevin Cross
Kevin Cross earned 300 total points
ID: 33439193

By the way, you can use XPath short cuts with some of this if your structure is always as shown while giving you a little freedom to say alter the root and element names -- for example :

'//@Date' or '/*/@Date'
'//child::*[1]/@PersonalId' or '/*/child::*[1]/@PersonalId'

Would still match properly to your xml attributes even if its appearance changes slightly.
set @xml = 
'<?xml version="1.0" encoding="UTF-16"?>
<Attendance Date="07/28/2010 00:00:00" Flag="1">
    <Info Attendance="2" PersonalId="33304" SerialNumber="0"/>
    <Info Attendance="2" PersonalId="33305" SerialNumber="1"/>
    <Info Attendance="0" PersonalId="33306" SerialNumber="2"/>
    <Info Attendance="2" PersonalId="33307" SerialNumber="3"/>
    <Info Attendance="2" PersonalId="33308" SerialNumber="4"/>
    <Info Attendance="2" PersonalId="33309" SerialNumber="5"/>
    <Info Attendance="2" PersonalId="33310" SerialNumber="6"/>
    <Info Attendance="2" PersonalId="33311" SerialNumber="7"/>
    <Info Attendance="2" PersonalId="33312" SerialNumber="8"/>
    <Info Attendance="2" PersonalId="33313" SerialNumber="9"/>
    <Info Attendance="2" PersonalId="33314" SerialNumber="10"/>
    <Info Attendance="2" PersonalId="33377" SerialNumber="11"/>
    <Info Attendance="2" PersonalId="33651" SerialNumber="12"/>
    <Info Attendance="2" PersonalId="33652" SerialNumber="13"/>
    <Info Attendance="2" PersonalId="33734" SerialNumber="14"/>
    <Info Attendance="2" PersonalId="34086" SerialNumber="15"/>
    <Info Attendance="2" PersonalId="34170" SerialNumber="16"/>
select extractvalue(@xml, '//child::*[1]/@Attendance');

Open in new window

LVL 14

Author Comment

by:Jagdish Devaku
ID: 33443332
Thanks all for your support... I have started on this... i will let you the results...

Thanks mwvisa1 for your detailed info...
LVL 60

Expert Comment

by:Kevin Cross
ID: 33443347
You are welcome. Let us know how this works out for you. I have started an article on this to help others and show a neat trick to doing this from a single select statement that can be used a standalone query or as the main statement in stored procedure. I hope to be done shortly and so will post a link here ...
LVL 60

Expert Comment

by:Kevin Cross
ID: 33443729
Finished article (at least an intro, playing with some advanced topics on the subject):

Hopefully that is useful to you in addition to future readers.
LVL 45

Expert Comment

ID: 33445385
Great article, mwvisa1.  Quickly written, too.
LVL 14

Author Closing Comment

by:Jagdish Devaku
ID: 33452656
Superb. I really appreciate the efforts by mwvisa1, aikimark & kalmax. Thanks to all.

Thanks for the excellent article by mwvisa1.

Featured Post

More Than Just A Video Library

Train for your certification. Learn the latest DevOps tools. Grow your skillset to do better work.

At Linux Academy, we release new training modules every week so you'll always be up to date on the latest tech.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

When table data gets too large to manage or queries take too long to execute the solution is often to buy bigger hardware or assign more CPUs and memory resources to the machine to solve the problem. However, the best, cheapest and most effective so…
This article shows the steps required to install WordPress on Azure. Web Apps, Mobile Apps, API Apps, or Functions, in Azure all these run in an App Service plan. WordPress is no exception and requires an App Service Plan and Database to install
Via a live example, show how to extract insert data into a SQL Server database table using the Import/Export option and Bulk Insert.
Via a live example, show how to shrink a transaction log file down to a reasonable size.

688 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question