Let's say I have a list of five restaurants with two fields, name and cuisine:
I want to make the results look like this: .
CREATE TABLE `restaurants` (
`id` INT(11) NOT NULL AUTO_INCREMENT,
`name` VARCHAR(45) DEFAULT NULL,
`cuisine` VARCHAR(100) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
(sample data -- insert statement)
INSERT INTO `restaurants`(`name`, `cuisine`)
SELECT 'restaurant 1', 'pho, vietnamese, canadian'
UNION SELECT 'restaurant 2', 'asian, pho, vietnamese, sushi'
UNION SELECT 'restaurant 3', 'asian, pho, vietnamese, sushi'
UNION SELECT 'restaurant 4', 'asian, vietnamese'
UNION SELECT 'restaurant 5', 'vietnamese, spanish, mexican, canadian'
;
With a quick selection of the above data, you can verify that your input looks like Figure 1.1.
SELECT `name`, `cuisine` FROM `restaurants`;
In addition, we will take advantage of a having a numbers utility table in our database, so run this:
CONCAT('<cuisine>', REPLACE(cuisine, ', ', '</cuisine><cuisine>'), '</cuisine>')
(resulting in something like this for example)
'<cuisine>pho</cuisine><cuisine>vietnamese</cuisine><cuisine>canadian</cuisine>'
'<r><c>vietnamese</c><c>spanish</c><c>mexican</c><c>canadian</c></r>'
SELECT `name`
, EXTRACTVALUE(xml, '//c[1]/text()') AS `cuisine1`
, EXTRACTVALUE(xml, '//c[2]/text()') AS `cuisine2`
, EXTRACTVALUE(xml, '//c[3]/text()') AS `cuisine3`
, EXTRACTVALUE(xml, '//c[4]/text()') AS `cuisine4`
, EXTRACTVALUE(xml, '//c[5]/text()') AS `cuisine5`
FROM (
SELECT *,
CONCAT('<r><c>', REPLACE(cuisine, ', ', '</c><c>'), '</c></r>') AS xml
FROM restaurants
) AS tbl
;
(results)
EXTRACTVALUE(xml_frag, xpath_expr)
-- create view to lessen repetition of xml concatenation
CREATE OR REPLACE VIEW restaurants_xml
AS
SELECT *,
CONCAT('<r><c>', REPLACE(cuisine, ', ', '</c><c>'), '</c></r>') AS xml
FROM restaurants
;
-- generate rows using xml parsing
SELECT *
FROM (
SELECT `name`
, EXTRACTVALUE(xml, '//c[1]/text()') AS `cuisine`
FROM restaurants_xml
UNION
SELECT `name`
, EXTRACTVALUE(xml, '//c[2]/text()') AS `cuisine`
FROM restaurants_xml
UNION
SELECT `name`
, EXTRACTVALUE(xml, '//c[3]/text()') AS `cuisine`
FROM restaurants_xml
UNION
SELECT `name`
, EXTRACTVALUE(xml, '//c[4]/text()') AS `cuisine`
FROM restaurants_xml
UNION
SELECT `name`
, EXTRACTVALUE(xml, '//c[5]/text()') AS `cuisine`
FROM restaurants_xml
) derived
WHERE `cuisine` <> ''
;
SELECT EXTRACTVALUE(xml, '//c[$@rownum]/text()') AS `cuisine`
, COUNT(*) AS `count`
FROM util.numbers AS nums
INNER JOIN (
SELECT
CONCAT('<r><c>', REPLACE(cuisine, ', ', '</c><c>'), '</c></r>') AS xml
FROM restaurants
) AS tbl ON nums.n BETWEEN 1 AND EXTRACTVALUE(xml, 'count(//c)')
AND @rownum:=n
GROUP BY EXTRACTVALUE(xml, '//c[$@rownum]/text()')
ORDER BY `cuisine`
;
(results)
SELECT EXTRACTVALUE(xml, '//c[$@rownum]/text()') AS `cuisine`
, COUNT(*) AS `count`
, GROUP_CONCAT(`name` SEPARATOR ', ') AS `restaurants`
FROM util.numbers AS nums
INNER JOIN (
SELECT *,
CONCAT('<r><c>', REPLACE(cuisine, ', ', '</c><c>'), '</c></r>') AS xml
FROM restaurants
) AS tbl ON nums.n BETWEEN 1 AND EXTRACTVALUE(xml, 'count(//c)')
AND @rownum:=n
GROUP BY EXTRACTVALUE(xml, '//c[$@rownum]/text()')
ORDER BY cuisine
;
(results) [/step]
Have a question about something in this article? You can receive help directly from the article author. Sign up for a free trial to get started.
Comments (3)
Commented:
It's good to see you're getting some mileage out of the XML parsing trick. This is an excellent sequel. As usual, I've learned new things like Group_Concat()
Thank you for the https://www.experts-exchange.com/A_1536.html link. During the editing process, I asked about performance and the author ran a 150,000 item test. It performed quite well.
Author
Commented:And, good idea! I actually considered mentioning performance when I used the word efficient as in the context I was referring mainly to the efficiency gained by programmer in amount of code to type; however, that doesn't always translate well to query optimization and efficiency.
So let's look at some measures ...
Performance Metrics
From true performance perspective, of course the XML adds some overhead as a3 postulated; however, here is a look at how the methods above stack up.Parsing to columns took between 0.016 (duration) / 2.4333 (fetch) to 0.374 / 1.997 seconds operating on 150,005 rows. The low end is using the `restaurants_xml` view and the upper end the derived table.
Comparatively, the execution of union approach to generate rows and an ungrouped version of the numbers utility table based query, both using the view instead of derived tables, ran for 0.016 / 10.311 to ? / ? seconds (I stopped the UNION query after 2006.672 seconds of duration, so never got to fetch). Another test of my XML parsing trick yielded results in 21.169 / 3.822, which is slower than columns method; however, the UNION approach is a drastic slowdown, but very explainable, I guess, since we are exploding 150,000+ rows to 500,000+ in the first place then couple that with the multiple table scans.
Case and point, the scenario we have hear called for a grouping of the results by cuisine and when using the source code from section #3 that gets the cuisine and aggregate count(), the performance is 29.094 / 0.000. Not the greatest, but acceptable for me to get the answer ... Now, each environment and tolerance for lag may be different, so please keep this in mind.
As with anything, there are always consequences positive and negative to utilizing certain methodologies. I probably don't say it enough, but please always test and use EXPLAIN to determine what will work with your record set.
Thanks again, aikimark. This is what I love about Articles: the research and discussion.
Best regards,
Kevin
Author
Commented:A MySQL Tidbit: Dynamic Export To_XML Stored Procedure
The "Export" section of this new article shows how to use MySQL SQL code to generate CSV files and the overall dynamic procedure can be (or at least, should be) easily modified to export to CSV versus XML. Hope you like it ...