Link to home
Start Free TrialLog in
Avatar of dazkraz
dazkraz

asked on

For XML PATH Querying

We have a table with three columns "ID" int, "ParentID" int (foreign key to itself), "Data" XML type.

This data is contained in the table:
1,  null,  "<kitchenware><name>Kitchenware</name></kitchenware>"
2,  1,  "<product><name>Knife</name></product>"
3,  1,  "<product><name>SuperKnife</name></product>"
4,  3,  "<product><name>SuperKnife Replacement Blade</name></product>"

Open in new window


How do I being the data back using SQL Server 2008 XML to look like:
<kitchenware>
  <name>Kitchenware</name>
  <product>
    <name>Knife</name>
  </product>
  <product>
    <name>SuperKnife</name>
    <product>
      <name>SuperKnife Replacement Blade</name>
    </product>
  </product>
</kitchenware>

Open in new window

Avatar of radcaesar
radcaesar
Flag of India image

Howdy

I believe the following should do what you are looking for.

Hope this helps
James
-- We will use this function to recurse down our list
Create FUNCTION ProductList(@parentID int)
RETURNS XML
WITH RETURNS NULL ON NULL INPUT 
BEGIN RETURN 
  (SELECT 
	product.value('(//name)[1]','varchar(20)') as "name",
	CASE WHEN parentid=@parentID
		THEN dbo.ProductList(id)
     END
   FROM dbo.products WHERE parentid=@parentID
   FOR XML PATH('products'), TYPE)
END
-- this generates the xml as per your sample
select 
	product.value('(//name)[1]','varchar(20)') as "name",
	dbo.ProductList(id) 
from products
where id = 1
FOR XML PATH('kitchenware'), TYPE

Open in new window

Avatar of dazkraz
dazkraz

ASKER

James, your solution would work, but just two things:

1. The word "kitchenware" is variable (as it's XML data in the column). It can be anything but on line 20 you hardcoded it?
2. The inner elements have many (not just 'name'). Any they can also be anything, and may or may not include the 'name' element. But line 7 you hardcoded it too.
Sorry I just followed the example you specified, if the root element is variable and the elements are variable what is there that is a known quantity to manipulate the xml.

If there is a known point we could look at doing an xml node insert in a loop something utilising

product.modify('
  insert @subNode
  after(/product/name)[1]')

however as you can see we need to know the name of a node to do this. If you require further help could you please provide the full rule set for merging these xml nodes.

James
ASKER CERTIFIED SOLUTION
Avatar of cyberkiwi
cyberkiwi
Flag of New Zealand image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of dazkraz

ASKER

cyberkiwi, your method works. But when I run it on the actual table, I get the error:

The statement terminated. The maximum recursion 100 has been exhausted before statement completion.

And it takes a long time. There are around 2000 records scattered around 5 levels deep in the real data I'm running it on.

My workaround (using XSLT to do the nesting) completes it in around 1.5 seconds. The SQL approach gives that error after around 10 seconds so I think there's something I'm doing wrong there.
The statement terminated. The maximum recursion 100 has been exhausted before statement completion.

Open in new window

The long and the short of it is that SQL Server 2008 (worse in 2005) has poor support for XML.  It makes a show of it, but is not nearly useful enough for complex operations.

XQuery will probably get further than xml.modify, but there is only so much you can do.

I saw it as a challenge to do it in pure-TSQL in a single statement (aside from the helper function), but if you are using this for any volume of data on a production system, I would be recommending CLR stored procedures (again, if it has to stay on SQL Server).

What the code actually does, especially when CTE is involved is make a NEW copy of the XML with just the one modification each time through the CTE (therefore every 1 record), and keeping it in memory.  Only after the last query (I believe) "select * from cte3 where parent is null" is run does it release the memory for all those intermediate XML structures created.
And it takes a long time.
The SQL approach gives that error after around 10 seconds so I think there's something I'm doing wrong there.

Does a long time mean >10s (therefore >1.5s for XSLT), given the 2nd statement?
Or have you already added "option (maxrecursion 10000)" to the end of the query to see how long it would take if given the chance to finish?
Avatar of dazkraz

ASKER

What I did do was test both them with data of under 100 items.

The XSLT recorded time was 10ms average.
The same data using your method took 800ms average.

I might look into CLR stored procs then.

Do you think there is any performance benefit doing them in CLR stored procs (just XSLT transform) vs pulling the data to ASP.NET and doing it there?

SQL10 and IIS are on the same physical machine (different virtual machines).
Normally, I would favour doing things on the SQL Server, but at the end of the day, the same ops are being performed:
(1)collect xml from records
(2)xslt on records
(3)transfer xml via virtual network sql-> web

Now since no data is added/removed, it is the same size regardless of whether (3) deals with the final form or raw xml records.
To do it on the ASP.Net side, the order is (1), (3) as fragments, (2)

This version merges 2000 records into 2 root nodes in 1-2 seconds.  Instead of using a helper function (that only needs the query to wrap using __ID__), it is completely hard-coded to a particular table ("tmp").  Less flexible, but much faster.
FYI - for comparison using the accepted solution on this test data took 6:43.

SQL Server surprises me every now and then, sometimes in good ways. :)
if OBJECT_ID('dbo.getXmlFromRoot', 'FN') is not null
drop function dbo.getXmlFromRoot
GO
create function dbo.getXmlFromRoot(@parent int)
returns xml
begin
declare @xml xml
select @xml = convert(nvarchar(max),node) + isnull(
	(select dbo.getXmlFromRoot(id) [*]
	from tmp t
	where t.parent=tmp.id
	for xml path(''), root('Child')),'')
from tmp
where id=@parent

set @xml.modify('insert /Child/* as last into (/*)[1]')
set @xml.modify('delete /Child')
return @xml
end
GO

select dbo.getXmlFromRoot(id)
from tmp
where parent is null

Open in new window

Q-26520468.zip
Avatar of dazkraz

ASKER

cyberkiwi Thanks for your help, looks like we're using XSLT for this.