Declan Basile
asked on
Recursive XML nodes query in SQL Server
Using SQL Server 2012, I need to convert XML to a string such that, for each <PId> element at any level, the Element names of all the parent elements are concatenated in order (highest to lowest level nodes), delimited by a ".", and the value of the <PId> element is then concatenated to that string.
For example, this XML:
'<P><PId>165</PId><PId>170</PId><P98><PId>5</PId></P98><P99><P100><PId>5</PId></P100></P99></P>'
Becomes 4 records for the 4 <PId> elements:
P.165
P.170
P.P98.5
P.P99.P100.5
If I use something like
SELECT T.C.query('..')
FROM @XML.nodes('//PId') T(C)
in a CTE recursive statement, I believe query('..') gives the parent element, not a copy of the original XML with the context node set to the parent element, so I loose visibility of the parent's parent element.
How can I accomplish this? Note: It's a given that the highest level node with always be <P>, and that every node between <P> and <PId> with be <P + some number>.
For example, this XML:
'<P><PId>165</PId><PId>170</PId><P98><PId>5</PId></P98><P99><P100><PId>5</PId></P100></P99></P>'
Becomes 4 records for the 4 <PId> elements:
P.165
P.170
P.P98.5
P.P99.P100.5
If I use something like
SELECT T.C.query('..')
FROM @XML.nodes('//PId') T(C)
in a CTE recursive statement, I believe query('..') gives the parent element, not a copy of the original XML with the context node set to the parent element, so I loose visibility of the parent's parent element.
How can I accomplish this? Note: It's a given that the highest level node with always be <P>, and that every node between <P> and <PId> with be <P + some number>.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Thanks. I ended up doing something similar to what you posted. The key idea that I thought of was to write code that exhaustively creates a record for each element including all it's parent elements and then filtering out the ones that don't end in "<PId>". Instead of starting with the child elements and traversing upwards I started with the parent elements and traversed downward, filtering out unwanted records in the end. This is what it looks like (note: I added a few more things than mentioned in my original question. I wanted to keep the question as simple as possible.)
WITH XYZ (X, Y)
AS
(
SELECT T.C.query('.') As X,
CONVERT(varchar(MAX), T.C.query('if (fn:local-name(.) = "PId") then (data(.)) else fn:local-name(.)')) As Y
FROM @XML_R.nodes('/P/*') T(C)
UNION ALL
SELECT T.C.query('.') As X,
Z.Y + '.' + CONVERT(varchar(MAX), T.C.query('if (fn:local-name(.) = "PId") then (data(.)) else fn:local-name(.)')) As Y
FROM XYZ Z CROSS APPLY X.nodes('./*/*') T(C)
)
SELECT @RPFields = STUFF((SELECT ', P' + REPLACE(Y, 'P', '') FROM XYZ WHERE RIGHT(CONVERT(varchar(MAX), X), 6) = '</PId>' FOR XML PATH('')), 1, 2, '')
Cast(textfield as XML)
or you have a XML datetype and want covert this?
Cast(xmlfield as nvarchar(xx)) or any other text type...