Solved

Extracting subset of XML data using XPATH

Posted on 2007-04-11
Last Modified: 2009-12-16
Hi all,

I've been giving myself a crash course on XPath, but I'm still having some difficulty constructing some XPath expressions to best fit my needs -- assuming that it can be done.

I'm working on a project that is a database of different document types for various parts/products. The structure of the XML is...

<family ID="famID" label="">
      <group ID="grpID" label="">
            <part ID="ABCDE" type="" partNo="" description="">
                  <document type="A" label="" file="ABCDE-A.pdf"/>
                  <document type="B" label="" file=""/>
                  <document type="C" label="" file=""/>
                  <document type="D" label="" file="ABCDE-docD.pdf"/>
                  <document type="E" label="" file="installation.pdf"/>
                  <document type="F" label="" file=""/>
                  <document type="G" label="" file=""/>
            </part>
            <part ID="XYZ" type="" partNo="" description="">
                  <document type="A" label="" file="XYzee.pdf"/>
                  <document type="B" label="" file=""/>
                  <document type="C" label="" file=""/>
                  <document type="D" label="" file="ex2why.doc"/>
                  <document type="E" label="" file="installation.pdf"/>
                  <document type="F" label="" file=""/>
                  <document type="G" label="" file=""/>
            </part>
      </group>
</family>      

There is no text in the elements. All data is presented as values of attributes.

What I am trying to achieve is generating a new XML object based on specific ID criteria and file attributes of document elements not being null.

For example, if I need all existing documents (file != "") of parts that have "CD" in their ID, the output XML object should be:

<family ID="" label="">
      <group ID="" label="">
            <part ID="ABCDE" type="" partNo="" description="">
                  <document type="A" label="" file="ABCDE-A.pdf"/>
                  <document type="D" label="" file="ABCDE-docD.pdf"/>
                  <document type="E" label="" file="installation.pdf"/>
            </part>
      </group>
</family>      

I can select the part nodes that match the criteria with //part[contains(@ID,'CD')], which returns an array of the matching <part> elements along with their child nodes. I can also select documents whose file attribute is not empty with //document[@file != ''], which returns the <document> elements. But is it possible to get back the full hierarchy based on selection criteria without using XSLT?

...Rob
Question by:HairyDogDigital

Expert Comment

by:Geert Bormans
ID: 18887968
> But, is it possible to get back the full hierarchy based on selection criteria without using XSLT?

No, you can't (though you can use other programming techniques, such as the DOM).
XPath is meant for selecting (addressing) nodes.
If you need to recreate a slimmed-down version of your XML,
you need to embed your XPath in XSLT or XQuery.

cheers

Geert
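
For readers who cannot use XSLT or XQuery, the DOM-style route mentioned above can be sketched roughly as follows. This is only an illustration in Python's standard xml.etree.ElementTree, not the Flash code from this thread; the function name filter_family, the trimmed sample data, and the choice to drop parts and groups that end up empty are all assumptions. Ancestor attributes are copied as-is, whereas the asker's sample output shows them blank.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the structure from the question (sample data only).
SOURCE = """
<family ID="famID" label="">
  <group ID="grpID" label="">
    <part ID="ABCDE" type="" partNo="" description="">
      <document type="A" label="" file="ABCDE-A.pdf"/>
      <document type="B" label="" file=""/>
      <document type="D" label="" file="ABCDE-docD.pdf"/>
      <document type="E" label="" file="installation.pdf"/>
    </part>
    <part ID="XYZ" type="" partNo="" description="">
      <document type="A" label="" file="XYzee.pdf"/>
      <document type="B" label="" file=""/>
    </part>
  </group>
</family>
"""


def filter_family(family, id_fragment):
    """Build a new <family> tree with only matching parts and non-empty documents.

    Hypothetical helper: the name and the pruning rules are assumptions.
    """
    new_family = ET.Element(family.tag, dict(family.attrib))
    for group in family.findall("group"):
        new_group = ET.Element(group.tag, dict(group.attrib))
        for part in group.findall("part"):
            # Equivalent of contains(@ID, 'CD'), done in Python because
            # ElementTree's XPath subset has no contains() function.
            if id_fragment not in part.get("ID", ""):
                continue
            # Equivalent of document[@file != ''].
            docs = [d for d in part.findall("document") if d.get("file")]
            if not docs:
                continue  # assumption: skip parts with no existing documents
            new_part = ET.SubElement(new_group, part.tag, dict(part.attrib))
            for doc in docs:
                ET.SubElement(new_part, doc.tag, dict(doc.attrib))
        if len(new_group):  # assumption: skip groups left without parts
            new_family.append(new_group)
    return new_family


result = filter_family(ET.fromstring(SOURCE), "CD")
print(ET.tostring(result, encoding="unicode"))
```

The point of the sketch is that the selection logic and the rebuilding of the hierarchy are separate steps: XPath (or its equivalent) picks nodes, and the surrounding code decides which ancestors and attributes to copy into the new tree.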

Author Comment

by:HairyDogDigital
ID: 18888097
Okay, so it won't work with just XPath. Not sure if XSLT or XQuery is an option, because the XPATH implementation is in Flash. That does leave me the possible option of working through it via the DOM.

However, is it possible to select just an ELEMENT? Getting back to my example, if I pull the PART elements that I need, can I get JUST the GROUP element for a part, without having all of the child/descendant nodes in tow with it?

...Rob

Accepted Solution

by:
Geert Bormans earned 500 total points
ID: 18888116
> can I get JUST the GROUP element for a part, without having all of the child/descendant nodes in tow with it?

No, XPath still only selects the node.
I recommend checking whether XSLT really isn't an option.
It likely is available in most browsers.
Working through it with the DOM will be more of a hassle than doing it in XSLT.

cheers

Geert
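
As a rough illustration of what "just the GROUP element" could look like outside XPath: once a part has been selected, code can look up its parent and make a shallow copy that keeps the tag and attributes but none of the children. The sketch below uses Python's standard xml.etree.ElementTree rather than Flash; the variable names parent_of and group_only are made up for the example.

```python
import xml.etree.ElementTree as ET

# Minimal sample in the shape used in the question.
root = ET.fromstring(
    '<family ID="famID" label="">'
    '<group ID="grpID" label="">'
    '<part ID="ABCDE" type="" partNo="" description="">'
    '<document type="A" label="" file="ABCDE-A.pdf"/>'
    '</part>'
    '</group>'
    '</family>'
)

# ElementTree elements carry no parent pointer, so build a
# child -> parent lookup once for the whole tree.
parent_of = {child: parent for parent in root.iter() for child in parent}

# Select the part, then step up to its enclosing group.
part = root.find(".//part[@ID='ABCDE']")
group = parent_of[part]

# Shallow copy: same tag and attributes, no child nodes.
group_only = ET.Element(group.tag, dict(group.attrib))

print(ET.tostring(group_only, encoding="unicode"))
```

The original group node in the tree is left untouched and still has its children; only the copy is stripped down.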

Author Comment

by:HairyDogDigital
ID: 18888300
Bummer!

I don't know of any XSLT extensions for Flash, which is unfortunate because the "end" result is actually an XHTML document (as a string) that is loaded into an embedded text field. So XSLT would be ideal.

Since the XML has a very tight structure and does not have any text nodes (only element and attribute nodes), it's not that difficult to read parent/child/sibling node names and attribute names/values to create the XHTML.

Yes, it's more of a hassle than XSLT. And before I go the DOM route, I'm going to search further for an XSLT implementation for Flash. Since there is a fairly robust XPath and XPath 2 implementation, an XSLT implementation should also exist.

Thanks for the quick response. At least I know that I'm not going to get what I need purely from XPath.

...Rob

Expert Comment

by:Geert Bormans
ID: 18888351
I wasn't aware there was an XPath 2 implementation.
If there is, I would be very surprised if there weren't an XSLT 1 implementation as well.

cheers


Author Comment

by:HairyDogDigital
ID: 18891648
This is a "good news" / "bad news" thing.

First, the XPath implementation for Flash is a third-party set of classes. The good news is that it exists, though I might have read a specification incorrectly as to whether it is XPath or XPath 2.
 
The bad news: there is no XSLT for Flash, at least not yet.

Fortunately, the amount of data I am dealing with does not cause a remarkable performance hit when traversing parent, grandparent, and child nodes to generate the desired output.

Again, thanks for the quick response. You saved me hours of plunking around on Google!

...Rob
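
The traversal described above (walking from family to group to part to document and emitting markup as a string) might look something like the following. This is a hedged sketch in Python's standard library, not the asker's Flash code; the function name to_xhtml and the heading/list markup are invented for illustration, since the thread does not show the actual XHTML.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# Trimmed sample in the shape used in the question.
SOURCE = """
<family ID="famID" label="">
  <group ID="grpID" label="">
    <part ID="ABCDE" type="" partNo="" description="">
      <document type="A" label="" file="ABCDE-A.pdf"/>
      <document type="B" label="" file=""/>
    </part>
    <part ID="XYZ" type="" partNo="" description="">
      <document type="A" label="" file="XYzee.pdf"/>
    </part>
  </group>
</family>
"""


def to_xhtml(family, id_fragment):
    """Walk family -> group -> part -> document and return an XHTML fragment.

    Hypothetical helper: the output markup is an assumption.
    """
    lines = []
    for group in family.findall("group"):
        for part in group.findall("part"):
            if id_fragment not in part.get("ID", ""):
                continue
            docs = [d for d in part.findall("document") if d.get("file")]
            if not docs:
                continue
            # Ancestor data comes from the loop variables, so no parent
            # lookup is needed when walking top-down.
            heading = f'{family.get("ID")} / {group.get("ID")} / {part.get("ID")}'
            lines.append(f"<h3>{escape(heading)}</h3>")
            lines.append("<ul>")
            for doc in docs:
                href = quoteattr(doc.get("file"))  # adds the surrounding quotes
                label = escape(doc.get("type", ""))
                lines.append(f"<li><a href={href}>{label}</a></li>")
            lines.append("</ul>")
    return "\n".join(lines)


html = to_xhtml(ET.fromstring(SOURCE), "CD")
print(html)
```

Walking top-down keeps the parent and grandparent in scope at every step, which avoids repeated upward lookups from each selected document.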

Expert Comment

by:Geert Bormans
ID: 18892623
welcome
sorry for the bad news

Geert