Solved

Extracting subset of XML data using XPATH

Posted on 2007-04-11
Last Modified: 2009-12-16
Hi all,

I've been giving myself a crash course on XPath, but I'm still having some difficulty constructing some XPath expressions to best fit my needs -- assuming that it can be done.

I'm working on a project that is a database of different document types for various parts/products. The structure of the XML is...

<family ID="famID" label="">
      <group ID="grpID" label="">
            <part ID="ABCDE" type="" partNo="" description="">
                  <document type="A" label="" file="ABCDE-A.pdf"/>
                  <document type="B" label="" file=""/>
                  <document type="C" label="" file=""/>
                  <document type="D" label="" file="ABCDE-docD.pdf"/>
                  <document type="E" label="" file="installation.pdf"/>
                  <document type="F" label="" file=""/>
                  <document type="G" label="" file=""/>
            </part>
            <part ID="XYZ" type="" partNo="" description="">
                  <document type="A" label="" file="XYzee.pdf"/>
                  <document type="B" label="" file=""/>
                  <document type="C" label="" file=""/>
                  <document type="D" label="" file="ex2why.doc"/>
                  <document type="E" label="" file="installation.pdf"/>
                  <document type="F" label="" file=""/>
                  <document type="G" label="" file=""/>
            </part>
      </group>
</family>      

There is no text in the elements. All data is presented as values of attributes.

What I am trying to achieve is generating a new XML object based on specific ID criteria, keeping only the document elements whose file attribute is not empty.

For example, if I need all existing documents (file != "") of parts that have "CD" in their ID, the output XML object should be:

<family ID="" label="">
      <group ID="" label="">
            <part ID="ABCDE" type="" partNo="" description="">
                  <document type="A" label="" file="ABCDE-A.pdf"/>
                  <document type="D" label="" file="ABCDE-docD.pdf"/>
                  <document type="E" label="" file="installation.pdf"/>
            </part>
      </group>
</family>      

I can select the part nodes based on criteria by using -- //part[contains(@ID,'CD')] -- which spits back an array of found <part> elements and included child nodes. And, I can select documents where the file attribute is not null by using -- //document[@file != ''] -- which spits out the <document> elements. But, is it possible to get back the full hierarchy based on selection criteria without using XSLT?

...Rob
Question by:HairyDogDigital
7 Comments
 
LVL 60

Expert Comment

by:Geert Bormans
ID: 18887968
> But, is it possible to get back the full hierarchy based on selection criteria without using XSLT?

No, you can't (well, you can use other programming techniques such as the DOM),
but XPath is meant for selection (addressing) of nodes.
If you need to recreate a slimmed-down version of your XML,
you need to embed your XPath in an XSLT or XQuery.

cheers

Geert
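
Editor's note: the DOM-style alternative mentioned above can be sketched roughly as follows. This is a minimal illustration in Python using the standard library's xml.etree.ElementTree, not the Flash classes discussed in this thread. The function name prune, the shortened SAMPLE data, and the choice to drop parts and groups that end up empty are assumptions made for the example. Unlike the desired output in the question, it keeps the family and group attributes as they are.

```python
import copy
import xml.etree.ElementTree as ET

# Shortened version of the XML from the question (assumption: same structure).
SAMPLE = """<family ID="famID" label="">
  <group ID="grpID" label="">
    <part ID="ABCDE" type="" partNo="" description="">
      <document type="A" label="" file="ABCDE-A.pdf"/>
      <document type="B" label="" file=""/>
      <document type="D" label="" file="ABCDE-docD.pdf"/>
      <document type="E" label="" file="installation.pdf"/>
    </part>
    <part ID="XYZ" type="" partNo="" description="">
      <document type="A" label="" file="XYzee.pdf"/>
      <document type="B" label="" file=""/>
    </part>
  </group>
</family>"""


def prune(family, id_fragment):
    """Return a pruned copy of <family>; the input tree is left untouched."""
    result = copy.deepcopy(family)
    for group in list(result.findall("group")):
        for part in list(group.findall("part")):
            # Drop documents whose file attribute is missing or empty.
            for doc in list(part.findall("document")):
                if not doc.get("file"):
                    part.remove(doc)
            # Drop parts that fail the ID test or have no documents left.
            if id_fragment not in part.get("ID", "") or part.find("document") is None:
                group.remove(part)
        # Drop groups that ended up with no parts (an assumption of this sketch).
        if group.find("part") is None:
            result.remove(group)
    return result


pruned = prune(ET.fromstring(SAMPLE), "CD")
print(ET.tostring(pruned, encoding="unicode"))
```

The tree is copied first and then trimmed from the bottom up, so the hierarchy above each surviving document is preserved without rebuilding it by hand.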
 
LVL 1

Author Comment

by:HairyDogDigital
ID: 18888097
Okay, so it won't work with just XPath. Not sure if XSLT or XQuery is an option, because the XPATH implementation is in Flash. That does leave me the possible option of working through it via the DOM.

However, is it possible to select just an ELEMENT? Getting back to my example, if I pull the PART elements that I need, can I get JUST the GROUP element for a part, without having all of the child/descendant nodes in tow with it?

...Rob
 
LVL 60

Accepted Solution

by:
Geert Bormans earned 2000 total points
ID: 18888116
> can I get JUST the GROUP element for a part, without having all of the child/descendant nodes in tow with it?

No, XPath is still only for selecting the node.
I recommend checking whether XSLT is an option.
It likely is in most browsers.
Working through it with the DOM will be more of a hassle than doing it in XSLT.

cheers

Geert
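
Editor's note: if only the enclosing elements are needed, without their descendants, one workaround outside XPath is to copy just the tag and attributes of each ancestor. The sketch below is in Python with xml.etree.ElementTree, which has no parent pointers, so it builds a child-to-parent map first. The function name shallow_ancestors and the shortened SAMPLE data are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Shortened version of the XML from the question (assumption: same structure).
SAMPLE = """<family ID="famID" label="">
  <group ID="grpID" label="">
    <part ID="ABCDE" type="" partNo="" description="">
      <document type="A" label="" file="ABCDE-A.pdf"/>
    </part>
    <part ID="XYZ" type="" partNo="" description="">
      <document type="A" label="" file="XYzee.pdf"/>
    </part>
  </group>
</family>"""


def shallow_ancestors(root, part_id):
    """Return childless copies of the ancestors of a part, nearest first."""
    # ElementTree elements do not know their parent, so build a lookup table.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    part = next((p for p in root.iter("part") if p.get("ID") == part_id), None)
    if part is None:
        return []
    chain = []
    node = parent_of.get(part)
    while node is not None:
        # Copy the tag and attributes only; children are deliberately left out.
        chain.append(ET.Element(node.tag, dict(node.attrib)))
        node = parent_of.get(node)
    return chain


for element in shallow_ancestors(ET.fromstring(SAMPLE), "XYZ"):
    print(ET.tostring(element, encoding="unicode"))
```

The copies are new, detached elements, so the original tree is not modified.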
LVL 1

Author Comment

by:HairyDogDigital
ID: 18888300
Bummer!

I don't know of any XSLT extensions for Flash, which is unfortunate because the "end" result is actually an XHTML document (as a string) that is loaded into an embedded text field. So XSLT would be ideal.

Since the XML has a very tight structure and does not have any text nodes (only element and attribute nodes), it's not that difficult to read parent/child/sibling node names and attribute name/values to create the XHTML.

Yes, more of a hassle than XSLT. And before I go the DOM route, I'm going to search further for an XSLT implementation for Flash. Since there is a fairly robust XPath and XPath 2 implementation, I would expect that an XSLT implementation also exists.

Thanks for the quick response. At least I know that I'm not going to get what I need purely from XPath.

...Rob
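
Editor's note: the approach described in this comment, reading element and attribute values directly and emitting XHTML as a string, might look roughly like the sketch below. It is written in Python with the standard library rather than in Flash. The function name to_xhtml, the shortened SAMPLE data, and the heading-plus-list markup are all assumptions, since the thread does not show the actual XHTML layout.

```python
from html import escape
import xml.etree.ElementTree as ET

# Shortened version of the XML from the question (assumption: same structure).
SAMPLE = """<family ID="famID" label="">
  <group ID="grpID" label="">
    <part ID="ABCDE" type="" partNo="" description="">
      <document type="A" label="" file="ABCDE-A.pdf"/>
      <document type="B" label="" file=""/>
      <document type="D" label="" file="ABCDE-docD.pdf"/>
      <document type="E" label="" file="installation.pdf"/>
    </part>
    <part ID="XYZ" type="" partNo="" description="">
      <document type="A" label="" file="XYzee.pdf"/>
    </part>
  </group>
</family>"""


def to_xhtml(family, id_fragment):
    """Build an XHTML fragment listing existing documents of matching parts."""
    lines = []
    for group in family.findall("group"):
        for part in group.findall("part"):
            if id_fragment not in part.get("ID", ""):
                continue
            # Keep only documents that actually point at a file.
            docs = [d for d in part.findall("document") if d.get("file")]
            if not docs:
                continue
            lines.append("<h3>%s</h3>" % escape(part.get("ID", "")))
            lines.append("<ul>")
            for doc in docs:
                href = escape(doc.get("file"), quote=True)
                label = escape(doc.get("type", ""))
                lines.append('<li><a href="%s">%s</a></li>' % (href, label))
            lines.append("</ul>")
    return "\n".join(lines)


print(to_xhtml(ET.fromstring(SAMPLE), "CD"))
```

Attribute values are escaped before they are placed in the markup, which matters if file names or IDs can contain characters such as & or quotes.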
 
LVL 60

Expert Comment

by:Geert Bormans
ID: 18888351
I wasn't aware there is an XPath2 implementation
I would be very surprised if there isn't an XSLT1 then

cheers

 
LVL 1

Author Comment

by:HairyDogDigital
ID: 18891648
This is a "good news" / "bad news" thing.

First, the XPath implementation for Flash is a third-party set of classes. The good news is that it exists, though I might have read a specification incorrectly as to whether it is XPath or XPath 2.
 
The bad news, no XSLT for Flash... as of yet.

Fortunately, the amount of data I am dealing with does not cause a remarkable performance hit when traversing parent, grandparent, and child nodes to generate the desired output.

Again, thanks for the quick response. You saved me hours of plunking around on Google!

...Rob
 
LVL 60

Expert Comment

by:Geert Bormans
ID: 18892623
welcome
sorry for the bad news

Geert
Question has a verified solution.
