Solved

Using XSLT, how to extract a certain element from a file and discard the rest

Posted on 2014-11-13
Last Modified: 2014-11-13
I have an XML file and I want to extract a particular element (eee) and any attributes it may have, and discard the rest of the file.
For example, if my input XML file were this:

<?xml version="1.0"?>
<aaa>
      <bbb>
            <ccc>...</ccc>
            <ccc>...</ccc>
      </bbb>
      <ddd>
            <eee f="1" g="2"/>
            <eee f="3" g="4"/>
      </ddd>
</aaa>

I would want to end up with this:
            <eee f="1" g="2"/>
            <eee f="3" g="4"/>
            
The format of the XML data may change without my necessarily knowing about it, but I will always want only the eee element and any attributes it may have.

Can anyone point this XSLT newbie in the right direction?
Question by:Letterpart
LVL 33

Assisted Solution

by:ste5an
ste5an earned 250 total points
ID: 40440490
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:fn="http://www.w3.org/2005/xpath-functions">
	<xsl:output method="html" version="1.0" encoding="UTF-8" indent="yes"/>
	<xsl:template match="/*">
		<html>
			<head>
				<title>Matches</title>
			</head>
			<body>
				<table border="1">
					<xsl:apply-templates select="//eee"/>
				</table>
			</body>
		</html>
	</xsl:template>
	<xsl:template match="eee">
		<tr>
			<td>
				<xsl:copy>
					<xsl:apply-templates select="@*|node()"/>
				</xsl:copy>
			</td>
		</tr>
	</xsl:template>
</xsl:stylesheet>

 
LVL 60

Accepted Solution

by:Geert Bormans
Geert Bormans earned 250 total points
ID: 40440533
@ste5an, I don't think an HTML transformation is the requirement (and you are losing attributes).

You can use xsl:copy-of:

<xsl:template match="/">
<xsl:copy-of select="//eee"/>
</xsl:template>
 
LVL 60

Expert Comment

by:Geert Bormans
ID: 40440602
There are some things you need to take into account.
The serialisation should be XML:
<xsl:output method="xml" version="1.0" />
You might also need to wrap a root element around the result, since an XML document can have only one top-level element.
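
Putting the two comments above together, a complete stylesheet might look like the following minimal sketch. It assumes XSLT 1.0 is sufficient for this task. The wrapper element name `extract` is a hypothetical choice that does not come from the thread; any well-formed element name would do.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Serialise the result as XML, not HTML -->
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <!-- Match the document node once and deep-copy every eee element,
       wherever it sits in the input tree -->
  <xsl:template match="/">
    <!-- "extract" is a hypothetical wrapper name (an assumption) -->
    <extract>
      <!-- copy-of copies each element together with its attributes and children -->
      <xsl:copy-of select="//eee"/>
    </extract>
  </xsl:template>
</xsl:stylesheet>
```

Without the wrapper, the result would hold several top-level eee elements, which is a usable fragment but not a well-formed XML document. With a command-line processor such as xsltproc, and assuming the hypothetical file names extract.xsl and input.xml, this could be run as `xsltproc extract.xsl input.xml`.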
LVL 1

Author Closing Comment

by:Letterpart
ID: 40440719
Thanks ste5an and Geert. Both solutions worked for me (and outputted attributes).
I have split the points - hope that is OK with you.
ste5an was first but Geert's solution seems to be slightly neater in that there is one template instead of two.
Thanks for your help.
 
LVL 33

Expert Comment

by:ste5an
ID: 40440816
@Geert, had only IE at hand to do the transform 😉
 
LVL 60

Expert Comment

by:Geert Bormans
ID: 40440869
Hi,

First let me say that I don't mind at all that you split the points, I am OK with that

But I fail to see how ste5an's solution could possibly work
- I am not aware of a single XSLT processor that accepts <xsl:output method="html" version="1.0" .../>;
you simply can't serialize HTML to version 1.0.
- By doing <xsl:apply-templates select="@*|node()"/> you push the attributes through apply-templates, and the built-in template for attributes outputs only their values as text, so you will get
           <td>
               <eee>12</eee>
            </td>
which is not
<eee f="1" g="2"/>
(héhé, ste5an actually uses 3 built-in (invisible) templates, one for '/', one for '@*' and one for 'text()', but that is not bad at all :-)

Anyhow, if you are happy, I am, just wanted to make that point

Stating that my solution is slightly neater because it has only one template is a bit weird.
It is neater because it does what you asked,
but XSLT developers consider code neater when it has a good functional separation... mostly, code that has more templates is considered neater :-)
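
For comparison, a template-based variant that keeps the attributes could be sketched as below. This is not ste5an's stylesheet. It is an assumed rewrite that targets XML rather than HTML output and adds an explicit identity template, so that attributes are copied instead of being handed to the built-in template that emits only their text values. The wrapper element name `matches` is hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <!-- Entry point: select the eee elements and let other templates handle them.
       "matches" is a hypothetical wrapper name (an assumption) -->
  <xsl:template match="/">
    <matches>
      <xsl:apply-templates select="//eee"/>
    </matches>
  </xsl:template>

  <!-- Identity template: copies whatever attribute or node it is applied to,
       then recurses into that node's own attributes and children -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

The selection step and the copying step now live in separate templates, which is the kind of functional separation described above. For this particular task the single copy-of template does the same job with less code.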
 
LVL 60

Expert Comment

by:Geert Bormans
ID: 40440875
@ste5an, I understand :-)
(aha, that one ignores the serialisation settings, hence no error on the version)