Solved

Using XSLT, how to extract a certain element from a file and discard the rest

Posted on 2014-11-13
Last Modified: 2014-11-13
I have an XML file and I want to extract a particular element (eee) and any attributes it may have, and discard the rest of the file.
For example if my input XML file was this:

<?xml version="1.0"?>
<aaa>
      <bbb>
            <ccc>...</ccc>
            <ccc>...</ccc>
      </bbb>
      <ddd>
            <eee f="1" g="2"/>
            <eee f="3" g="4"/>
      </ddd>
</aaa>

I would want to end up with this:
            <eee f="1" g="2"/>
            <eee f="3" g="4"/>
            
The structure of the XML data may change without my knowing, but I will always want only the eee elements and any attributes they may have.

Can anyone point this XSLT newbie in the right direction?
Question by:Letterpart
 
LVL 33

Assisted Solution

by:ste5an
ste5an earned 250 total points
ID: 40440490
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:fn="http://www.w3.org/2005/xpath-functions">
	<xsl:output method="html" version="1.0" encoding="UTF-8" indent="yes"/>
	<xsl:template match="/*">
		<html>
			<head>
				<title>Matches</title>
			</head>
			<body>
				<table border="1">
					<xsl:apply-templates select="//eee"/>
				</table>
			</body>
		</html>
	</xsl:template>
	<xsl:template match="eee">
		<tr>
			<td>
				<xsl:copy>
					<xsl:apply-templates select="@*|node()"/>
				</xsl:copy>
			</td>
		</tr>
	</xsl:template>
</xsl:stylesheet>

 
LVL 60

Accepted Solution

by:Geert Bormans
Geert Bormans earned 250 total points
ID: 40440533
@ste5an, I don't think an HTML transformation is the requirement (and you are losing the attributes)

you can use copy-of

<xsl:template match="/">
<xsl:copy-of select="//eee"/>
</xsl:template>
 
LVL 60

Expert Comment

by:Geert Bormans
ID: 40440602
There are some things you need to take into account:
serialisation should be XML
<xsl:output method="xml" version="1.0" />
and you might need to wrap a root element around it
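
Put together, the points above might look like the following minimal sketch. It assumes an XSLT 1.0 processor, and the wrapper element name "results" is hypothetical (it does not come from the thread); any name will do.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Serialise the result as XML rather than HTML -->
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <xsl:template match="/">
    <!-- "results" is a hypothetical wrapper so the output has a single root element -->
    <results>
      <!-- Deep-copy every eee element, wherever it sits, including its attributes -->
      <xsl:copy-of select="//eee"/>
    </results>
  </xsl:template>
</xsl:stylesheet>
```

For the sample input in the question, this should give a results element containing the two eee elements with their f and g attributes intact; the exact whitespace depends on the processor. Without the wrapper, the output would be a sequence of sibling elements, which is not a well-formed XML document on its own.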

 
LVL 1

Author Closing Comment

by:Letterpart
ID: 40440719
Thanks ste5an and Geert. Both solutions worked for me (and outputted attributes).
I have split the points - hope that is OK with you.
ste5an was first but Geert's solution seems to be slightly neater in that there is one template instead of two.
Thanks for your help.
 
LVL 33

Expert Comment

by:ste5an
ID: 40440816
@Geert, had only IE at hand to do the transform 😉
 
LVL 60

Expert Comment

by:Geert Bormans
ID: 40440869
Hi,

First let me say that I don't mind at all that you split the points, I am OK with that

But I fail to see how ste5an's solution could possibly work
- I am not aware of a single XSLT processor that accepts <xsl:output method="html" version="1.0" ...
you simply can't serialize HTML as version 1.0
- by doing <xsl:apply-templates select="@*|node()"/> you push out the attributes, and a built-in template picks them up and emits only their values as text, so you will get
           <td>
               <eee>12</eee>
            </td>
with the attribute values as plain text, which is not
<eee f="1" g="2"/>
(héhé, ste5an actually uses 3 built-in (invisible) templates, one for '/', one for '@*' and one for 'text()', but that is not bad at all :-)

Anyhow, if you are happy, I am, just wanted to make that point

Stating that my solution is slightly neater because it has only one template is a bit weird.
It is neater because it does what you asked,
but XSLT developers consider code neater when it has a good functional separation... mostly, code that has more templates is considered neater :-)
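
To illustrate both points (attributes falling through to a built-in template, and separation into templates), here is a minimal sketch of a multi-template variant that keeps the attributes. It is not ste5an's stylesheet; the wrapper name "matches" is hypothetical, and the explicit '@*' template is the piece that replaces the built-in rule.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <!-- Entry point: select only the eee elements; "matches" is a hypothetical wrapper name -->
  <xsl:template match="/">
    <matches>
      <xsl:apply-templates select="//eee"/>
    </matches>
  </xsl:template>

  <!-- Shallow-copy each eee element, then process its attributes and children -->
  <xsl:template match="eee">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Explicit rule for attributes: copy them instead of letting the
       built-in template emit their values as text -->
  <xsl:template match="@*">
    <xsl:copy/>
  </xsl:template>
</xsl:stylesheet>
```

This is enough for empty eee elements like those in the question. If eee could contain nested elements, those would still fall through to the built-in rules and lose their tags, so a fuller identity template (match="@*|node()") would be needed.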
 
LVL 60

Expert Comment

by:Geert Bormans
ID: 40440875
@ste5an, I understand :-)
(aha, that one ignores the serialisation settings, hence no error on the version)
