SiJP
asked on
Remove Empty Elements
In theory, I have this XML file (and be gentle, I'm not experienced with XML or XSLT).
<root>
<data>
<date>
<start/>
<end></end>
</date>
<time period="start">12:00</time>
<time period="end">13:00</time>
</data>
</root>
As you can see, there are a few elements that have no data. What I'd like to do is clean up this XML with an XSLT, by getting rid of any elements that have no attributes or values.
Therefore the outputted xml would be:
<root>
<data>
<time period="start">12:00</time>
<time period="end">13:00</time>
</data>
</root>
Ideally, the <start/> and <end></end> elements would be ripped out. And because the <date> element would then have no child nodes or data, it would go too.
I've had one example given to me (below) but this not only removes all the blank elements, but also the attributes of the elements that do have values.
<xsl:template match="*[not(node())]"/>
<xsl:template match="node()">
<xsl:copy>
<xsl:apply-templates select="node()"/>
</xsl:copy>
</xsl:template>
If this makes sense, how do I remove all empty elements ('simple' and 'complex', I guess) that have no values whatsoever?
Thanks
p.s. - sorry about the lack of points - it's all I have available!
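A note on why the attributes disappear in the example above: attributes are not child nodes, so neither match="node()" nor select="node()" ever reaches them, and xsl:copy on an element does not copy its attributes by itself. A minimal sketch of the same two templates with attributes included (this is not the accepted solution, only the smallest change to the stylesheet quoted in the question):

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Drop elements that have no child nodes at all -->
  <xsl:template match="*[not(node())]"/>

  <!-- Copy everything else, and this time visit attributes too -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

This keeps the period attribute on <time>, but it still leaves <date> in place, because <date> does have child nodes (the white-space and the two empty elements). It would also drop an element such as <x a="1"/>, which has an attribute but no children.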
ASKER CERTIFIED SOLUTION
words on the script
----------------------
The heart of this piece of script is the XPath in the match of the first template.
That XPath matches any node that matches one of the following rules
- somewhere inside (descendant-or-self) there is an element with an attribute
- somewhere inside (descendant-or-self) there is an element with content that has a length of 1 or more, after white-space normalisation
I ignore all the other elements
The other two templates tell me that I can safely copy all text-nodes and all attributes
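The script itself is not shown in this copy of the thread. Going by the description above and the fragments quoted further down, a sketch of it might look like the following. Treat the exact layout as an assumption; only the match pattern and the three-template structure are taken from the thread.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <!-- Copy a node only if it, or something inside it, has an attribute
       or non-blank content -->
  <xsl:template
      match="node()[descendant-or-self::*[@*] or descendant-or-self::*[string-length(normalize-space(.)) > 0]]">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Any text node that is reached is copied as it is -->
  <xsl:template match="text()">
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- Attributes of copied elements are rebuilt with the same name and value -->
  <xsl:template match="@*">
    <xsl:attribute name="{name()}">
      <xsl:value-of select="."/>
    </xsl:attribute>
  </xsl:template>
</xsl:stylesheet>
```

Elements that fail the test fall through to XSLT's built-in rule, which outputs nothing for the element itself but still visits its children. That is why <date>, <start/> and <end></end> vanish from the asker's sample, while the line breaks inside <date> are still written out.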
It also works in this border case:
<?xml version="1.0" encoding="UTF-8"?>
<root><smooth><empty><series></series></empty> </smooth></root>
What do you do yourself here?
----------------------------------
- I have not tested the behaviour with comments and PIs, you might need some twiddling
- I have not bothered about avoiding the creation of spurious white-space nodes
Well, you should have some fun yourself, no?
Good luck
Gertone
Well,
could not resist testing with PI and comments and it ignores them as expected
<empty><!-- my comment --></empty> is still an empty element...
so if that is what you want, you can leave it like that
If you add the strip-space element like this at the beginning
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:strip-space elements="*"/>
<xsl:template
match="node()[descendant-or-self::*[@*] or descendant-or-self::*[string-length(normalize-space(.)) > 0]]">
<xsl:copy>
all "pretty-print" white-space nodes are removed, you will get your XML on one line, which you can "pretty-print" again with XML-Spy or Oxygen.
One WARNING though, if you do that:
"<test>some<m>mixed </m>content and some mixed white-space<m> </m> </test>"
The second <m> element contains only white-space (in mixed content). It will be removed, since the white-space-only node is stripped,
so you will lose that.
One can argue that the white-space node after the second (mixed-content) element <m> should not be thrown away, but it is removed as well,
just as in "<test>word <m>bold</m> <m>italic</m></test>" you will lose the space between the two (mixed-content) <m> elements.
Bottom line: if you have a lot of mixed content, be careful with strip-space,
or list the mixed-content elements in an <xsl:preserve-space elements="test firstMixed m"/> if you know which ones they are.
And now I should shut up :-)
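To make the strip-space / preserve-space combination concrete, here is a small sketch that isolates the white-space handling from the empty-element logic by using a plain identity template. The element names test and m come from the examples above; everything else is illustrative.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Strip white-space-only text nodes everywhere, by default -->
  <xsl:strip-space elements="*"/>
  <!-- Keep them directly inside the named mixed-content elements.
       A name test outranks the * wildcard, so these entries win. -->
  <xsl:preserve-space elements="test m"/>

  <!-- Identity template: copy everything that is left -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

With this, the space between the two <m> elements in "" survives, because that white-space node is a child of test. Listing only m would not be enough for that case.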
ASKER
Ah Gertone, no need to shut up by any means - this is excellent information! When I get back into my office, I shall be having fun with your XSLT!
ASKER
Gertone,
What script changes would be needed to also exclude elements like:
<SomeElement myAtt=""></SomeElement>
(e.g. elements that have a blank attribute as well as no data)?
Thanks!
ASKER
Oh, and also...
<SomeElement myAtt=""/>
:)
You are a real challenger hé!? :-)
Just change the XPath in
match="node()[descendant-or-self::*[@* != ''] or descendant-or-self::*[string-length(normalize-space(.)) > 0]]"
and it should work
It might not be clear from the copy,
but I changed the [@*] into [@* != '']
and that is 2 single quotes, not one double quote
So I check that some attribute is not the empty string, and at the same time it gives false for non-existent attributes as well
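The reason [@* != ''] covers both cases is that comparing a node-set with a string is true if at least one node in the set satisfies the comparison, and an empty node-set never does. A small test stylesheet (purely illustrative, with made-up element names) shows the result per element:

```xml
<!-- Sample input (hypothetical):
       <r><a/><b x=""/><c x="" y="1"/></r>
     Output, one line per element:
       r: false
       a: false
       b: false
       c: true -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <xsl:template match="*">
    <!-- Print the element name and the boolean value of the test -->
    <xsl:value-of select="name()"/>
    <xsl:text>: </xsl:text>
    <xsl:value-of select="@* != ''"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Then do the same for the child elements -->
    <xsl:apply-templates select="*"/>
  </xsl:template>
</xsl:stylesheet>
```

Note that this is not the same as not(@* = ''), which is true when no attribute is empty, including when there are no attributes at all.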
ASKER
Clearly not a challenge that is too much for you!
Thank you Gertone.. I will try this again and let you know of my results
Si
ASKER
My friend, this has helped me enormously, thank you!
Final XSLT:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
  <xsl:template
      match="node()[descendant-or-self::*[@* != ''] or descendant-or-self::*[string-length(normalize-space(.)) > 0]]">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="text()">
    <xsl:value-of select="."/>
  </xsl:template>
  <xsl:template match="@*">
    <xsl:attribute name="{name()}"><xsl:value-of select="."/>
    </xsl:attribute>
  </xsl:template>
</xsl:stylesheet>
You are welcome
OR SUBNODES!
Start with the identity transformation:
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="@*"/>
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>
and add an xsl:if around the copy such that if all three conditions are true (no attributes/values/subnodes
xsl:if example: <xsl:if test="(@author = 'bd') or (@year='1667')">
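A sketch of how that xsl:if could wrap the copy. The test expression is an assumption, since the post above is cut off before it gives one; it keeps a node when it has an attribute, a child element, or non-blank text.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="node()|@*">
    <!-- Assumed test: keep the node if any one of the three holds -->
    <xsl:if test="@* or * or normalize-space()">
      <xsl:copy>
        <xsl:apply-templates select="@*"/>
        <xsl:apply-templates/>
      </xsl:copy>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Because the test only looks at the node itself, this is a single pass: on the sample from the question it removes <start/> and <end></end> but leaves an empty <date> element behind, since <date> still has child elements at the moment it is tested. White-space-only text nodes and empty attributes are dropped as well. The descendant-or-self pattern in the final XSLT above avoids the leftover <date>.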