Remove Empty Elements

In theory, I have this XML file (and be gentle, I'm not experienced with XML or XSLT).

<root>
   <data>
      <date>
         <start/>
         <end></end>
      </date>
      <time period="start">12:00</time>
      <time period="end">13:00</time>
   </data>
</root>

As you can see, there are a few elements that have no data.  What I'd like to do is clean up this XML with an XSLT, by getting rid of any elements that have no attributes or values.

Therefore, the output XML would be:
<root>
   <data>
      <time period="start">12:00</time>
      <time period="end">13:00</time>
   </data>
</root>

Ideally, the <start/> element and the <end></end> element would be ripped out.  And because the <date> element would subsequently have no child nodes or data, it would go too.

I've had one example given to me (below) but this not only removes all the blank elements, but also the attributes of the elements that do have values.

<xsl:template match="*[not(node())]"/>
<xsl:template match="node()">
  <xsl:copy>
    <xsl:apply-templates select="node()"/>
  </xsl:copy>
</xsl:template>

If this makes sense, how do I remove all empty elements ('simple' and 'complex', I guess) that have no values whatsoever?

Thanks


P.S. Sorry about the lack of points - it's all I have available!
SiJP Asked:

BobSiemens Commented:
"As you can see, there are a few elements that have no data.  What I'd like to do is clean up this XML with an XSLT, by getting rid of any elements that have no attributes or values."

OR SUBNODES!

Start with the identity transformation:

<xsl:template match="node()|@*">
   <xsl:copy>
   <xsl:apply-templates select="@*"/>
   <xsl:apply-templates/>
   </xsl:copy>
 </xsl:template>

and add an xsl:if around the copy so that the copy is done only when the element has at least one of the three (attributes, a value, or subnodes); when all three are missing, nothing is copied.

xsl:if example:  <xsl:if test="(@author = 'bd') or (@year='1667')">
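
For illustration only, here is a minimal sketch of how that xsl:if idea could look. Splitting the identity transformation into two templates and the test "@* or node()" are assumptions made for this sketch, not something taken from the thread.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <!-- Attributes, text, comments and processing instructions: copy unchanged -->
    <xsl:template match="@*|text()|comment()|processing-instruction()">
        <xsl:copy/>
    </xsl:template>
    <!-- Elements: skip the copy when there are no attributes and no child nodes -->
    <xsl:template match="*">
        <xsl:if test="@* or node()">
            <xsl:copy>
                <xsl:apply-templates select="@*"/>
                <xsl:apply-templates/>
            </xsl:copy>
        </xsl:if>
    </xsl:template>
</xsl:stylesheet>
```

Note that this only drops elements that are empty to begin with, such as <start/> and <end></end>. An element like <date>, which still has child nodes when it is tested, is copied even though it ends up with nothing but white-space inside.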




Gertone (Geert Bormans), Information Architect, Commented:
Hmm Bob, not sure you'll get there that way.
Looking at the example, you also want to remove an element that would be empty after all its descendants have been removed.
You have to test the content of the descendants AND you want to ignore white-space-only nodes.

The following does the job, with your example
(and with a more complex self-test)

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:template
        match="node()[ descendant-or-self::*[@*] or  descendant-or-self::*[string-length(normalize-space(.)) &gt; 0]]">
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
            <xsl:apply-templates select="node()"/>
        </xsl:copy>    
    </xsl:template>
    <xsl:template match="text()">
        <xsl:value-of select="."/>
    </xsl:template>
    <xsl:template match="@*">
        <xsl:attribute name="{name()}"><xsl:value-of select="."/>
        </xsl:attribute>
    </xsl:template>
</xsl:stylesheet>

Gertone (Geert Bormans), Information Architect, Commented:
words on the script
----------------------
The heart of this piece of script is the XPath in the match of the first template.
That XPath matches any node that matches one of the following rules
- somewhere inside (descendant-or-self) there is an element with an attribute
- somewhere inside (descendant-or-self) there is an element with content that has a length of 1 or more, after white-space normalisation
I ignore all the other elements

The other two templates tell me that I can safely copy all text-nodes and all attributes

It also works in this border case:
<?xml version="1.0" encoding="UTF-8"?>
<root><smooth><empty><series></series></empty></smooth></root>
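
As a side note, the same two rules can also be written the other way round: keep the plain identity transformation and add one empty template for the elements to drop. The sketch below is an assumed alternative, not the stylesheet posted above. It relies on the fact that an element's string value already contains the text of all its descendants, so testing normalize-space(.) on the element itself covers the "somewhere inside there is content" rule.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <!-- Identity template: copies every node and attribute by default -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
    <!-- Drop an element when no element in its subtree (itself included)
         has an attribute and its text is blank after normalisation -->
    <xsl:template match="*[not(descendant-or-self::*[@*]) and not(normalize-space(.))]"/>
</xsl:stylesheet>
```

Two differences in behaviour should be expected with this variant: comments and PIs outside the removed elements are copied instead of ignored, and the white-space inside a removed element disappears together with it. In the border case above the document element itself is dropped, so the result is empty.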

What do you do yourself here?
----------------------------------
- I have not tested the behaviour with comments and PIs, you might need some twiddling
- I have not bothered about avoiding the creation of spurious white-space nodes

Well, you should have some fun yourself, no?
Good luck

Gertone
Gertone (Geert Bormans), Information Architect, Commented:
Well,

could not resist testing with PI and comments and it ignores them as expected
<empty><!-- my comment --></empty> is still an empty element...
so if that is what you want, you can leave it like that

If you add the strip-space element like this at the beginning
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:strip-space elements="*"/>
    <xsl:template
        match="node()[ descendant-or-self::*[@*] or  descendant-or-self::*[string-length(normalize-space(.)) &gt; 0]]">
        <xsl:copy>
all "pretty-print" white-space nodes are removed, and you will get your XML on one line, which you can "pretty-print" again with XML-Spy or Oxygen.

One WARNING though, if you do that:
"<test>some<m>mixed </m>content and some mixed white-space<m> </m> </test>"
The second <m> element contains only white-space (in mixed content). It will be removed, since the white-space-only node will be removed,
so you will lose that.
One can argue that the white-space node after the second (mixed-content) element <m> should not be thrown away, but it is removed as well.
Likewise, in "<test>word <m>bold</m> <m>italic</m></test>" you will lose the space between the two (mixed-content) <m> elements.
Bottom line: if you have a lot of mixed content, be careful with strip-space,
or list the mixed-content elements in an <xsl:preserve-space elements="test firstMixed m"/> if you know which ones they are.

And now I should shut up :-)
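
To make the strip-space / preserve-space combination concrete, here is a sketch of the full stylesheet with both declarations in place. The element names test and m are simply the ones from the warning example; replace them with your own mixed-content elements.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <!-- Strip white-space-only text nodes everywhere by default -->
    <xsl:strip-space elements="*"/>
    <!-- Keep them when their parent is one of these elements
         (names assumed from the warning example) -->
    <xsl:preserve-space elements="test m"/>
    <xsl:template
        match="node()[ descendant-or-self::*[@*] or  descendant-or-self::*[string-length(normalize-space(.)) &gt; 0]]">
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
            <xsl:apply-templates select="node()"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="text()">
        <xsl:value-of select="."/>
    </xsl:template>
    <xsl:template match="@*">
        <xsl:attribute name="{name()}"><xsl:value-of select="."/>
        </xsl:attribute>
    </xsl:template>
</xsl:stylesheet>
```

The named entries take precedence over the "*" wildcard, so the space between the two <m> elements in "" should survive. An <m> that holds only white-space is still not copied as an element, because its normalised content is empty; only its white-space text is passed through.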
SiJP (Author) Commented:
Ah Gertone, no need to shut up by any means - this is excellent information! When I get back into my office, I shall be having fun with your XSLT!
SiJP (Author) Commented:
Gertone,

What script changes would be needed to also exclude elements like:

<SomeElement myAtt=""></SomeElement>

(i.e. elements that have a blank attribute as well as no data)?

Thanks!
SiJP (Author) Commented:
Oh, and also...

<SomeElement myAtt=""/>


:)
Gertone (Geert Bormans), Information Architect, Commented:
You are a real challenger hé!? :-)

Just change the XPath in
        match="node()[ descendant-or-self::*[@* != ''] or  descendant-or-self::*[string-length(normalize-space(.)) &gt; 0]]"
and it should work
Gertone (Geert Bormans), Information Architect, Commented:
It might not be clear from the copy,
but I changed the [@*] into [@* != '']
and that is two single quotes, not one double quote.
So I check that at least one attribute is not the empty string; the test is also false when the element has no attributes at all.
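
To spell out how that comparison behaves, the comment in the sketch below walks through a few cases on a hypothetical element <e>. The stylesheet itself is an assumed variant, not the one from the thread: it wraps the attribute test in normalize-space() so that white-space-only attribute values also count as blank, in case that is wanted.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <!-- How [@* != ''] evaluates (a node-set comparison is true when
         at least one node in the set satisfies it):
         <e/>             no attributes, empty node-set: false
         <e a=""/>        the only attribute is empty: false
         <e a="" b="x"/>  one attribute is non-empty: true
         <e a=" "/>       a single space is not the empty string: true -->
    <!-- Assumed variant: @*[normalize-space(.) != ''] is false for
         <e a=" "/> as well, so that element would be removed too -->
    <xsl:template
        match="node()[ descendant-or-self::*[@*[normalize-space(.) != '']] or  descendant-or-self::*[string-length(normalize-space(.)) &gt; 0]]">
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
            <xsl:apply-templates select="node()"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="text()">
        <xsl:value-of select="."/>
    </xsl:template>
    <xsl:template match="@*">
        <xsl:attribute name="{name()}"><xsl:value-of select="."/>
        </xsl:attribute>
    </xsl:template>
</xsl:stylesheet>
```

In both versions, an element that is kept for other reasons still gets all of its attributes copied, including the empty ones.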
SiJP (Author) Commented:
Clearly not a challenge that is too much for you!

Thank you Gertone..  I will try this again and let you know of my results

Si
SiJP (Author) Commented:
My friend, this has helped me enormously, thank you!

Final XSLT:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
 <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
    <xsl:template
        match="node()[ descendant-or-self::*[@* != ''] or  descendant-or-self::*[string-length(normalize-space(.)) &gt; 0]]">
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
            <xsl:apply-templates select="node()"/>
        </xsl:copy>    
    </xsl:template>
    <xsl:template match="text()">
        <xsl:value-of select="."/>
    </xsl:template>
    <xsl:template match="@*">
        <xsl:attribute name="{name()}"><xsl:value-of select="."/>
        </xsl:attribute>
    </xsl:template>
</xsl:stylesheet>
Gertone (Geert Bormans), Information Architect, Commented:
You are welcome