Solved

Parse XML file and find nodes with a special attribute

Posted on 2009-07-02
Last Modified: 2013-12-17
Hi,

I have SVG documents that look like this:
<svg>
   <svg xmlns="test">
       <svg xmlns="sbutest1"></svg>
       <svg xmlns="sbutest2"></svg>
  </svg>
   <svg xmlns="test2">
       <svg xmlns="sbutest3"></svg>
   </svg>
</svg>

Now I have to parse them and save the innermost child nodes into their own files (subtest1.xml, subtest2.xml, etc.).
Any idea how to do this?

Thanks,

Andre
Question by:andre72
12 Comments
 
LVL 39

Expert Comment

by:abel
Not sure what the "this" refers to in that last sentence. You say that you parsed it and saved them in separate files. What is your next task that you have trouble with? Or did you have problems with that task, if so, what? Can you show your C# code and point to the place where it goes wrong or that you have trouble with?
 
LVL 8

Expert Comment

by:dericstone
The code below should get you started.
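// requires: using System.IO; using System.Xml;
XmlDocument doc = new XmlDocument();
doc.LoadXml(xml); // xml holds the document text

int fileIndex = 0;
foreach (XmlNode node1 in doc.ChildNodes)
  foreach (XmlNode node2 in node1.ChildNodes)
    foreach (XmlNode node3 in node2.ChildNodes)
      // save node 3 to a file (the file name pattern is only an example)
      File.WriteAllText("subtest" + (++fileIndex) + ".xml", node3.OuterXml);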
 

Author Comment

by:andre72
Thanks dericstone, that's nearly what I'm looking for, but what I showed was just an example. More levels of child nodes are also possible, e.g.:

   <svg xmlns="test">
       <svg xmlns="sbutest1">
            <svg xmlns="subsubtest1">
                <svg xmlns="subsubsubtest1"></svg>
                 <svg xmlns="subsubsubtest2">
                         <svg xmlns="subsubsubsubtest2 /">
                 </svg>
           </svg>
      </svg>

would also be possible ...
       <svg xmlns="sbutest2"></svg>
  </svg>
 

Author Comment

by:andre72
Sorry, copy-paste mistake ;-)
 
LVL 39

Expert Comment

by:abel
andre, I asked the questions in my first comment for a reason. Your code shows a very unlikely scenario for SVG data (is it SVG at all?). You ask for attributes, but there is not a single attribute in your XML code. Instead, there are only namespace attributes which, despite the name, are not attributes.

Plain nested foreach loops are not the right approach here. LINQ to XML, or SelectNodes / XPathSelectElements and the like, would suit this better.

Please take the time to give us better insight into what you want, so we can give you a targeted answer.
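
As a rough illustration of the LINQ to XML route mentioned above, here is a minimal sketch that collects every element carrying its own xmlns="..." declaration. It assumes .NET 3.5 or later; the class and method names are made up for this example and do not come from the thread.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, named only for this illustration.
public static class NamespaceSubtrees
{
    // Returns, in document order, every element that has its own
    // default-namespace declaration (xmlns="..."), including nested ones.
    public static List<XElement> FindDeclaring(XElement root)
    {
        return root.DescendantsAndSelf()
                   .Where(e => e.Attributes().Any(a =>
                       a.IsNamespaceDeclaration               // an xmlns or xmlns:prefix declaration
                       && a.Name.Namespace == XNamespace.None // not a prefixed declaration
                       && a.Name.LocalName == "xmlns"))       // the default-namespace form
                   .ToList();
    }
}
```

Each returned XElement still contains its whole subtree, so it could be written to its own file with XElement.Save. Elements that merely inherit a namespace from an ancestor are not returned, because they carry no declaration of their own.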
 

Author Comment

by:andre72
I'm sorry, abel, you're right. I was in a bit of a hurry when I wrote my first post.
And since I'm not good with SVG files, I thought xmlns="" was an attribute like in "normal" XML files...
OK, here we go again...
The XML (you're right, it is always SVG) looks like this:
   <svg xmlns="test">
       <svg xmlns="sbutest1">
            <svg xmlns="subsubtest1">
                <svg xmlns="subsubsubtest1"></svg>
                 <svg xmlns="subsubsubtest2">
                         <svg xmlns="subsubsubsubtest2 /">
                 </svg>
           </svg>
      </svg>
       <svg xmlns="sbutest2"></svg>
  </svg>

So I have to read it recursively, but only the nodes with xmlns="xyz" are needed.
When I find one, I need to save it together with all its child nodes.
I also include an example (just testing that the recursion works), which is a bit mysterious to me:
It works in general, but doc.Load(file); takes about 10 seconds for a 4 KB SVG file.
With doc.XmlResolver = null; it is much faster, but then there are no more recursive calls?!?
Also, I'm not sure whether this is really a good solution, as I'm a novice with XML...
Thanks,

Andre
XmlDocument doc = new XmlDocument();
doc.XmlResolver = null; // skip resolving the external DTD
doc.Load(file);
GetNode(doc.DocumentElement);

        private void GetNode(XmlNode inXmlNode)
        {
            // Attributes is null for non-element nodes such as text or comments
            XmlAttributeCollection xmlAttrs = inXmlNode.Attributes;
            XmlNode xmlAttr = xmlAttrs != null ? xmlAttrs.GetNamedItem("xmlns") : null;

            if (inXmlNode.HasChildNodes && xmlAttr != null)
            {
                Console.WriteLine(inXmlNode.OuterXml.Trim());
                // recursion only happens below a node that has an xmlns attribute
                XmlNodeList nodeList = inXmlNode.ChildNodes;
                for (int i = 0; i < nodeList.Count; i++)
                {
                    GetNode(nodeList[i]);
                }
            }
            else
            {
                if (xmlAttr != null)
                {
                    Console.WriteLine(inXmlNode.OuterXml.Trim());
                }
            }
        }
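
One possible reading of the behaviour described above (an assumption, not something confirmed in this thread): the slow load comes from the parser fetching the external DTD named in the DOCTYPE, and GetNode stops descending as soon as it reaches an element without an xmlns attribute. A minimal sketch of a variant that ignores the DTD and always descends, assuming .NET 4.0 or later for DtdProcessing; the class and method names are invented for illustration:

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper, named only for this illustration.
public static class SvgNamespaceScan
{
    // Loads XML without touching the DTD referenced in a DOCTYPE, if any.
    public static XmlDocument LoadWithoutDtd(TextReader input)
    {
        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Ignore, // skip the DOCTYPE entirely
            XmlResolver = null                    // never fetch external resources
        };
        var doc = new XmlDocument();
        using (XmlReader reader = XmlReader.Create(input, settings))
        {
            doc.Load(reader);
        }
        return doc;
    }

    // Collects every element that has an explicit xmlns attribute,
    // descending into all children whether or not the parent has one.
    public static void Collect(XmlNode node, List<XmlElement> found)
    {
        XmlElement element = node as XmlElement;
        if (element != null && element.HasAttribute("xmlns"))
        {
            found.Add(element);
        }
        foreach (XmlNode child in node.ChildNodes)
        {
            Collect(child, found);
        }
    }
}
```

The OuterXml of each collected element could then be written to its own file, for example with File.WriteAllText.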


 

Author Comment

by:andre72
Argh, another mistake: of course xmlns is not always present.
<svg id="test">
       <svg name="sbutest1">
            <svg xmlns="subsubtest1"> <!-- save from here 1 -->
                <svg"></svg>
                 <svg xmlns="subsubsubtest2"> <!-- save from here 2 -->
                         <svg xmlns="subsubsubsubtest2 /">
                 </svg> <!-- to here 2 -->
           </svg> <!-- to here 1 -->
      </svg>
       <svg xmlns="sbutest2"></svg>
  </svg>
 
LVL 39

Expert Comment

by:abel
I'm quite surprised still about the structure of your svg. I have to take your word for it that it looks the way it does, and I assume for now that what you put inside the xmlns-attributes (namespace attributes) is something starting with "http:" or "urn:". If not, the file is not XML + Namespaces compliant and parsers should raise an error (but in the case of the xmlns attribute, they can be lenient).

Normal SVG files have a structure like in the code example below (from http://www.w3schools.com/svg/radial2.svg). As you can see, it only has one xmlns attribute. It is allowed that the attribute is repeated, but the parts that have a different namespace (i.e., a different attribute value) are not part of the SVG spec and cannot be parsed as SVG.

Your problem in general can be best attacked with XSLT. I'll come up shortly (not sure if it'll be tonight) with an example in both C# and XSLT, which does what you want: get every element + child nodes that have a certain namespace.

-- Abel --

PS: your code is not working because you are asking for an attribute, and there isn't any. You cannot "just" ask for a node with a certain namespace, because a namespace is a scope and starts on the element where it is specified. That means, that elements not having the xmlns attribute specifically, can still be part of the result of your search. This is in the nature of XML and cannot be changed.
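
To illustrate the scoping rule from the PS, here is a small sketch using LINQ to XML. It assumes .NET 3.5 or later; the class name and the sample markup in the note below are invented for this example.

```csharp
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, named only for this illustration.
public static class NamespaceScopeDemo
{
    // Returns the namespace URI of every element, in document order.
    public static string[] NamespaceOfEachElement(string xml)
    {
        return XElement.Parse(xml)
                       .DescendantsAndSelf()
                       .Select(e => e.Name.NamespaceName)
                       .ToArray();
    }
}
```

For the input <a xmlns="urn:outer"><b><c xmlns="urn:inner"/></b></a> this yields urn:outer, urn:outer, urn:inner: the element b has no xmlns attribute of its own, yet it is in the urn:outer namespace because it sits inside the scope opened on a.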


<?xml version="1.0" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">

<svg width="100%" height="100%" version="1.1"
xmlns="http://www.w3.org/2000/svg">

    <defs>
        <radialGradient id="grey_blue" cx="20%" cy="40%" r="50%" fx="50%" fy="50%">
            <stop offset="0%" style="stop-color:rgb(200,200,200);stop-opacity:0"/>
            <stop offset="100%" style="stop-color:rgb(0,0,255);stop-opacity:1"/>
        </radialGradient>
    </defs>

    <ellipse cx="230" cy="200" rx="110" ry="100"
    style="fill:url(#grey_blue)"/>

</svg>
 
LVL 39

Accepted Solution

by:
abel earned 500 total points
I created a little XSLT that does the job for you. It is really quite simple, but XSLT is said to have a rather steep learning curve. That's not entirely true, but it requires a different way of thinking that is best taught by either a teacher or a good intro book. You can also check online for some tutorials, but they do not go too deep.

Since your input XML is so odd and your requirements are so non-standard, I give you the following so you have a starting point, but I assume that your real situation requires something different. I hope it gets you going... ;-)


<!-- INPUT (after I made it compliant) -->
<?xml version="1.0" encoding="utf-8"?>
<svg id="test">
    <svg name="sbutest1">
        <svg xmlns="subsubtest1"><!-- save from here 1 -->
            <svg></svg>
            <svg xmlns="subsubsubtest2"><!-- save from here 2 -->
                <svg xmlns="subsubsubsubtest2" />
            </svg><!-- to here 2 -->
        </svg><!-- to here 1 -->
    </svg>
    <svg xmlns="sbutest2"></svg>
</svg>
 
 

<!-- OUTPUT (the output-text is not mandatory, of course) -->
<?xml version="1.0" encoding="utf-8"?>
<output xmlns:subns1="subsubtest1" xmlns:subns2="subsubsubtest2">
    <svg xmlns="subsubtest1"><!-- save from here 1 -->
        <svg />
        <svg xmlns="subsubsubtest2">
            <!-- save from here 2 -->
            <svg xmlns="subsubsubsubtest2" />
        </svg><!-- to here 2 -->
    </svg>
    <svg xmlns="subsubsubtest2"><!-- save from here 2 -->
        <svg xmlns="subsubsubsubtest2" />
    </svg>
</output>
 
 

<!-- the XSLT, that does all the work, simple but effective -->
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
                xmlns:subns1="subsubtest1"
                xmlns:subns2="subsubsubtest2"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output indent="yes"/>

    <!-- the starting point -->
    <xsl:template match="/">
        <output>
            <xsl:apply-templates />
        </output>
    </xsl:template>

    <!-- the core is this block -->
    <xsl:template match="subns1:svg | subns2:svg">
        <xsl:copy-of select="." />

        <!-- try not to match the same namespace if this has childnodes without any -->
        <xsl:apply-templates select="*[namespace-uri() != namespace-uri(current())]" />
    </xsl:template>

    <!-- do not do anything when there's no match -->
    <xsl:template match="node()" >
        <xsl:apply-templates />
    </xsl:template>
</xsl:stylesheet>
 
LVL 39

Expert Comment

by:abel
Here's the code in C#. It is really that simple. Just change the paths to match your setup. Make sure to reference System.Xml.

// simplest way of transforming XML with an XSLT stylesheet
// (XslCompiledTransform lives in the System.Xml.Xsl namespace)
XslCompiledTransform xslt = new XslCompiledTransform(true); // true = enable debugging
xslt.Load("transform.xslt");
xslt.Transform("input.xml", "output.xml");
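
The transform above writes all matching subtrees into a single output.xml, while the original question asked for separate files. A minimal sketch of one way to split the result afterwards, assuming the output has a single wrapper element whose children are the subtrees to save; the class name, method name and the "subtree" file name pattern are made up for this example:

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Linq;

// Hypothetical helper, named only for this illustration.
public static class OutputSplitter
{
    // Writes each child element of the wrapper element to its own file
    // in the given directory and returns the paths that were written.
    public static List<string> SaveChildren(string outputXml, string directory)
    {
        var paths = new List<string>();
        int index = 0;
        foreach (XElement child in XElement.Parse(outputXml).Elements())
        {
            index++;
            string path = Path.Combine(directory, "subtree" + index + ".xml");
            child.Save(path); // the child keeps its own xmlns declaration
            paths.Add(path);
        }
        return paths;
    }
}
```

A call such as OutputSplitter.SaveChildren(File.ReadAllText("output.xml"), ".") would then produce one file per child of the wrapper element.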

 

Author Closing Comment

by:andre72
abel, this is really a great solution and a great idea! I also learned a lot from it. Thanks!
 
LVL 39

Expert Comment

by:abel
You're welcome, glad it helped
