Solved

Parse XML file and find nodes with a special attribute

Posted on 2009-07-02
Last Modified: 2013-12-17
Hi,

I have SVG documents that look like this:
<svg>
   <svg xmlns="test">
       <svg xmlns="sbutest1"></svg>
       <svg xmlns="sbutest2"></svg>
  </svg>
   <svg xmlns="test2">
       <svg xmlns="sbutest3"></svg>
   </svg>
</svg>

Now I have to parse it and save the innermost child nodes into their own files (subtest1.xml, subtest2.xml, etc.).
Any idea how to do this?

Thanks,

Andre
Question by:andre72
 
LVL 39

Expert Comment

by:abel
ID: 24762353
Not sure what the "this" refers to in that last sentence. You say that you parsed it and saved them in separate files. What is your next task that you have trouble with? Or did you have problems with that task, if so, what? Can you show your C# code and point to the place where it goes wrong or that you have trouble with?
 
LVL 8

Expert Comment

by:dericstone
ID: 24766919
The code below should get you started.
XmlDocument doc = new XmlDocument();
doc.LoadXml(xml);

// Walk exactly three levels deep; node3 is an innermost child in the sample above.
foreach (XmlNode node1 in doc.ChildNodes)
  foreach (XmlNode node2 in node1.ChildNodes)
    foreach (XmlNode node3 in node2.ChildNodes)
    {
      // save node 3 to a file
    }

 

Author Comment

by:andre72
ID: 24767296
Thanks dericstone, that's nearly what I'm looking for, but what I showed was just an example; more child nodes are also possible, e.g.:

   <svg xmlns="test">
       <svg xmlns="sbutest1">
            <svg xmlns="subsubtest1">
                <svg xmlns="subsubsubtest1"></svg>
                 <svg xmlns="subsubsubtest2">
                         <svg xmlns="subsubsubsubtest2 /">
                 </svg>
           </svg>
      </svg>

would also be possible ...
       <svg xmlns="sbutest2"></svg>
  </svg>
 

Author Comment

by:andre72
ID: 24767301
Sorry, copy-paste mistake ;-)
 
LVL 39

Expert Comment

by:abel
ID: 24767426
andre, I asked the questions in my first comment for a reason. Your code shows a very unlikely scenario for SVG data (is it SVG at all?). You ask for attributes, but there is not a single attribute in your XML code. Instead, there are only namespace attributes which, despite the name, are not attributes.

Regular nested foreach loops are not the right approach here. LINQ to XML, or SelectNodes / XPathSelectElements etc., is a better fit.

Please take the time to give us better insight into what you want, so that we can give you a targeted answer.
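As a rough illustration of the LINQ to XML route mentioned above, here is a minimal sketch (not code from this thread). It assumes the goal is "every element that carries its own xmlns="..." declaration"; the class and method names are hypothetical, and the sample input is a well-formed version of the XML posted earlier.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class NamespaceDeclFinder
{
    // Returns every element that carries its own default-namespace
    // declaration (an explicit xmlns="..."), in document order.
    // LINQ to XML exposes that declaration as an attribute named "xmlns".
    public static List<XElement> FindDeclaringElements(XDocument doc)
    {
        return doc.Descendants()
                  .Where(e => e.Attribute("xmlns") != null)
                  .ToList();
    }

    public static void Main()
    {
        // Sample input modelled on the thread's example (made well-formed).
        const string xml = @"<svg id='test'>
  <svg name='sbutest1'>
    <svg xmlns='subsubtest1'>
      <svg></svg>
      <svg xmlns='subsubsubtest2'>
        <svg xmlns='subsubsubsubtest2' />
      </svg>
    </svg>
  </svg>
  <svg xmlns='sbutest2'></svg>
</svg>";

        XDocument doc = XDocument.Parse(xml);
        foreach (XElement e in FindDeclaringElements(doc))
        {
            // Name.NamespaceName is the namespace the element itself is in.
            Console.WriteLine(e.Name.NamespaceName);
        }
    }
}
```

Descendants() already visits the whole tree, so no hand-written recursion is needed. Each returned XElement includes its child nodes, so it can be saved as a unit.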
 

Author Comment

by:andre72
ID: 24767984
I'm sorry about that, abel. You're right, I was in a bit of a hurry when I wrote my first post.
Well, and as I'm not good with SVG files, I thought xmlns="" was an attribute like in "normal" XML files...
OK, here we go again...
The XML (you're right, it is always SVG) looks like this:
   <svg xmlns="test">
       <svg xmlns="sbutest1">
            <svg xmlns="subsubtest1">
                <svg xmlns="subsubsubtest1"></svg>
                 <svg xmlns="subsubsubtest2">
                         <svg xmlns="subsubsubsubtest2 /">
                 </svg>
           </svg>
      </svg>
       <svg xmlns="sbutest2"></svg>
  </svg>

So I have to read it recursively, but only the nodes with xmlns="xyz" are needed.
If I find one, I need to save it together with all its child nodes.
I also include an example (just to test that the recursion works); its behaviour is a bit of a mystery to me:
It works overall, but doc.Load(file); takes about 10 seconds to load a 4 KB SVG file.
With doc.XmlResolver = null; it is much faster, but then there are no more recursive calls?!?
Also, I'm not sure whether this is really a good solution, as I'm a novice with XML...
Thanks,

Andre
XmlDocument doc = new XmlDocument();
doc.XmlResolver = null;
doc.Load(file);
GetNode(doc.DocumentElement);

private void GetNode(XmlNode inXmlNode)
{
    XmlAttributeCollection xmlAttrs = inXmlNode.Attributes;
    XmlNode xmlAttr = xmlAttrs.GetNamedItem("xmlns");

    if (inXmlNode.HasChildNodes && xmlAttr != null)
    {
        Console.WriteLine((inXmlNode.OuterXml).Trim());
        nodeList = inXmlNode.ChildNodes;
        for (i = 0; i <= nodeList.Count - 1; i++)
        {
            xNode = inXmlNode.ChildNodes[i];
            GetNode(xNode);
        }
    }
    else
    {
        if (xmlAttr != null)
        {
            Console.WriteLine((inXmlNode.OuterXml).Trim());
        }
    }
}


 

Author Comment

by:andre72
ID: 24768022
Argh, another error: xmlns is of course not always present.
<svg id="test">
       <svg name="sbutest1">
            <svg xmlns="subsubtest1"> <!-- save from here 1 -->
                <svg"></svg>
                 <svg xmlns="subsubsubtest2"> <!-- save from here 2 -->
                         <svg xmlns="subsubsubsubtest2 /">
                 </svg> <!-- to here 2 -->
           </svg> <!-- to here 1 -->
      </svg>
       <svg xmlns="sbutest2"></svg>
  </svg>
 
LVL 39

Expert Comment

by:abel
ID: 24768435
I'm still quite surprised about the structure of your SVG. I have to take your word for it that it looks the way it does, and I assume for now that what you put inside the xmlns attributes (namespace attributes) is something starting with "http:" or "urn:". If not, the file is not XML + Namespaces compliant and parsers should raise an error (but in the case of the xmlns attribute, they can be lenient).

Normal SVG files have a structure like in the code example below (from http://www.w3schools.com/svg/radial2.svg). As you can see, it only has one xmlns attribute. It is allowed that the attribute is repeated, but the parts that have a different namespace (i.e., a different attribute value) are not part of the SVG spec and cannot be parsed as SVG.

Your problem in general can be best attacked with XSLT. I'll come up shortly (not sure if it'll be tonight) with an example in both C# and XSLT, which does what you want: get every element + child nodes that have a certain namespace.

-- Abel --

PS: your code is not working because you are asking for an attribute, and there isn't one. You cannot "just" ask for a node with a certain namespace, because a namespace is a scope that starts on the element where it is declared. That means that elements which do not carry the xmlns attribute themselves can still be part of the result of your search. This is in the nature of XML and cannot be changed.


<?xml version="1.0" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">

<svg width="100%" height="100%" version="1.1"
xmlns="http://www.w3.org/2000/svg">

    <defs>
        <radialGradient id="grey_blue" cx="20%" cy="40%" r="50%" fx="50%" fy="50%">
            <stop offset="0%" style="stop-color:rgb(200,200,200);stop-opacity:0"/>
            <stop offset="100%" style="stop-color:rgb(0,0,255);stop-opacity:1"/>
        </radialGradient>
    </defs>

    <ellipse cx="230" cy="200" rx="110" ry="100"
    style="fill:url(#grey_blue)"/>

</svg>
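The scoping rule described in the PS can be checked with a small sketch. This is an illustration only, using LINQ to XML and a trimmed-down version of the SVG sample; the class and method names are hypothetical. It contrasts elements that are in a namespace with elements that declare it themselves.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class NamespaceScopeDemo
{
    // Counts the elements that are *in* the given namespace, whether or
    // not they carry the declaration themselves.
    public static int CountInNamespace(XDocument doc, string namespaceUri)
    {
        XNamespace ns = namespaceUri;
        return doc.Descendants().Count(e => e.Name.Namespace == ns);
    }

    // Counts the elements that carry an explicit xmlns="..." with that value.
    public static int CountDeclaring(XDocument doc, string namespaceUri)
    {
        return doc.Descendants()
                  .Count(e => (string)e.Attribute("xmlns") == namespaceUri);
    }

    public static void Main()
    {
        // Only the root declares the namespace; the children inherit it.
        const string xml = @"<svg xmlns='http://www.w3.org/2000/svg'>
  <defs><radialGradient id='grey_blue' /></defs>
  <ellipse cx='230' cy='200' rx='110' ry='100' />
</svg>";
        const string svgNs = "http://www.w3.org/2000/svg";

        XDocument doc = XDocument.Parse(xml);
        Console.WriteLine("in namespace: " + CountInNamespace(doc, svgNs));
        Console.WriteLine("declaring it: " + CountDeclaring(doc, svgNs));
    }
}
```

For this input, all four elements are in the SVG namespace while only the root declares it, which is why a search "by namespace" also returns elements without an xmlns of their own.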

 
LVL 39

Accepted Solution

by:
abel earned 500 total points
ID: 24769049
I created a little XSLT stylesheet that does the job for you. It is really quite simple, but XSLT is said to have a rather steep learning curve. That's not entirely true, but it requires a different way of thinking that is best taught by either a teacher or a good intro book. You can also check online for some tutorials, but they do not go too deep.

Since your input XML is so odd and your requirements are so non-standard, I give you the following so you have a starting point, but I assume that your real situation requires something different. I hope it gets you going... ;-)


<!-- INPUT (after I made it compliant) -->
<?xml version="1.0" encoding="utf-8"?>
<svg id="test">
    <svg name="sbutest1">
        <svg xmlns="subsubtest1"><!-- save from here 1 -->
            <svg></svg>
            <svg xmlns="subsubsubtest2"><!-- save from here 2 -->
                <svg xmlns="subsubsubsubtest2" />
            </svg><!-- to here 2 -->
        </svg><!-- to here 1 -->
    </svg>
    <svg xmlns="sbutest2"></svg>
</svg>


<!-- OUTPUT (the output-text is not mandatory, of course) -->
<?xml version="1.0" encoding="utf-8"?>
<output xmlns:subns1="subsubtest1" xmlns:subns2="subsubsubtest2">
    <svg xmlns="subsubtest1"><!-- save from here 1 -->
        <svg />
        <svg xmlns="subsubsubtest2">
            <!-- save from here 2 -->
            <svg xmlns="subsubsubsubtest2" />
        </svg><!-- to here 2 -->
    </svg>
    <svg xmlns="subsubsubtest2"><!-- save from here 2 -->
        <svg xmlns="subsubsubsubtest2" />
    </svg>
</output>


<!-- the XSLT, that does all the work, simple but effective -->
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
                xmlns:subns1="subsubtest1"
                xmlns:subns2="subsubsubtest2"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output indent="yes"/>

    <!-- the starting point -->
    <xsl:template match="/">
        <output>
            <xsl:apply-templates />
        </output>
    </xsl:template>

    <!-- the core is this block -->
    <xsl:template match="subns1:svg | subns2:svg">
        <xsl:copy-of select="." />

        <!-- continue only with children in a different namespace, so children that inherit this namespace are not matched again -->
        <xsl:apply-templates select="*[namespace-uri() != namespace-uri(current())]" />
    </xsl:template>

    <!-- do not output anything when there's no match, just keep descending -->
    <xsl:template match="node()" >
        <xsl:apply-templates />
    </xsl:template>
</xsl:stylesheet>

 
LVL 39

Expert Comment

by:abel
ID: 24769061
Here's the code in C#. It is really that simple. Just change the paths to match your setup. Make sure to reference System.Xml.

// simplest way of transforming XML with an XSLT stylesheet
using System.Xml.Xsl;

// passing true enables debugging support for the stylesheet
XslCompiledTransform xslt = new XslCompiledTransform(true);
xslt.Load("transform.xslt");
xslt.Transform("input.xml", "output.xml");
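The transform above writes everything into a single output.xml, whereas the original question asked for one file per saved node. A minimal sketch of that last step follows. It assumes the `<output>` wrapper produced by the stylesheet, and the file naming scheme (an index plus the element's namespace name) is only a guess at what is wanted. SubtreeSplitter, SplitToFiles and the "parts" directory are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Linq;

public static class SubtreeSplitter
{
    // Writes each child element of the wrapper root into its own file and
    // returns the paths written. File names are built from the element's
    // namespace name (an assumption), with characters that are invalid in
    // file names replaced by '_'.
    public static List<string> SplitToFiles(XDocument wrapped, string targetDir)
    {
        var written = new List<string>();
        Directory.CreateDirectory(targetDir);
        char[] invalid = Path.GetInvalidFileNameChars();
        int index = 0;

        foreach (XElement child in wrapped.Root.Elements())
        {
            index++;
            string safe = new string(child.Name.NamespaceName
                .Select(c => invalid.Contains(c) ? '_' : c)
                .ToArray());

            // The index prefix keeps names unique if a namespace repeats.
            string path = Path.Combine(targetDir, index + "_" + safe + ".xml");

            // Adding a parented element to a new document clones it,
            // so the wrapper document is left untouched.
            new XDocument(child).Save(path);
            written.Add(path);
        }
        return written;
    }

    public static void Main()
    {
        // "output.xml" is the file written by the transform above.
        XDocument wrapped = XDocument.Load("output.xml");
        foreach (string path in SplitToFiles(wrapped, "parts"))
        {
            Console.WriteLine(path);
        }
    }
}
```

An alternative would be to skip the XSLT step and save the matching elements straight from the input document; the sketch keeps the transform because it is the accepted answer.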

 

Author Closing Comment

by:andre72
ID: 31599128
abel, this is a really great solution and idea! And overall I learned a lot. Thanks!
 
LVL 39

Expert Comment

by:abel
ID: 24776835
You're welcome, glad it helped