  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 1099

Parse XML file and find nodes with a special attribute

Hi,

I have SVG documents that look like this:
<svg>
   <svg xmlns="test">
       <svg xmlns="sbutest1"></svg>
       <svg xmlns="sbutest2"></svg>
  </svg>
   <svg xmlns="test2">
       <svg xmlns="sbutest3"></svg>
   </svg>
</svg>

Now I have to parse it and save the innermost child nodes into their own files (subtest1.xml, subtest2.xml, etc.).
Any idea how to do this?

Thanks,

Andre
 
abel commented:
Not sure what the "this" refers to in that last sentence. You say that you parsed it and saved them in separate files. What is your next task that you have trouble with? Or did you have problems with that task, if so, what? Can you show your C# code and point to the place where it goes wrong or that you have trouble with?
 
dericstone commented:
The code below should get you started.
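// requires: using System.IO; using System.Xml;
XmlDocument doc = new XmlDocument();
doc.LoadXml(xml);
int fileIndex = 0;
foreach (XmlNode node1 in doc.ChildNodes)
  foreach (XmlNode node2 in node1.ChildNodes)
    foreach (XmlNode node3 in node2.ChildNodes)
      // save node 3 to a file (the file name pattern is only an example)
      File.WriteAllText("subtest" + (++fileIndex) + ".xml", node3.OuterXml);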

 
andre72 (Author) commented:
Thanks dericstone, that is nearly what I'm looking for, but what I showed was just an example. More levels of child nodes are also possible, e.g.

   <svg xmlns="test">
       <svg xmlns="sbutest1">
            <svg xmlns="subsubtest1">
                <svg xmlns="subsubsubtest1"></svg>
                 <svg xmlns="subsubsubtest2">
                         <svg xmlns="subsubsubsubtest2" />
                 </svg>
           </svg>
      </svg>

would also be possible ...
       <svg xmlns="sbutest2"></svg>
  </svg>
 
andre72 (Author) commented:
Sorry, copy-paste mistake ;-)
 
abel commented:
andre, I asked the questions in my first comment for a reason. Your code shows a very unlikely scenario for SVG data (is it SVG at all?). You ask for attributes, but there is not a single attribute in your XML code. Instead, there are only namespace attributes which, despite the name, are not attributes.

Nested foreach loops are not the right approach here, because they only work for a fixed depth. LINQ to XML, or an XPath-based method such as SelectNodes or XPathSelectElements, is a better fit.

Please take the time to give us better insight into what you want, so we can help you more precisely.
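To illustrate the LINQ to XML route mentioned above, here is a minimal sketch. It assumes the goal is to find every element that carries its own default-namespace declaration (`xmlns="..."`), at any depth. The class and method names are made up for this example and do not come from the thread.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class NamespaceScan
{
    // Hypothetical helper: returns, in document order, every element that
    // declares a default namespace on itself (xmlns="..."), at any depth.
    public static List<XElement> ElementsDeclaringDefaultNamespace(XDocument doc)
    {
        return doc.Descendants()
                  .Where(e => e.Attributes().Any(
                      a => a.IsNamespaceDeclaration && a.Name.LocalName == "xmlns"))
                  .ToList();
    }
}
```

Called as `NamespaceScan.ElementsDeclaringDefaultNamespace(XDocument.Load(path))`, this visits the whole tree without one hand-written loop per nesting level. Elements that merely inherit a namespace from an ancestor are not returned, because they have no declaration of their own.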
 
andre72 (Author) commented:
I'm sorry, abel, you're right. I was in a bit of a hurry when I wrote my first post.
And as I'm not good with SVG files, I thought xmlns="" was an attribute like in "normal" XML files...
OK, here we go again...
The XML (you're right, it is always SVG) looks like this:
   <svg xmlns="test">
       <svg xmlns="sbutest1">
            <svg xmlns="subsubtest1">
                <svg xmlns="subsubsubtest1"></svg>
                 <svg xmlns="subsubsubtest2">
                         <svg xmlns="subsubsubsubtest2" />
                 </svg>
           </svg>
      </svg>
       <svg xmlns="sbutest2"></svg>
  </svg>

So I have to read it recursively, but only the nodes with xmlns="xyz" are needed.
When I find one, I need to save it together with all of its child nodes.
I also include an example (just a test of the recursion), which is a bit of a mystery to me:
It works in principle, but doc.Load(file); takes about 10 seconds for a 4 KB SVG file.
With doc.XmlResolver = null; it is much faster, but then there are no more recursive calls?!?
I'm also not sure whether this is really a good solution, as I'm a novice with XML...
Thanks,

Andre
XmlDocument doc = new XmlDocument();
doc.XmlResolver = null;   // do not resolve external resources such as the DTD
doc.Load(file);
GetNode(doc.DocumentElement);
 
        private void GetNode(XmlNode inXmlNode)
        {
            // Attributes is null for non-element nodes (text, comments)
            XmlAttributeCollection xmlAttrs = inXmlNode.Attributes;
            XmlNode xmlAttr = xmlAttrs != null ? xmlAttrs.GetNamedItem("xmlns") : null;
 
            if (inXmlNode.HasChildNodes && xmlAttr != null)
            {
                Console.WriteLine(inXmlNode.OuterXml.Trim());
                // note: children are only visited when this node has an xmlns attribute
                XmlNodeList nodeList = inXmlNode.ChildNodes;
                for (int i = 0; i < nodeList.Count; i++)
                {
                    GetNode(nodeList[i]);
                }
            }
            else if (xmlAttr != null)
            {
                Console.WriteLine(inXmlNode.OuterXml.Trim());
            }
        }
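On the slow `doc.Load(file)` call: a likely cause, though not confirmed in this thread, is that the parser fetches the external DTD named in the file's DOCTYPE over the network. The sketch below shows one way to skip DTD processing while loading. It assumes .NET 4.0 or later, where `XmlReaderSettings.DtdProcessing` is available. The class and method names are invented for this example.

```csharp
using System.IO;
using System.Xml;

public static class SvgLoader
{
    // Hypothetical helper: loads XML into an XmlDocument while ignoring any
    // DOCTYPE, so no external DTD is downloaded or parsed.
    public static XmlDocument LoadWithoutDtd(TextReader input)
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.DtdProcessing = DtdProcessing.Ignore; // skip the DOCTYPE entirely
        settings.XmlResolver = null;                   // never resolve external resources

        XmlDocument doc = new XmlDocument();
        using (XmlReader reader = XmlReader.Create(input, settings))
        {
            doc.Load(reader);
        }
        return doc;
    }
}
```

A call such as `SvgLoader.LoadWithoutDtd(File.OpenText(file))` would replace the `doc.Load(file)` line. One trade-off: without the DTD, any default attribute values or entities it defines are not applied, so the resulting tree can differ from one loaded with the DTD.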

 
andre72 (Author) commented:
Argh, another error: xmlns is of course not always present.
<svg id="test">
       <svg name="sbutest1">
            <svg xmlns="subsubtest1"> <!-- save from here 1 -->
                <svg></svg>
                 <svg xmlns="subsubsubtest2"> <!-- save from here 2 -->
                         <svg xmlns="subsubsubsubtest2" />
                 </svg> <!-- to here 2 -->
           </svg> <!-- to here 1 -->
      </svg>
       <svg xmlns="sbutest2"></svg>
  </svg>
 
abel commented:
I'm quite surprised still about the structure of your svg. I have to take your word for it that it looks the way it does, and I assume for now that what you put inside the xmlns-attributes (namespace attributes) is something starting with "http:" or "urn:". If not, the file is not XML + Namespaces compliant and parsers should raise an error (but in the case of the xmlns attribute, they can be lenient).

Normal SVG files have a structure like in the code example below (from http://www.w3schools.com/svg/radial2.svg). As you can see, it only has one xmlns attribute. It is allowed that the attribute is repeated, but the parts that have a different namespace (i.e., a different attribute value) are not part of the SVG spec and cannot be parsed as SVG.

Your problem in general can be best attacked with XSLT. I'll come up shortly (not sure if it'll be tonight) with an example in both C# and XSLT, which does what you want: get every element + child nodes that have a certain namespace.

-- Abel --

PS: your code is not working because you are asking for an attribute, and there isn't any. You cannot "just" ask for a node with a certain namespace, because a namespace is a scope that starts on the element where it is declared. That means that elements which do not carry the xmlns attribute themselves can still be part of the result of your search. This is in the nature of XML and cannot be changed.
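The scoping rule described in the PS can be seen in a small sketch. The element names and the `urn:example` namespace are invented for the illustration. The child element has no xmlns attribute of its own, yet it still reports the namespace declared on its parent.

```csharp
using System.Xml;

public static class NamespaceScope
{
    // Hypothetical demo: reports the namespace of <inner> and whether it
    // carries its own xmlns attribute.
    public static string Describe()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<outer xmlns=\"urn:example\"><inner/></outer>");

        XmlElement inner = (XmlElement)doc.DocumentElement.FirstChild;

        // <inner> is in scope of the declaration on <outer>, so it inherits
        // the namespace even though it has no xmlns attribute itself.
        return inner.NamespaceURI + " / own xmlns attribute: " + inner.HasAttribute("xmlns");
    }
}
```

`NamespaceScope.Describe()` returns `urn:example / own xmlns attribute: False`, which is why filtering on the presence of an xmlns attribute and filtering on an element's namespace give different results.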


<?xml version="1.0" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
 
<svg width="100%" height="100%" version="1.1"
xmlns="http://www.w3.org/2000/svg">
 
    <defs>
        <radialGradient id="grey_blue" cx="20%" cy="40%" r="50%" fx="50%" fy="50%">
            <stop offset="0%" style="stop-color:rgb(200,200,200);stop-opacity:0"/>
            <stop offset="100%" style="stop-color:rgb(0,0,255);stop-opacity:1"/>
        </radialGradient>
    </defs>
 
    <ellipse cx="230" cy="200" rx="110" ry="100"
    style="fill:url(#grey_blue)"/>
 
</svg>

 
abel commented:
I created a little XSLT that does the job for you. It is really quite simple, but XSLT is said to have a rather steep learning curve. That's not entirely true, but it requires a different way of thinking that is best taught by either a teacher or a good intro book. You can also check online for some tutorials, but they do not go very deep.

Since your input XML is so odd and your requirements are so non-standard, I give you the following so you have a starting point, but I assume that your real situation requires something different. I hope it gets you going... ;-)


<!-- INPUT (after I made it compliant) -->
<?xml version="1.0" encoding="utf-8"?>
<svg id="test">
    <svg name="sbutest1">
        <svg xmlns="subsubtest1"><!-- save from here 1 -->
            <svg></svg>
            <svg xmlns="subsubsubtest2"><!-- save from here 2 -->
                <svg xmlns="subsubsubsubtest2" />
            </svg><!-- to here 2 -->
        </svg><!-- to here 1 -->
    </svg>
    <svg xmlns="sbutest2"></svg>
</svg>
 
 
<!-- OUTPUT (the output-text is not mandatory, of course) -->
<?xml version="1.0" encoding="utf-8"?>
<output xmlns:subns1="subsubtest1" xmlns:subns2="subsubsubtest2">
    <svg xmlns="subsubtest1"><!-- save from here 1 -->
        <svg />
        <svg xmlns="subsubsubtest2">
            <!-- save from here 2 -->
            <svg xmlns="subsubsubsubtest2" />
        </svg><!-- to here 2 -->
    </svg>
    <svg xmlns="subsubsubtest2"><!-- save from here 2 -->
        <svg xmlns="subsubsubsubtest2" />
    </svg>
</output>
 
 
<!-- the XSLT, that does all the work, simple but effective -->
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
                xmlns:subns1="subsubtest1"
                xmlns:subns2="subsubsubtest2"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 
    <xsl:output indent="yes"/>
 
    <!-- the starting point -->
    <xsl:template match="/">
        <output>
            <xsl:apply-templates />
        </output>
    </xsl:template>
    
    <!-- the core is this block -->
    <xsl:template match="subns1:svg | subns2:svg">
        <xsl:copy-of select="." />
        
        <!-- continue only with child elements in a different namespace, so children that inherit this namespace are not copied a second time -->
        <xsl:apply-templates select="*[namespace-uri() != namespace-uri(current())]" />
    </xsl:template>
 
    <!-- output nothing when there's no match, just keep descending -->
    <xsl:template match="node()" >
        <xsl:apply-templates />
    </xsl:template>
</xsl:stylesheet>

 
abel commented:
Here's the code in C#. It really is that simple. Just change the paths to match your setup. Make sure to reference System.Xml and import the System.Xml.Xsl namespace.

// simplest way of transforming XML with an XSLT stylesheet
using System.Xml.Xsl;
 
XslCompiledTransform xslt = new XslCompiledTransform(true);   // true = enable XSLT debugging
xslt.Load("transform.xslt");
xslt.Transform("input.xml", "output.xml");
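The transform above writes all matches into a single output.xml, while the original question asked for one file per match. A possible follow-up step, sketched here under the assumption that each direct child of the `<output>` root should become its own file, could look like this. The class name, method name and the `subtest<N>.xml` naming pattern are illustrative only.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Linq;

public static class OutputSplitter
{
    // Hypothetical helper: writes each child element of the root into its
    // own file inside 'directory' and returns the paths that were written.
    public static List<string> SplitChildren(XDocument doc, string directory)
    {
        var paths = new List<string>();
        int index = 1;
        foreach (XElement child in doc.Root.Elements())
        {
            string path = Path.Combine(directory, "subtest" + index + ".xml");
            // Wrapping the child in a new XDocument copies it, so the source
            // tree stays untouched and the file gets its own XML declaration.
            new XDocument(child).Save(path);
            paths.Add(path);
            index++;
        }
        return paths;
    }
}
```

For example, `OutputSplitter.SplitChildren(XDocument.Load("output.xml"), ".")` would write subtest1.xml, subtest2.xml and so on into the current directory.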

 
andre72 (Author) commented:
abel, this is a really great solution and idea! Above all, I learned a lot from it. Thanks!
 
abel commented:
You're welcome, glad it helped.