• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 214
  • Last Modified:

convert xml file to csv

I have some Excel files with elements and attributes all of different values.
I would like to have Elements and attributes as column headings.  Where attributes are used as headings the element name should proceed the attribute name with an _ separating them.
The value of the element should have the element name followed by the word "value in the header as in AdminStatus_Value in the example below.

For example

<OrganizationId>4295856130</OrganizationId><AdminStatus effectiveFrom="2009-07-21T17:02:06">Published</AdminStatus><OrganizationName effectiveFrom="2005-08-18T04:00:00" organizationNameTypeCode="AKA" languageId="505062" organizationNameLocalNormalized="TX" sourceType="11" sourceLink="http://www.nyse.com/about/listed/tx.html" sourceUpdateDate="2012-04-03T07:22:36">TX</OrganizationName>

should have as headings:
OrganizationID,
AdminStatus_EffectiveFrom,
AdminStatusValue
OrganizationName_EffectiveFrom,
OrganizationName_organizationNameTypeCode,
OrganizationName_LanguageID,
OrganizationName_organizationNameLocalNormalized,
OrganizationName_SourceType
OrganizationName_SourceLink
OrganizationName_SourceUpdateDate
0
AlHal2
Asked:
AlHal2
  • 5
  • 3
1 Solution
 
Miguel OzSoftware EngineerCommented:
For guidance check this M$ link
I do not know the root element of your doco, but to access attributes use el.Attribute("effectiveFrom") or elements el.Element("OrganizationID"),
To save csv string use:
System.IO.File.WriteAllText("F:\\*******\\yourFilename.csv", csv);

For further help. we need an xml sample file to custom the link code but it should be straight forwards to code it.
0
 
AlHal2Author Commented:
Here is the first part of the file.

<?xml version='1.0' encoding='utf-8'?><env:ContentEnvelope majVers='3' minVers='2.3' pubStyle='Incremental' xmlns:env='http://data.schemas.tfn.thomson.com/Envelope/2008-05-01/' xmlns:cmn='http://data.schemas.financial.thomsonreuters.com/Common/2009-09-01/' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xmlns='http://oa.schemas.tfn.thomson.com/Organization/2010-05-01/' xsi:schemaLocation='http://oa.schemas.tfn.thomson.com/Organization/2010-05-01/ OrganizationMaster.xsd'><env:Header><env:Info><env:Id>urn:uuid:843a1d51-445a-403e-b062-88430a6aff26</env:Id><env:TimeStamp>2015-01-19T13:54:23</env:TimeStamp></env:Info></env:Header><env:Body majVers='3' minVers='3.0' contentSet='OA'>
<env:ContentItem action="Overwrite"><env:Data xsi:type="OrganizationDataItem"><Organization entityCreatedDate="2009-07-21T17:02:06" entityModifiedDate="2015-01-19T05:31:48" isOrganizationVerified="true" isOrganizationManaged="true" isOrganizationFinancialLocalBranch="false" isOrganizationFinancialForeignBranch="false" isOrganizationOtherBranchandDivision="false" entityLastReviewedDate="2014-07-17T04:00:00" organizationTier="1"><OrganizationId>4295856130</OrganizationId><AdminStatus effectiveFrom="2009-07-21T17:02:06">Published</AdminStatus><OrganizationName effectiveFrom="2005-08-18T04:00:00" organizationNameTypeCode="AKA" languageId="505062" organizationNameLocalNormalized="TX"
0
 
AlHal2Author Commented:
This is the code I'm using.  I get this error message
Object reference not set to an instance of an object.

using (StreamWriter sw = new StreamWriter("C:\\OAOrganization\\OrganizationNameParsed2.txt"))
                {
                    XElement custOrd = XElement.Load("C:\\OAOrganization\\OrganizationName.txt");
                    string csv =
                        (from el in custOrd.Element("Body").Elements("ContentItem").Elements("Data").Elements("Organization").Elements("OrganizationName")
                         select
                             String.Format("{0},{1},{2},{3},{4}",
                                 (string)el.Attribute("effectiveFrom"),
                                 (string)el.Element("effectiveTo"),
                                 (string)el.Element("organizationNameTypeCode"),
                                 (string)el.Element("languageId"),
                             //(string)el.Element("organizationNameLocalNormalized"),
                             //(string)el.Element("FullAddress").Element("Address"),
                             //(string)el.Element("FullAddress").Element("City"),
                             //(string)el.Element("FullAddress").Element("Region"),
                             //(string)el.Element("FullAddress").Element("PostalCode"),
                             //(string)el.Element("FullAddress").Element("Country"),
                                 Environment.NewLine
                             )
                        )
                        .Aggregate(
                            new StringBuilder(),
                            (sb, s) => sb.Append(s),
                            sb => sb.ToString()
                        );
                    sw.WriteLine(csv);


                }

Open in new window

0
Cloud Class® Course: C++ 11 Fundamentals

This course will introduce you to C++ 11 and teach you about syntax fundamentals.

 
Miguel OzSoftware EngineerCommented:
Could you post a valid xml file (basically that contains a complete node or nodes of what you need)?
0
 
AlHal2Author Commented:
0
 
AlHal2Author Commented:
If the above is impossible then I have this code to put each element and attribute on a separate row.  The only thing is it gives the OrganizationID as published which is the value for AdminStatus.
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using System.Xml;
using System.IO;
using System.Xml.Linq;

namespace ParseXML
{

    public partial class Form1 : Form
    {
        string FileText = "";
        int CharCt;
        string NodeName="";
        string OrgID;
        public Form1()
        {
            InitializeComponent();
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            {
                string xmlNode = "";
                using (StreamReader sr = File.OpenText("C:\\OAOrganization\\File.txt"))
                {
                    using (StreamWriter sw = new StreamWriter("C:\\OAOrganization\\OrganizationNameParsed.txt"))
                    {
                        try
                        {
                            while (sr.Peek() >= 0)
                            {


                                xmlNode = sr.ReadToEnd();
                            }
                        }
                        catch (Exception ex)
                        {
                            lblErrs.Text = ex.Message;
                        }


                        XmlReader xReader = XmlReader.Create(new StringReader(xmlNode));
                        while (xReader.Read())
                        {
                            

                            switch (xReader.NodeType)
                            {

                                case XmlNodeType.Element:
                                    NodeName = "";
                                    //OrgID = "";
                                    if (xReader.Name.ToUpper() != "ENV:HEADER"
                                        && xReader.Name.ToUpper() != "ENV:CONTENTENVELOPE"
                                        && xReader.Name.ToUpper() != "ENV:ID"
                                        && xReader.Name.ToUpper() != "ENV:INFO"
                                        && xReader.Name.ToUpper() != "ENV:TIMESTAMP"
                                        && xReader.Name.ToUpper() != "ENV:BODY"
                                        && xReader.Name.ToUpper() != "ENV:CONTENTITEM"
                                        && xReader.Name.ToUpper() != "ENV:DATA"
                                        && xReader.Name.ToUpper() != "ORGANIZATION"
                                        && xReader.Name.ToUpper().Contains("ADMIN") == false
                                        && xReader.Name.ToUpper().Contains("ISPUBLIC") == false
                                        && xReader.Name.ToUpper().Contains("IPODATE") == false
                                        && xReader.Name.ToUpper().Contains("ADDRESS") == false
                                        && xReader.Name.ToUpper().Contains("PHONE") == false
                                        && xReader.Name.ToUpper().Contains("FOUNDED") == false
                                        && xReader.Name.ToUpper().Contains("JURISDICTION") == false
                                        && xReader.Name.ToUpper().Contains("SUBTYPE") == false
                                        && xReader.Name.ToUpper().Contains("PROVIDER") == false)
                                    {
                                        //sw.Write("<" + xReader.Name + ">|");

                                        //sw.Write("<" + xReader.Name + ">|" + xReader.ReadElementContentAsString() + "|</"+ xReader.Name + ">\r\n");
                                        
                                        //sw.Write(xReader.Name + "|" + xReader.Value + "|" + xReader.Name + "\r\n");
                                        NodeName = xReader.Name ;
                                        if (xReader.Name == "OrganizationId")
                                        {
                                            OrgID = xReader.ReadElementContentAsString();
                                        }
                                        for (int attInd = 0; attInd < xReader.AttributeCount; attInd++)
                                        {

                                            if (xReader.Name.ToUpper().Contains("ADDRESS") == false
                                        && xReader.Name.ToUpper().Contains("EFFECTIVE") == false
                                        && xReader.Name.ToUpper().Contains("ADMIN") == false
                                        && xReader.Name.ToUpper().Contains("SOURCE") == false
                                        && xReader.Name.ToUpper().Contains("LANGUAGE") == false
                                        && xReader.Name.ToUpper().Contains("ISPUBLIC") == false
                                        && xReader.Name.ToUpper().Contains("IPODATE") == false)
                                            
                                            {
                                                if (xReader.Value != "")
                                                {
                                                    sw.Write(OrgID + "|" + NodeName + "_" + xReader.Name + "|" + xReader.Value + "|" + NodeName + "_" + xReader.Name + "\r\n");
                                                    

                                                }

                                            }
                                            xReader.MoveToAttribute(attInd);
                                            //NodeName = xReader.Name;
                                        }
                                        
                                   
                                    }
                                    
                                    break;
                                case XmlNodeType.Text:
                                    //NodeName = xReader.Name;
                                    if (NodeName != "")
                                    {
                                        //xReader.
                                        sw.Write(OrgID + "|" + NodeName + "|" + xReader.Value  + "|" + NodeName);
                                        sw.Write("\r\n");
                                    }
                                    break;
                                case XmlNodeType.EndElement:
                                   
                                    break;
                            
                            }


                        }
                    }
                }
                using (StreamReader sr = File.OpenText("C:\\OAOrganization\\OrganizationNameParsed.txt"))
                {
                    using (StreamWriter sw = new StreamWriter("C:\\OAOrganization\\OrganizationNameParsed2.txt"))
                    {
                        try
                        {
                            while (sr.Peek() >= 0)
                            {


                                FileText = sr.ReadLine();
                                CharCt = FileText.IndexOf("<Organ");
                                if (CharCt > 0)
                                {
                                    FileText = FileText.Substring(CharCt, FileText.Length - CharCt);
                                    sw.Write(FileText);
                                    sw.Write("\r\n");
                                }
                            }
                        }
                        catch (Exception ex)
                        {
                            lblErrs.Text = ex.Message;
                        }
                    }
                }

                


            }
            Environment.Exit(0);
            
        }





    }
}

Open in new window

0
 
Miguel OzSoftware EngineerCommented:
Here is a sample code that reads your file and convert it to CSV string:
            XElement doc = XElement.Load(@"f:\temp\C--OAOrganization-File.xml");
            string csv = (from el in doc.Descendants()
                          let ns = String.Format("{{{0}}}",el.Name.NamespaceName)
                          where el.Name.LocalName == "Organization"
                          select String.Format("{0},{1},{2},{3}",
                          (string)el.Element(ns + "OrganizationId"),
                          (string)el.Element(ns + "AdminStatus").Attribute("effectiveFrom"),
                          (string)el.Element(ns + "AdminStatus"),
                          Environment.NewLine
                          )
                          )
                          .Aggregate( new StringBuilder(),  (sb, s) => sb.Append(s), sb => sb.ToString()
                          );
            Console.WriteLine(csv);

Open in new window


The sample code contains how to read element value (OrganizationId) and an attribute value(effectiveFrom). It should be easy for you to add the extra info.
The key difference from MSDN example is that you need the namespace string (ns) to address all elements.
0
 
AlHal2Author Commented:
That's great.  Thanks.
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

Join & Write a Comment

Featured Post

Cloud Class® Course: Python 3 Fundamentals

This course will teach participants about installing and configuring Python, syntax, importing, statements, types, strings, booleans, files, lists, tuples, comprehensions, functions, and classes.

  • 5
  • 3
Tackle projects and never again get stuck behind a technical roadblock.
Join Now