Solved

convert xml file to csv

Posted on 2015-01-20
8
162 Views
Last Modified: 2015-01-26
I have some Excel files with elements and attributes all of different values.
I would like to have Elements and attributes as column headings.  Where attributes are used as headings the element name should proceed the attribute name with an _ separating them.
The value of the element should have the element name followed by the word "value in the header as in AdminStatus_Value in the example below.

For example

<OrganizationId>4295856130</OrganizationId><AdminStatus effectiveFrom="2009-07-21T17:02:06">Published</AdminStatus><OrganizationName effectiveFrom="2005-08-18T04:00:00" organizationNameTypeCode="AKA" languageId="505062" organizationNameLocalNormalized="TX" sourceType="11" sourceLink="http://www.nyse.com/about/listed/tx.html" sourceUpdateDate="2012-04-03T07:22:36">TX</OrganizationName>

should have as headings:
OrganizationID,
AdminStatus_EffectiveFrom,
AdminStatusValue
OrganizationName_EffectiveFrom,
OrganizationName_organizationNameTypeCode,
OrganizationName_LanguageID,
OrganizationName_organizationNameLocalNormalized,
OrganizationName_SourceType
OrganizationName_SourceLink
OrganizationName_SourceUpdateDate
0
Comment
Question by:AlHal2
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 5
  • 3
8 Comments
 
LVL 36

Expert Comment

by:Miguel Oz
ID: 40560889
For guidance check this M$ link
I do not know the root element of your doco, but to access attributes use el.Attribute("effectiveFrom") or elements el.Element("OrganizationID"),
To save csv string use:
System.IO.File.WriteAllText("F:\\*******\\yourFilename.csv", csv);

For further help. we need an xml sample file to custom the link code but it should be straight forwards to code it.
0
 

Author Comment

by:AlHal2
ID: 40561682
Here is the first part of the file.

<?xml version='1.0' encoding='utf-8'?><env:ContentEnvelope majVers='3' minVers='2.3' pubStyle='Incremental' xmlns:env='http://data.schemas.tfn.thomson.com/Envelope/2008-05-01/' xmlns:cmn='http://data.schemas.financial.thomsonreuters.com/Common/2009-09-01/' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xmlns='http://oa.schemas.tfn.thomson.com/Organization/2010-05-01/' xsi:schemaLocation='http://oa.schemas.tfn.thomson.com/Organization/2010-05-01/ OrganizationMaster.xsd'><env:Header><env:Info><env:Id>urn:uuid:843a1d51-445a-403e-b062-88430a6aff26</env:Id><env:TimeStamp>2015-01-19T13:54:23</env:TimeStamp></env:Info></env:Header><env:Body majVers='3' minVers='3.0' contentSet='OA'>
<env:ContentItem action="Overwrite"><env:Data xsi:type="OrganizationDataItem"><Organization entityCreatedDate="2009-07-21T17:02:06" entityModifiedDate="2015-01-19T05:31:48" isOrganizationVerified="true" isOrganizationManaged="true" isOrganizationFinancialLocalBranch="false" isOrganizationFinancialForeignBranch="false" isOrganizationOtherBranchandDivision="false" entityLastReviewedDate="2014-07-17T04:00:00" organizationTier="1"><OrganizationId>4295856130</OrganizationId><AdminStatus effectiveFrom="2009-07-21T17:02:06">Published</AdminStatus><OrganizationName effectiveFrom="2005-08-18T04:00:00" organizationNameTypeCode="AKA" languageId="505062" organizationNameLocalNormalized="TX"
0
 

Author Comment

by:AlHal2
ID: 40561698
This is the code I'm using.  I get this error message
Object reference not set to an instance of an object.

using (StreamWriter sw = new StreamWriter("C:\\OAOrganization\\OrganizationNameParsed2.txt"))
                {
                    XElement custOrd = XElement.Load("C:\\OAOrganization\\OrganizationName.txt");
                    string csv =
                        (from el in custOrd.Element("Body").Elements("ContentItem").Elements("Data").Elements("Organization").Elements("OrganizationName")
                         select
                             String.Format("{0},{1},{2},{3},{4}",
                                 (string)el.Attribute("effectiveFrom"),
                                 (string)el.Element("effectiveTo"),
                                 (string)el.Element("organizationNameTypeCode"),
                                 (string)el.Element("languageId"),
                             //(string)el.Element("organizationNameLocalNormalized"),
                             //(string)el.Element("FullAddress").Element("Address"),
                             //(string)el.Element("FullAddress").Element("City"),
                             //(string)el.Element("FullAddress").Element("Region"),
                             //(string)el.Element("FullAddress").Element("PostalCode"),
                             //(string)el.Element("FullAddress").Element("Country"),
                                 Environment.NewLine
                             )
                        )
                        .Aggregate(
                            new StringBuilder(),
                            (sb, s) => sb.Append(s),
                            sb => sb.ToString()
                        );
                    sw.WriteLine(csv);


                }

Open in new window

0
Salesforce Has Never Been Easier

Improve and reinforce salesforce training & adoption using WalkMe's digital adoption platform. Start saving on costly employee training by creating fast intuitive Walk-Thrus for Salesforce. Claim your Free Account Now

 
LVL 36

Expert Comment

by:Miguel Oz
ID: 40563026
Could you post a valid xml file (basically that contains a complete node or nodes of what you need)?
0
 

Author Comment

by:AlHal2
ID: 40563980
0
 

Author Comment

by:AlHal2
ID: 40564544
If the above is impossible then I have this code to put each element and attribute on a separate row.  The only thing is it gives the OrganizationID as published which is the value for AdminStatus.
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using System.Xml;
using System.IO;
using System.Xml.Linq;

namespace ParseXML
{

    public partial class Form1 : Form
    {
        string FileText = "";
        int CharCt;
        string NodeName="";
        string OrgID;
        public Form1()
        {
            InitializeComponent();
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            {
                string xmlNode = "";
                using (StreamReader sr = File.OpenText("C:\\OAOrganization\\File.txt"))
                {
                    using (StreamWriter sw = new StreamWriter("C:\\OAOrganization\\OrganizationNameParsed.txt"))
                    {
                        try
                        {
                            while (sr.Peek() >= 0)
                            {


                                xmlNode = sr.ReadToEnd();
                            }
                        }
                        catch (Exception ex)
                        {
                            lblErrs.Text = ex.Message;
                        }


                        XmlReader xReader = XmlReader.Create(new StringReader(xmlNode));
                        while (xReader.Read())
                        {
                            

                            switch (xReader.NodeType)
                            {

                                case XmlNodeType.Element:
                                    NodeName = "";
                                    //OrgID = "";
                                    if (xReader.Name.ToUpper() != "ENV:HEADER"
                                        && xReader.Name.ToUpper() != "ENV:CONTENTENVELOPE"
                                        && xReader.Name.ToUpper() != "ENV:ID"
                                        && xReader.Name.ToUpper() != "ENV:INFO"
                                        && xReader.Name.ToUpper() != "ENV:TIMESTAMP"
                                        && xReader.Name.ToUpper() != "ENV:BODY"
                                        && xReader.Name.ToUpper() != "ENV:CONTENTITEM"
                                        && xReader.Name.ToUpper() != "ENV:DATA"
                                        && xReader.Name.ToUpper() != "ORGANIZATION"
                                        && xReader.Name.ToUpper().Contains("ADMIN") == false
                                        && xReader.Name.ToUpper().Contains("ISPUBLIC") == false
                                        && xReader.Name.ToUpper().Contains("IPODATE") == false
                                        && xReader.Name.ToUpper().Contains("ADDRESS") == false
                                        && xReader.Name.ToUpper().Contains("PHONE") == false
                                        && xReader.Name.ToUpper().Contains("FOUNDED") == false
                                        && xReader.Name.ToUpper().Contains("JURISDICTION") == false
                                        && xReader.Name.ToUpper().Contains("SUBTYPE") == false
                                        && xReader.Name.ToUpper().Contains("PROVIDER") == false)
                                    {
                                        //sw.Write("<" + xReader.Name + ">|");

                                        //sw.Write("<" + xReader.Name + ">|" + xReader.ReadElementContentAsString() + "|</"+ xReader.Name + ">\r\n");
                                        
                                        //sw.Write(xReader.Name + "|" + xReader.Value + "|" + xReader.Name + "\r\n");
                                        NodeName = xReader.Name ;
                                        if (xReader.Name == "OrganizationId")
                                        {
                                            OrgID = xReader.ReadElementContentAsString();
                                        }
                                        for (int attInd = 0; attInd < xReader.AttributeCount; attInd++)
                                        {

                                            if (xReader.Name.ToUpper().Contains("ADDRESS") == false
                                        && xReader.Name.ToUpper().Contains("EFFECTIVE") == false
                                        && xReader.Name.ToUpper().Contains("ADMIN") == false
                                        && xReader.Name.ToUpper().Contains("SOURCE") == false
                                        && xReader.Name.ToUpper().Contains("LANGUAGE") == false
                                        && xReader.Name.ToUpper().Contains("ISPUBLIC") == false
                                        && xReader.Name.ToUpper().Contains("IPODATE") == false)
                                            
                                            {
                                                if (xReader.Value != "")
                                                {
                                                    sw.Write(OrgID + "|" + NodeName + "_" + xReader.Name + "|" + xReader.Value + "|" + NodeName + "_" + xReader.Name + "\r\n");
                                                    

                                                }

                                            }
                                            xReader.MoveToAttribute(attInd);
                                            //NodeName = xReader.Name;
                                        }
                                        
                                   
                                    }
                                    
                                    break;
                                case XmlNodeType.Text:
                                    //NodeName = xReader.Name;
                                    if (NodeName != "")
                                    {
                                        //xReader.
                                        sw.Write(OrgID + "|" + NodeName + "|" + xReader.Value  + "|" + NodeName);
                                        sw.Write("\r\n");
                                    }
                                    break;
                                case XmlNodeType.EndElement:
                                   
                                    break;
                            
                            }


                        }
                    }
                }
                using (StreamReader sr = File.OpenText("C:\\OAOrganization\\OrganizationNameParsed.txt"))
                {
                    using (StreamWriter sw = new StreamWriter("C:\\OAOrganization\\OrganizationNameParsed2.txt"))
                    {
                        try
                        {
                            while (sr.Peek() >= 0)
                            {


                                FileText = sr.ReadLine();
                                CharCt = FileText.IndexOf("<Organ");
                                if (CharCt > 0)
                                {
                                    FileText = FileText.Substring(CharCt, FileText.Length - CharCt);
                                    sw.Write(FileText);
                                    sw.Write("\r\n");
                                }
                            }
                        }
                        catch (Exception ex)
                        {
                            lblErrs.Text = ex.Message;
                        }
                    }
                }

                


            }
            Environment.Exit(0);
            
        }





    }
}

Open in new window

0
 
LVL 36

Accepted Solution

by:
Miguel Oz earned 500 total points
ID: 40568776
Here is a sample code that reads your file and convert it to CSV string:
            XElement doc = XElement.Load(@"f:\temp\C--OAOrganization-File.xml");
            string csv = (from el in doc.Descendants()
                          let ns = String.Format("{{{0}}}",el.Name.NamespaceName)
                          where el.Name.LocalName == "Organization"
                          select String.Format("{0},{1},{2},{3}",
                          (string)el.Element(ns + "OrganizationId"),
                          (string)el.Element(ns + "AdminStatus").Attribute("effectiveFrom"),
                          (string)el.Element(ns + "AdminStatus"),
                          Environment.NewLine
                          )
                          )
                          .Aggregate( new StringBuilder(),  (sb, s) => sb.Append(s), sb => sb.ToString()
                          );
            Console.WriteLine(csv);

Open in new window


The sample code contains how to read element value (OrganizationId) and an attribute value(effectiveFrom). It should be easy for you to add the extra info.
The key difference from MSDN example is that you need the namespace string (ns) to address all elements.
0
 

Author Closing Comment

by:AlHal2
ID: 40570236
That's great.  Thanks.
0

Featured Post

PeopleSoft Has Never Been Easier

PeopleSoft Adoption Made Smooth & Simple!

On-The-Job Training Is made Intuitive & Easy With WalkMe's On-Screen Guidance Tool.  Claim Your Free WalkMe Account Now

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Exception Handling is in the core of any application that is able to dignify its name. In this article, I'll guide you through the process of writing a DRY (Don't Repeat Yourself) Exception Handling mechanism, using Aspect Oriented Programming.
Calculating holidays and working days is a function that is often needed yet it is not one found within the Framework. This article presents one approach to building a working-day calculator for use in .NET.
In this brief tutorial Pawel from AdRem Software explains how you can quickly find out which services are running on your network, or what are the IP addresses of servers responsible for each service. Software used is freeware NetCrunch Tools (https…
If you’ve ever visited a web page and noticed a cool font that you really liked the look of, but couldn’t figure out which font it was so that you could use it for your own work, then this video is for you! In this Micro Tutorial, you'll learn yo…

728 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question