Parsing a very large nested XML file

I'm using the attached C# code to parse the attached XML file. The problem is that each Organization element contains multiple OrganizationName elements, because organizations change their names over time. The code only picks up the most recent name, whereas I would like the full history.
Ideally, I'd also like to be able to filter on specific elements and/or attributes. However, this is lower priority, as I can just iterate through the file created by the code.
Attachments: C--OAOrganization-File2.txt, C--OAOrganization-SourceCode.txt
Asked by: AlHal2

AlHal2 (Author) commented:
This code from a colleague worked for me.

using System;
using System.Xml;

namespace ReadXMLfromFile
{
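    /// <summary>
    /// Streams an XML file of organizations and writes one pipe-delimited row per OrganizationName element.
    /// </summary>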
    class pdaXMLParser
    {
        static void Main(string[] args)
        {
            XmlTextReader reader = new XmlTextReader("c:\\temp\\file2.xml");
            string csvRoot = "";
            string sep = "|";

            //write the output file header string (overwrite any existing file)
            using (System.IO.StreamWriter file = new System.IO.StreamWriter(@"C:\temp\OrganizationNameParsed.txt"))
            {
                file.WriteLine("OrganizationID|entityCreatedDate|entityModifiedDate|OrganizationName|OrganizationName_effectiveFrom|OrganizationName_effectiveTo|OrganizationName_organizationNameTypeCode|organizationName_LanguageID|OrganizationName_organizationNameLocalNormalized");
            }

            while (reader.Read())
            {
                // Only detect start elements.
		        if (reader.IsStartElement())
		        {
		            // Get element name and switch on it.
		            switch (reader.Name)
		            {
			        case "Data":
			            // Detect this element.
			            //Console.WriteLine("Start <data> element.");
			            break;

                    case "Organization":
                        //start a new csv string for use later... 
                        csvRoot = "";

                        // Detect the Organization element and extract the required attributes
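                        string attribute = reader["entityCreatedDate"];
                        if (attribute != null)
                        {
                            csvRoot += attribute;
                        }
                        // If the attribute is missing, leave the field empty. No separator is added here:
                        // the next field always adds its own leading separator, so adding one would shift the columns.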

                        attribute = reader["entityModifiedDate"];
			            if (attribute != null)
			            {
                            //Console.WriteLine("  entityModifiedDate: " + attribute);
                            csvRoot += sep + attribute;
			            }
                        else { csvRoot += sep; }

			            break;

                    case "OrganizationId":
                        // Detect the OrganizationId element and read its text content from the next node
                        if (reader.Read())
                        {
                            //Console.WriteLine("  Organization ID: " + reader.Value.Trim());
                            //prefix the root data with the OrgID
                            csvRoot = reader.Value.Trim() + sep + csvRoot;
                        }
                        else { csvRoot = sep + csvRoot; }

                        break;

                    case "OrganizationName":
                        // Detect the OrganizationName element and extract the required data from the attributes
                        
                        //reset the details field as there may be >1 Organization name per OrganizationID
                        string csvNameDetails = "";

                        attribute = reader["effectiveFrom"];
			            if (attribute != null)
			            {
                            //Console.WriteLine("  effectiveFrom: " + attribute);
                            csvNameDetails += sep + attribute;
			            }
                        else { csvNameDetails += sep; }
        
                        attribute = reader["effectiveTo"];
                        if (attribute != null)
                        {
                            //Console.WriteLine("  effectiveTo: " + attribute);
                            csvNameDetails += sep + attribute;
                        }
                        else
                        { csvNameDetails += sep; }

                        attribute = reader["organizationNameTypeCode"];
			            if (attribute != null)
			            {
                            //Console.WriteLine("  organizationNameTypeCode: " + attribute);
                            csvNameDetails += sep + attribute;
			            }
                        else
                        { csvNameDetails += sep; }

                        attribute = reader["languageId"];
                        if (attribute != null)
                        {
                            //Console.WriteLine("  languageId: " + attribute);
                            csvNameDetails += sep + attribute;
                        }
                        else
                        { csvNameDetails += sep; }

                        attribute = reader["organizationNameLocalNormalized"];
                        if (attribute != null)
                        {
                            //Console.WriteLine("  organizationNameLocalNormalized: " + attribute);
                            csvNameDetails += sep + attribute;
                        }
                        else
                        { csvNameDetails += sep; }

                        // read ahead to get the Organization Name text and prefix this to the attribute data
                        if (reader.Read())
                        {
                            //Console.WriteLine("  Organization Name: " + reader.Value.Trim());
                            csvNameDetails = reader.Value.Trim() + csvNameDetails;
                        }
                        // If nothing could be read, the name field stays empty; csvNameDetails already
                        // starts with a separator, so no extra one is appended.

                        //write the root details for the OrganizationID along with the current OrganizationName details
                       // Console.WriteLine(csvRoot+sep+csvNameDetails);
                        
                        //write the output file data string (append to an existing file)
                        using (System.IO.StreamWriter file = new System.IO.StreamWriter(@"C:\temp\OrganizationNameParsed.txt",true))
                        {
                            file.WriteLine(csvRoot + sep+ csvNameDetails);
                        }
                        break;

		            }
		        }
	    
                }
            Console.ReadLine();
        }
    }
}

 
Fernando Soto (Retired) commented:
Hi AlHal2;

Is something like this what you are looking for?
// Requires: using System; using System.Linq; using System.Xml.Linq;
// Load document into memory
XDocument xdoc = XDocument.Load(@"Path to XML File\C--OAOrganization-File2.xml");

// XML NameSpace used in documents
XNamespace ns = xdoc.Root.GetDefaultNamespace();
XNamespace env = xdoc.Root.GetNamespaceOfPrefix("env");

// Query for needed information
var results = (from o in xdoc.Descendants(ns + "Organization")
               from orgn in o.Elements(ns + "OrganizationName") 
               select new
               {
                   Id = o.Element(ns + "OrganizationId").Value,
                   Name = orgn.Value
               });
               
// Pad the header so it lines up with the 10-digit Ids printed below
Console.WriteLine("{0,-10}   {1}", "Id", "Name");
foreach (var org in results)
{
    Console.WriteLine("{0}   {1}", org.Id, org.Name);
}               


Result of the above LINQ query:
Id           Name
4295904866   S. Y. BANCORP, INC.
4295904866   STOCK YARDS BANCORP, INC.
4295904866   Stock Yards
4295904882   SAUL CENTERS, INC.
4295904882   Saul Centers
4295904889   SCHUFF STEEL CO
4295904889   SCHUFF INTERNATIONAL, INC.
4295904889   Schuff Intl

 
AlHal2 (Author) commented:
The advantage of the XmlTextReader-based C# code is that it parses the file piece by piece. If I load the entire 8 GB file into memory, the program will not run.
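
One possible middle ground between the two approaches in this thread is to stream the file with XmlReader while loading only one Organization element at a time into an XElement, so that the LINQ to XML syntax is available per record without holding the whole file in memory. The following is only a sketch under assumptions: the class name OrgStream, the method names, the typeCode filter parameter, the chosen output columns, and the output file name are all hypothetical, and it assumes (as the code above does) that OrganizationId and OrganizationName are direct children of Organization and that the attributes are not namespace-qualified.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.Linq;

static class OrgStream
{
    // Yields every element with the given local name, one at a time.
    // XNode.ReadFrom consumes the whole element and leaves the reader on the
    // node that follows it, so Read() is only called when nothing was consumed.
    public static IEnumerable<XElement> StreamElements(XmlReader reader, string localName)
    {
        reader.MoveToContent();
        while (!reader.EOF)
        {
            if (reader.NodeType == XmlNodeType.Element && reader.LocalName == localName)
                yield return (XElement)XNode.ReadFrom(reader);
            else
                reader.Read();
        }
    }

    // Builds one pipe-delimited row per OrganizationName:
    // id|effectiveFrom|effectiveTo|organizationNameTypeCode|name
    // typeCode is a hypothetical optional filter; null means "keep every name".
    public static IEnumerable<string> NameRows(XmlReader reader, string typeCode = null)
    {
        foreach (XElement org in StreamElements(reader, "Organization"))
        {
            XNamespace ns = org.Name.Namespace;
            string id = (string)org.Element(ns + "OrganizationId");
            foreach (XElement name in org.Elements(ns + "OrganizationName"))
            {
                string code = (string)name.Attribute("organizationNameTypeCode");
                if (typeCode != null && code != typeCode)
                    continue;
                // Missing attributes are null, which string.Join writes as empty fields.
                yield return string.Join("|",
                    (id ?? "").Trim(),
                    (string)name.Attribute("effectiveFrom"),
                    (string)name.Attribute("effectiveTo"),
                    code,
                    name.Value.Trim());
            }
        }
    }

    static void Main()
    {
        // Input path as in the code above; the output file name is hypothetical.
        using (XmlReader reader = XmlReader.Create(@"c:\temp\file2.xml"))
        using (var file = new System.IO.StreamWriter(@"C:\temp\OrganizationNameStreamed.txt"))
        {
            foreach (string row in NameRows(reader))
                file.WriteLine(row);
        }
    }
}
```

Because only one Organization subtree is materialized at a time, memory use should depend on the size of the largest single record rather than on the size of the file. The output file is also opened once for the whole run instead of once per row.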
 
AlHal2 (Author) commented:
It works.
Question has a verified solution.