Convert XML to CSV 3

Please see

http://www.experts-exchange.com/Programming/Languages/C_Sharp/Q_28603967.html
http://www.experts-exchange.com/Programming/Languages/C_Sharp/Q_28600021.html

The file is very large, so these methods are killing the memory even with a filter.  Any suggestions?
Also, I'd like to put some OrganizationIDs into a text file and have the program filter the output based on that text file.
AlHal2 asked:
Miguel Oz (Software Engineer) commented:
I do not think SQL will help here; it only adds another moving part to the task.

To avoid loading the whole XML file into memory, you can process it node by node using the following MSDN suggestion.

Basically you load only the Organization nodes one at a time with this method:
        // Streams matching elements one at a time; only the current element
        // (never the whole document) is held in memory.
        static IEnumerable<XElement> SimpleStreamAxis(
                       string filename, string matchName)
        {
            using (XmlReader reader = XmlReader.Create(filename))
            {
                reader.MoveToContent();
                while (!reader.EOF)
                {
                    if (reader.NodeType == XmlNodeType.Element
                        && reader.LocalName == matchName)
                    {
                        // ReadFrom consumes the element and leaves the reader
                        // on the node that follows it, so do not call Read()
                        // again here or adjacent elements would be skipped.
                        XElement el = XElement.ReadFrom(reader) as XElement;
                        if (el != null)
                            yield return el;
                    }
                    else
                    {
                        reader.Read();
                    }
                }
            }
        }


Then in the query code replace the following
XElement doc = XElement.Load(@"f:\temp\C--OAOrganization-File.xml");
string csv = (from el in doc.Descendants()
                          where el.Name.LocalName == "Organization"


with:
string csv = (from el in SimpleStreamAxis(@"f:\temp\C--OAOrganization-File.xml", "Organization")



The streaming call replaces both the doc instance and the where condition in your original code.
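Putting the two pieces together, here is a minimal end-to-end sketch (the element names and the input path are taken from the snippets in this thread; the output path is an assumption, so adjust both to your data). It also writes each CSV row straight to disk with a StreamWriter instead of accumulating everything in one string, so neither the XML input nor the CSV output is ever held in memory in full:

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

class XmlToCsv
{
    static void Main()
    {
        using (StreamWriter writer = new StreamWriter(@"f:\temp\organizations.csv"))
        {
            foreach (XElement el in SimpleStreamAxis(
                @"f:\temp\C--OAOrganization-File.xml", "Organization"))
            {
                XNamespace ns = el.Name.Namespace;
                XElement status = el.Element(ns + "AdminStatus");
                // Each Organization element lives in memory only for this iteration.
                writer.WriteLine("{0},{1},{2}",
                    (string)el.Element(ns + "OrganizationId"),
                    status == null ? "" : (string)status.Attribute("effectiveFrom"),
                    (string)status);
            }
        }
    }

    // Same streaming helper as above: yields one matching element at a time.
    static IEnumerable<XElement> SimpleStreamAxis(string filename, string matchName)
    {
        using (XmlReader reader = XmlReader.Create(filename))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element &&
                    reader.LocalName == matchName)
                {
                    XElement el = XElement.ReadFrom(reader) as XElement;
                    if (el != null)
                        yield return el;
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }
}
```

With this shape, memory use stays roughly constant no matter how large the file grows, since only one Organization element and one output line exist at a time.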
 
Fernando Soto (Retired) commented:
Hi AlHal2;

I made a couple of changes to the code snippet so that you can filter on more than one OrganizationId at a time. The code below assumes one OrganizationId per line in the text file and reads them into an array in memory. If your file is formatted differently you will need to extract the IDs into an array or a List. You will also need to change the file path and name to suit your environment.

// The values to filter on.
// The file OrigIds.txt in this case contains one OrganizationId per line.
// If your file uses a different format you will need to extract the IDs
// so that you have one ID per element of the array or List<>.
string[] origIDs = File.ReadAllLines(@"C:\Working Directory\OrigIds.txt");
string typeName = "AKA";
string effectiveTo = "2005-08-18T04:00:00";

XElement doc = XElement.Load(@"C:\Working Directory\OAOrganization-File.xml");
string csv = (from el in doc.Descendants()
              where el.Name.LocalName == "Organization"
              let ns = String.Format("{{{0}}}", el.Name.NamespaceName)
              // Guard against missing OrganizationName/AdminStatus elements
              // before reading their attributes, otherwise the query throws
              // a NullReferenceException on incomplete records.
              let orgName = el.Element(ns + "OrganizationName")
              let status = el.Element(ns + "AdminStatus")
              where origIDs.Contains((string)el.Element(ns + "OrganizationId")) ||
                    (orgName != null &&
                     ((string)orgName.Attribute("organizationNameTypeCode") == typeName ||
                      (string)orgName.Attribute("effectiveTo") == effectiveTo))
              select String.Format("{0},{1},{2},{3}",
                  (string)el.Element(ns + "OrganizationId"),
                  status == null ? null : (string)status.Attribute("effectiveFrom"),
                  (string)status,
                  Environment.NewLine))
             .Aggregate(new StringBuilder(), (sb, s) => sb.Append(s), sb => sb.ToString());
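One refinement worth considering if the ID list grows: Contains on a string[] scans the whole array for every Organization element the query examines, while a HashSet<string> answers in constant time. The change is a single line (same file path and format as in the snippet above):

```csharp
// Same file format: one OrganizationId per line.
// HashSet<string>.Contains is O(1); string[].Contains is O(n) per lookup,
// which adds up when the query tests thousands of elements.
HashSet<string> origIDs = new HashSet<string>(
    File.ReadAllLines(@"C:\Working Directory\OrigIds.txt"));
```

The query itself is unchanged: origIDs.Contains(...) now hits the hash set instead of scanning an array.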


 
Fernando Soto (Retired) commented:
Also can you please explain what you mean by this statement, "The file is very large, so these methods are killing the memory even with a filter"?
 
AlHal2 (Author) commented:
Thanks for this.
I mean the program goes through a 30 MB file in seconds, but an 8 GB file runs for over an hour. The memory usage is enormous.
 
AlHal2 (Author) commented:
I think the program treats the file like one long string.
 
Fernando Soto (Retired) commented:
The whole 8 GB file is being loaded into memory so that the query can operate on it. If there is not enough physical memory, some of it has to be paged out to virtual memory, which lengthens the run time because those parts must be reloaded. If this file continues to grow, the situation will only get worse.
 
AlHal2 (Author) commented:
Would you be able to suggest some SQL to ingest the file into an SQL Server database?
I'm open to any other suggestions.
 
Fernando Soto (Retired) commented:
Storing the data in a SQL database would help, since the database works on its own tables and returns only the information you need. The issue now is getting the data into tables in the database, and I don't know of a program that does this directly from your XML.
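For what it's worth, one way to do this without a separate import tool is a sketch, not a tested solution: the table name, column list, and connection string below are assumptions, and SimpleStreamAxis is the streaming helper from Miguel's comment above. The idea is to read the XML with the same streaming reader and push rows to SQL Server in batches with SqlBulkCopy:

```csharp
using System.Data;
using System.Data.SqlClient;
using System.Xml.Linq;

class BulkLoader
{
    // Assumes a table dbo.Organizations(OrganizationId, AdminStatus, EffectiveFrom)
    // already exists in the target database.
    static void LoadIntoSqlServer(string xmlPath, string connectionString)
    {
        DataTable table = new DataTable();
        table.Columns.Add("OrganizationId", typeof(string));
        table.Columns.Add("AdminStatus", typeof(string));
        table.Columns.Add("EffectiveFrom", typeof(string));

        using (SqlBulkCopy bulk = new SqlBulkCopy(connectionString))
        {
            bulk.DestinationTableName = "dbo.Organizations";

            foreach (XElement el in SimpleStreamAxis(xmlPath, "Organization"))
            {
                XNamespace ns = el.Name.Namespace;
                XElement status = el.Element(ns + "AdminStatus");
                table.Rows.Add(
                    (string)el.Element(ns + "OrganizationId"),
                    (string)status,
                    status == null ? null : (string)status.Attribute("effectiveFrom"));

                // Flush in batches so the DataTable never holds the whole file.
                if (table.Rows.Count == 10000)
                {
                    bulk.WriteToServer(table);
                    table.Clear();
                }
            }
            if (table.Rows.Count > 0)
                bulk.WriteToServer(table);
        }
    }
}
```

Once the rows are in a table, filtering by OrganizationId becomes an indexed WHERE clause instead of an in-memory scan.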
 
AlHal2 (Author) commented:
Thanks.