Solved

Splitting a large XML file

Posted on 2004-04-10
Last Modified: 2010-04-15
I have a large XML file similar to the one below:

<BookStore>
<Book>book1</Book>
<Book>book2</Book>
.....
<Book>book1000</Book>
<Book>book1001</Book>
......
</BookStore>

The <Book> element may occur thousands of times. I want to create several smaller XML files, each containing, say, only 250 <Book> elements. As far as I know, there are two ways to do this: using XSLT, or loading the entire XML into a DataSet and then creating smaller subset DataSets. Since I am not an expert in either, any help is appreciated.
Question by:nadarajan
Accepted Solution

by: AlexFM (earned 250 total points)
ID: 10800685
This console application takes the XML file name as a command-line parameter and creates a number of output XML files named C:\1.xml, C:\2.xml, ... It is very basic and contains no exception handling, but it has all the required code. Create a new C# console application and paste this code into it:

using System;
using System.Xml;
using System.Text;


namespace XMLSplit
{
    class Class1
    {
        [STAThread]
        static void Main(string[] args)
        {
            if ( args.GetLength(0) == 1 )
                SplitXMLFile(args[0], 250);
        }

        static void SplitXMLFile(String sFileName, int nNodesInFile)
        {
            XmlDocument document = new XmlDocument();
            document.Load(sFileName);

            XmlNodeList nodes = document.GetElementsByTagName("Book");

            int nFiles = (nodes.Count + nNodesInFile - 1)/nNodesInFile;

            for ( int i = 0; i < nFiles; i++ )
            {
                int nStart = i*nNodesInFile;
                int nEnd = (i+1)*nNodesInFile - 1;
                if ( nEnd > nodes.Count - 1 )
                    nEnd = nodes.Count - 1;

                WriteOutputFile(i+1, nodes, nStart, nEnd);
            }
        }

        static void WriteOutputFile(int nFileNumber,
            XmlNodeList nodes, int nStart, int nEnd)
        {
            XmlDocument doc = new XmlDocument();

            StringBuilder s = new StringBuilder();
            s.Append("<?xml version=\"1.0\"?>\n");
            s.Append("<BookStore>\n");
            s.Append("</BookStore>");

            doc.LoadXml(s.ToString());

            for ( int i = nStart; i <= nEnd; i++ )
            {
                // ImportNode with deep = true copies the whole <Book> element
                // (attributes and child elements too), not just its text.
                XmlNode bookElement = doc.ImportNode(nodes[i], true);

                doc.DocumentElement.AppendChild(bookElement);
            }

            doc.Save(String.Format("C:\\{0}.xml", nFileNumber));
        }
    }
}

Assisted Solution

by: ptmcomp (earned 250 total points)
ID: 10805102
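If the code above consumes too much memory, consider using XmlReader and XmlWriter, possibly with XPath. XmlDocument loads the entire file into memory, whereas XmlReader and XmlWriter process it as a stream.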
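
A rough sketch of what the streaming approach might look like, assuming the same <BookStore>/<Book> layout as in the question. The class name StreamingXmlSplit, the Split method and its outputDir parameter are made up for illustration, and like the code above it has only minimal error handling:

```csharp
using System;
using System.IO;
using System.Xml;

static class StreamingXmlSplit
{
    // Streams inputPath and writes 1.xml, 2.xml, ... into outputDir, each
    // holding at most booksPerFile <Book> elements. Returns the file count.
    // (Hypothetical helper; the names are not from the original answer.)
    public static int Split(string inputPath, string outputDir, int booksPerFile)
    {
        int fileCount = 0;
        int booksInCurrent = 0;
        XmlWriter writer = null;

        try
        {
            using (XmlReader reader = XmlReader.Create(inputPath))
            {
                reader.MoveToContent();   // position on the root element
                reader.Read();            // step inside the root

                while (!reader.EOF)
                {
                    if (reader.NodeType == XmlNodeType.Element && reader.Name == "Book")
                    {
                        if (writer == null)
                        {
                            // Start the next output file with its own root.
                            fileCount++;
                            writer = XmlWriter.Create(
                                Path.Combine(outputDir, fileCount + ".xml"));
                            writer.WriteStartElement("BookStore");
                        }

                        // WriteNode copies the whole <Book> element and leaves
                        // the reader on the node after it, so no Read() here.
                        writer.WriteNode(reader, true);
                        booksInCurrent++;

                        if (booksInCurrent == booksPerFile)
                        {
                            writer.WriteEndElement();
                            writer.Dispose();
                            writer = null;
                            booksInCurrent = 0;
                        }
                    }
                    else
                    {
                        reader.Read();    // skip whitespace, end tags, etc.
                    }
                }
            }

            // Close the root of the last, partially filled file.
            if (writer != null)
                writer.WriteEndElement();
        }
        finally
        {
            if (writer != null)
                writer.Dispose();
        }

        return fileCount;
    }

    static void Main(string[] args)
    {
        if (args.Length == 1)
            Split(args[0], Directory.GetCurrentDirectory(), 250);
    }
}
```

Only one <Book> element is held in memory at a time, so memory use should stay roughly constant regardless of the size of the input file. The output goes to the current directory here rather than C:\, which is just a choice made for this sketch.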