Solved

Splitting a large XML file into smaller ones

Posted on 2004-04-10
Last Modified: 2008-03-10
I have a large XML file similar to the one below:

<BookStore>
<Book>book1</Book>
<Book>book2</Book>
.....
<Book>book1000</Book>
<Book>book1001</Book>
......
</BookStore>

The <Book> element may occur thousands of times. I want to create several smaller XML files, each containing, say, only 250 <Book> elements. As far as I know, there are two ways to do this: either use XSLT, or load the entire XML into a DataSet and then create smaller subset DataSets. Since I am not an expert in either, any help is appreciated.
Question by:nadarajan

Accepted Solution

by:
dualsoul earned 250 total points
ID: 10798331
It can be done in a number of ways.

1. You can do it without any XML processing at all. Just use regular expressions to match the <Book>....</Book> strings and write them to separate files, 250 per file.

2. You can do it through the SAX or XmlReader interfaces. Read the Book elements in batches of 250 and write each batch to a different file. Personally, I'd prefer the SAX solution.

3. You can use the DOM model and traverse the Book elements, saving them to files 250 at a time. This one is easy to write :)

4. You can write XSLT to output to different files, something like this:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
      
      <xsl:template match="/">
            <xsl:for-each select="BookStore/Book">
                  <!-- position() is 1-based, so Books 1, 251, 501, ... each start a new group of 250 -->
                  <xsl:if test="position() mod 250 = 1">
                        <!-- change output file name here (processor-specific) -->
                  </xsl:if>
                  
                  <xsl:copy-of select="."/>
            </xsl:for-each>
      </xsl:template>
</xsl:stylesheet>

However, the mechanism for switching the output file depends on the particular XSLT processor (XSLT 1.0 itself has no standard way to write several output documents), so if you tell us which XSLT processor you are using, we can help you more.

So, you see, there are a number of options :)
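
To make the streaming idea from option 2 a bit more concrete, here is a minimal sketch in Python. The thread does not name a language, so Python is only an assumption, and it uses the standard library's `xml.etree.ElementTree.iterparse` in place of SAX or XmlReader. The function name `split_books` and the `books_NNN.xml` output names are made up for illustration. It also assumes every <Book> is a direct child of <BookStore>, as in the sample from the question.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def split_books(source, out_dir, chunk_size=250):
    """Stream <Book> elements from `source` and write `chunk_size` of them per file.

    Returns the list of output paths. Names here are illustrative only.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    chunk = []

    def flush():
        # Wrap the buffered <Book> elements in a fresh <BookStore> root.
        root = ET.Element("BookStore")
        root.extend(chunk)
        path = out_dir / f"books_{len(written) + 1:03d}.xml"
        ET.ElementTree(root).write(str(path), encoding="utf-8", xml_declaration=True)
        written.append(path)
        chunk.clear()

    context = ET.iterparse(str(source), events=("start", "end"))
    _event, source_root = next(context)  # first event: start of <BookStore>

    for event, elem in context:
        # An "end" event means the element has been read completely.
        if event == "end" and elem.tag == "Book":
            chunk.append(elem)
            if len(chunk) == chunk_size:
                flush()
                # Drop already-written children so memory use stays bounded.
                source_root.clear()

    if chunk:
        flush()  # last, possibly shorter, file
    return written
```

Calling `split_books("bookstore.xml", "out")` (both names hypothetical) would write `out/books_001.xml`, `out/books_002.xml`, and so on, each with its own <BookStore> root. Only about one batch of <Book> elements is held in memory at a time, which is the main reason to prefer a streaming reader over DOM for very large inputs.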

Assisted Solution

by:metalmickey
metalmickey earned 250 total points
ID: 10805650
dualsoul recently fixed something for me that divides XML docs.

http://www-106.ibm.com/developerworks/xml/library/x-tipdivbig/

I think it uses a Java class...

Anyway, here's my original problem and dualsoul's solution:

http://oldlook.experts-exchange.com:8080/Web/Web_Languages/XML/Q_20908134.html

No points please

MM