Solved

Split XML file up

Posted on 2016-08-24
Last Modified: 2016-09-12
I have a large XML file I'm trying to read into MySQL. The import works, but after a few thousand records I get a FastCGI error. According to my host I cannot change the timeout, so I have to import the XML document in under 60 seconds.

So I'm wondering if it's possible to split the XML file into batches of 100 records and then process each batch in its own request until the end. My problem is how to split the XML file up easily.

My current idea is to read the XML file line by line, counting </record> tags, and after every 100 of them save those records into a file, then carry on with the next 100. Reading the file one line at a time might also reduce memory usage, as some of the XML files are massive.

Can anyone suggest another way of doing this, or is this going to be the best way?
Question by:tonelm54

Accepted Solution

by: Ray Paseur earned 500 total points (awarded by participants)
ID: 41769365
Yes, you can split an XML file, but whether such a process will work well, or easily, is a data-dependent question. Please post an example of one of the XML files, or a link to one, and we can try to show you how to parse it. XML files do not really come in "lines" because whitespace (EOL characters, tabs, blanks, etc.) between the tags is not required by the standard. It's often omitted to make the XML document smaller. A multi-line XML document is easier for humans to read, but we can't depend on that sort of structure when we're writing code to handle the XML.
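To illustrate the point about lines: a streaming parser splits on elements rather than line breaks, so it works even when the whole document is on one line. Below is a minimal sketch in Python (the thread is about PHP, where the built-in XMLReader class offers a comparable pull-parsing approach). The names `split_records`, `record_tag` and `wrapper_tag` are hypothetical, and the sketch assumes the `<record>` elements are direct children of the root and are not namespaced, since no sample file was posted.

```python
import io
import xml.etree.ElementTree as ET


def split_records(source, batch_size=100, record_tag="record", wrapper_tag="records"):
    """Yield XML strings that each hold up to batch_size record elements.

    source is a file name or a binary file object. The tag names are
    assumptions; adjust them to match the real document.
    """
    batch = []
    root = None
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if root is None:
            root = elem  # the first "start" event is always the document root
        if event == "end" and elem.tag == record_tag:
            elem.tail = None  # ignore any whitespace that follows the record
            batch.append(ET.tostring(elem, encoding="unicode"))
            root.clear()  # detach processed records so memory use stays low
            if len(batch) == batch_size:
                yield f"<{wrapper_tag}>{''.join(batch)}</{wrapper_tag}>"
                batch = []
    if batch:  # final, possibly short, batch
        yield f"<{wrapper_tag}>{''.join(batch)}</{wrapper_tag}>"


if __name__ == "__main__":
    # A single-line document with 250 records, with no EOL characters at all.
    body = "".join(f"<record><id>{i}</id></record>" for i in range(250))
    doc = io.BytesIO(f"<records>{body}</records>".encode("utf-8"))
    for number, chunk in enumerate(split_records(doc), start=1):
        print(number, chunk.count("<record>"))
```

Each yielded string is a small, well-formed document that could be written to its own file and imported in a separate request.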

If your data provider offers the option of JSON, the file will be somewhat smaller.

In any case, processing large files is not something HTTP requests and PHP scripts were made for, so the best solution may lie in the direction of requesting several smaller files, instead of one large file.

If this is a file that comes from one of your own applications, consider building a file chain -- a collection of files with a signal tag that says whether the end-of-file has been reached.  The signal can say "end-of-file" or it can say the URL of the next XML document in the chain.  Then the PHP script can request each file in succession and process them one-at-a-time, until the end of file has been reached.
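The file chain described above can be sketched as follows, again in Python. This is only an illustration under stated assumptions: the signal tag is called `<next>` here (a hypothetical name), it holds either the URL of the next document or the text "end-of-file", and `fetch` is a caller-supplied function that returns the XML text for a URL, so the sketch runs without any network access.

```python
import xml.etree.ElementTree as ET

END_MARKER = "end-of-file"  # signal value taken from the description above


def walk_chain(first_url, fetch, max_files=1000):
    """Yield every <record> element from a chain of XML documents.

    fetch(url) must return the XML text of one document. The <next> tag
    name is an assumption; a missing or empty <next> ends the chain.
    """
    url = first_url
    seen = set()
    while url != END_MARKER:
        if url in seen or len(seen) >= max_files:
            raise RuntimeError(f"chain loops or is too long at {url!r}")
        seen.add(url)
        root = ET.fromstring(fetch(url))
        yield from root.iter("record")
        signal = (root.findtext("next") or "").strip()
        url = signal or END_MARKER


if __name__ == "__main__":
    # Two in-memory documents stand in for files on a server.
    files = {
        "a.xml": "<records><record>1</record><record>2</record><next>b.xml</next></records>",
        "b.xml": "<records><record>3</record><next>end-of-file</next></records>",
    }
    for record in walk_chain("a.xml", files.__getitem__):
        print(record.text)
```

The loop guard is there because a chain that points back at an earlier file would otherwise never finish.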

Expert Comment

by:Ray Paseur
ID: 41793917
No response to request for test data, but the theory and practice of a solution is explained fully.