Solved

Handling large XML files (>50MB) in ASP...

Posted on 2004-09-18
Last Modified: 2010-05-18
Hi all,

I have a problem that is beginning to annoy me beyond reason, and besides, I am running out of time and starting to get desperate :-)

For my website (www.kjaerland.dk/DVD) I download and decode several CSV and XML files daily for insertion into my MySQL database. However, some of the XML files I need to get hold of are starting to get very large, in the area of 45MB and above, which means that I am starting to have problems loading them.

One of my limitations is that my site is hosted externally (Microsoft server and MySQL 4.0), so I am not able to alter server settings to fit my needs.

I have this bit of sample code that I use for simple testing:

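    ' Load the remote XML synchronously through the server-safe HTTP stack
    Set objXML = Server.CreateObject("Msxml2.DOMDocument")
    objXML.async = False
    objXML.setProperty "ServerHTTPRequest", True
    objXML.load "http://www. .... music_DK.xml"

    If objXML.parseError.errorCode <> 0 Then
        ' Report why the load failed instead of querying an empty document
        Response.Write "<br><br><strong>" & Server.HTMLEncode(objXML.parseError.reason) & "</strong><br><br>"
    Else
        Set NodeList = objXML.selectNodes("/cdon_products/countries/country/product")
        Response.Write NodeList.length & "<br>"
    End If

    Set objXML = Nothing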

It works fine on small XML files, but when the size gets large I get a "Not enough storage is available to complete this operation." error and nothing gets loaded. I am not at all an XML expert, so I might be overlooking something.

The only processing I need to do on the XML file is to get a few fields per product in the list and then insert it into my MySQL database. Nothing needs to be displayed.

Does anyone have a good idea of how to get around this "memory" limitation and get to the XML data that I so badly need inserted into my database? I have been playing a little with loading the XML with async set to true, but I didn't get close to making that work and immediately got XSL mixed into it (and then it started to go beyond my limited knowledge of the XML world)...

Regards,

Thomas Kjaer
Question by:kjaer
10 Comments
 
LVL 15

Expert Comment

by:dualsoul
ID: 12094681
It's strange that you get such errors; 45MB is not very large for XML processing.

If you only need to get data out of the XML, you can use the SAX API to do that: it does not build a DOM tree in memory, so memory consumption stays small.
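A minimal sketch of the SAX idea described above, written with Python's standard `xml.sax` module rather than MSXML so that it can be run anywhere. It is only an illustration of the technique, not the MSXML API. The `/cdon_products/countries/country/product` layout comes from the question; the field names `album`, `artist` and `price`, the sample data, and the `on_product` callback are assumptions standing in for the real feed and the MySQL insert.

```python
import io
import xml.sax


class ProductHandler(xml.sax.ContentHandler):
    """Collects a few child fields from each <product> as it streams past."""

    FIELDS = ("album", "artist", "price")  # assumed field names

    def __init__(self, on_product):
        super().__init__()
        self.on_product = on_product  # hypothetical callback, e.g. a database insert
        self.current = None           # dict for the product being read, if any
        self.field = None             # name of the field being read, if any
        self.text = []                # text chunks of the current field

    def startElement(self, name, attrs):
        if name == "product":
            self.current = {}
        elif self.current is not None and name in self.FIELDS:
            self.field = name
            self.text = []

    def characters(self, content):
        # Text can arrive in several chunks, so collect and join later.
        if self.field is not None:
            self.text.append(content)

    def endElement(self, name):
        if name == self.field:
            self.current[name] = "".join(self.text).strip()
            self.field = None
        elif name == "product" and self.current is not None:
            self.on_product(self.current)  # hand over one finished product
            self.current = None


def parse_products(source, on_product):
    """Stream `source` (a file name or file object) through the handler."""
    xml.sax.parse(source, ProductHandler(on_product))


# Made-up sample data in the shape the question describes.
SAMPLE = b"""<cdon_products><countries><country>
<product><album>A</album><artist>X</artist><price>10</price></product>
<product><album>B</album><artist>Y</artist><price>12</price></product>
</country></countries></cdon_products>"""

rows = []
parse_products(io.BytesIO(SAMPLE), rows.append)
```

Only one product is held at a time, so memory use does not grow with the size of the file. In a real import, `rows.append` would be replaced by whatever writes a row to the database.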
 
LVL 15

Expert Comment

by:dualsoul
ID: 12094685
Refer to the MSDN documentation of the SAX2 implementation in MSXML.
 

Author Comment

by:kjaer
ID: 12096544
SAX sounds more right, and a bit more complex, even though I have not yet found a useful example. I did however find this comment:

"However, the MS XML SAX implementation requires the installation and registration of user-supplied COM objects on the web server. This is because the SAX parsing engine uses call-backs into the user-supplied COM objects."

Doesn't that mean I will run into new problems since my site is hosted externally, or isn't "installation and registration of user-supplied COM objects" as physical as it sounds?

Anyhow, I will chase this a bit more... (and I'll write to my web host to figure out why the memory limit is so low. I get a similar error if I try to get the file using an aspHTTP component)

- Thomas Kjaer

 
LVL 21

Accepted Solution

by:MogalManic
MogalManic earned 1000 total points
ID: 12098384
Part of the problem is that, with your algorithm, your 45MB document might take almost double the storage in memory because you are loading the data into memory twice. The XML document is loaded into 'objXML' and then the data is loaded into the NodeList variable.

If you don't want to get into SAX, then XSL might also work. This is a simple XSL that turns the data into an HTML table:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:param name="sortColumn">CONSOLEID</xsl:param>
  <xsl:template match="/">
    <html>
      <head></head>
      <body>
        <table>
          <tr>
            <th>Album</th>
            <th>Artist</th>
            <th>Record Label</th>
            <th>Price</th>
          </tr>
          <xsl:for-each select="/cdon_products/countries/country/product">
            <tr>
              <td><xsl:value-of select="album"/></td>
              <td><xsl:value-of select="artist"/></td>
              <td><xsl:value-of select="record_label"/></td>
              <td><xsl:value-of select="price"/></td>
            </tr>
          </xsl:for-each>
        </table>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
 

Author Comment

by:kjaer
ID: 12099399
Point taken, MogalManic. However, it's the first load that's failing, so until I get that working I can't do anything :-)
 
LVL 15

Expert Comment

by:dualsoul
ID: 12099695
Hi MogalManic.
> then XSL might also work.

That's not right. If there really is a memory problem, and processing a 45MB file using DOM causes out-of-memory, then XSLT won't work either, because all XSLT engines build an in-memory DOM-like structure of the input XML, so memory consumption is at least the same.
 
LVL 15

Assisted Solution

by:dualsoul
dualsoul earned 1000 total points
ID: 12099700
>installation and registration of user-supplied COM objects

With the SAX approach you need to implement your own COM objects that listen to SAX events, so you need to register them on the target system as usual (it's COM, you know :)). Nothing more, just ordinary registration in the registry.
 
LVL 21

Expert Comment

by:MogalManic
ID: 12118657
You might try XQuery. This is an XML query language (some of the same functionality as XSLT, but not as popular yet). I don't know how well it works, but it might not load the whole XML document into memory if you are only processing it a little at a time.

http://aspnet.4guysfromrolla.com/articles/071603-1.aspx
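To illustrate the "a little at a time" idea only: the sketch below is not XQuery, and it says nothing about how an XQuery engine manages memory. It uses the incremental parser `xml.etree.ElementTree.iterparse` from Python's standard library, which hands over each element as soon as its closing tag is read. The `product` element name comes from the question; the child field names and the sample data are made up.

```python
import io
import xml.etree.ElementTree as ET


def iter_products(source, tag="product"):
    """Yield one dict per <product>, releasing each element after use."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            # Copy out the child values before the element is emptied.
            yield {child.tag: (child.text or "").strip() for child in elem}
            # Drop the children and text; an empty shell stays in the parent.
            elem.clear()


# Made-up sample data in the shape the question describes.
SAMPLE = b"""<cdon_products><countries><country>
<product><album>A</album><price>10</price></product>
<product><album>B</album><price>12</price></product>
</country></countries></cdon_products>"""

products = list(iter_products(io.BytesIO(SAMPLE)))
```

Because `iter_products` is a generator, a caller can insert each row into the database as it arrives instead of building the full list as the sample does.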

Question has a verified solution.
