paulott

Fast XML parsing with VB 6.0

Hello.

I'm using Visual Basic 6.0, with a project reference to MS XML v3.0.

I'm new to using VB's parser with XML and would appreciate some help speeding up the following code.

The XML data is roughly in this format:

______________________________

<?xml version="1.0" encoding="UTF-8" ?>
<Outside>
<Outer>
   <Title>Information</Title>
   <Title2>Galore</Title2>
   <Middle>
       <Inner>
          <Field1>Some</Field1>
          <Field2>Info</Field2>
          <Field3>Here</Field3>
       </Inner>
       <Inner>
          <Field1>And</Field1>
          <Field3>More</Field3>
       </Inner>
   </Middle>
</Outer>
<Outer>
   <Title>Part II</Title>
   <Middle>
       <Inner>
          <Field2>Data</Field2>
          <Field4>All</Field4>
       </Inner>
       <Inner>
          <Field2>Over</Field2>
          <Field3>The Place</Field3>
       </Inner>
   </Middle>
</Outer>
</Outside>
______________________________

Currently, my program is as follows:
______________________________

  Dim articleDoc As New DOMDocument, outerNode As IXMLDOMNode
  Dim innerNode As IXMLDOMNode, tempNode As IXMLDOMNode
  articleDoc.preserveWhiteSpace = False
  articleDoc.validateOnParse = False
  articleDoc.resolveExternals = False
  'Load the data into the XML parser
  articleDoc.loadXML (strData)
  For Each outerNode In articleDoc.selectNodes("Outside/Outer")
    Set tempNode = outerNode.selectSingleNode("Title")
    If Not (tempNode Is Nothing) Then
      strTitle = tempNode.Text
    Else
      strTitle = ""
    End If
    'We iterate through each Inner
    For Each innerNode In outerNode.selectNodes("Middle/Inner")
        'Pull out the information
        Set tempNode = innerNode.selectSingleNode("Field1")
        If Not (tempNode Is Nothing) Then
          strField1 = "" & tempNode.Text
        Else
          strField1 = ""
        End If
        Set tempNode = innerNode.selectSingleNode("Field2")
        If Not (tempNode Is Nothing) Then
          strField2 = "" & tempNode.Text
        Else
          strField2 = ""
        End If
    Next innerNode
   
    'Do work
   
  Next outerNode

______________________________

I would appreciate it if someone could help me speed this up. The real data has about ten <Inner> per <Outer>, and about ten <Outer> per XML document. I'm going to be doing about ten thousand of these XML documents at a time.

Thanks in advance.

- Paul
DominicCronin

Firstly, if your situation allows you to use MSXML4 instead of 3, then do so - it is significantly faster.
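
A minimal sketch of what switching versions might look like, assuming the project reference is changed to "Microsoft XML, v4.0", which exposes the version-specific class MSXML2.DOMDocument40. The function name LoadArticle is hypothetical; the property settings mirror the ones in the question.

```vb
Option Explicit

' Hypothetical helper: parses one document with the MSXML 4.0 DOM.
' Assumes a project reference to "Microsoft XML, v4.0".
Public Function LoadArticle(ByVal strData As String) As MSXML2.DOMDocument40
    Dim articleDoc As MSXML2.DOMDocument40
    Set articleDoc = New MSXML2.DOMDocument40

    articleDoc.async = False
    articleDoc.preserveWhiteSpace = False
    articleDoc.validateOnParse = False
    articleDoc.resolveExternals = False

    ' loadXML returns False on a parse failure; report the reason.
    If Not articleDoc.loadXML(strData) Then
        Err.Raise vbObjectError + 1, "LoadArticle", articleDoc.parseError.reason
    End If

    Set LoadArticle = articleDoc
End Function
```

The rest of the selectNodes / selectSingleNode code should be able to stay as it is, since the node interfaces keep the same names. Whether it is actually faster on this data is something to measure rather than assume.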

For the details of your problem, perhaps you could say more about your desired result. For example, if you just want to process the data to extract some part of it or transform it into some other kind of data, then you might be able to achieve this with a simple XSLT. The performance of XSLT is sometimes astonishing.
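
As a rough illustration of the XSLT idea, the sketch below flattens each Inner element into one tab-separated line of Title, Field1 and Field2 in a single transformNode call. The function name ExtractFields and the output layout are assumptions made for the example, not something from the question. The stylesheet is embedded as a string and uses single-quoted attributes to keep the VB string readable.

```vb
Option Explicit

' Hypothetical helper: returns one line per <Inner> in the form
' Title <tab> Field1 <tab> Field2, produced by a single XSLT pass.
' Assumes a project reference to Microsoft XML, v3.0 or later.
Public Function ExtractFields(ByVal strData As String) As String
    Dim xmlDoc As New MSXML2.DOMDocument
    Dim xslDoc As New MSXML2.DOMDocument
    Dim strXsl As String

    strXsl = "<xsl:stylesheet version='1.0'" & _
             " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>" & _
             "<xsl:output method='text'/>" & _
             "<xsl:template match='/'>" & _
             "<xsl:for-each select='Outside/Outer/Middle/Inner'>" & _
             "<xsl:value-of select='../../Title'/>" & _
             "<xsl:text>&#9;</xsl:text>" & _
             "<xsl:value-of select='Field1'/>" & _
             "<xsl:text>&#9;</xsl:text>" & _
             "<xsl:value-of select='Field2'/>" & _
             "<xsl:text>&#10;</xsl:text>" & _
             "</xsl:for-each>" & _
             "</xsl:template>" & _
             "</xsl:stylesheet>"

    ' Keep whitespace in the stylesheet so the tab and newline inside
    ' <xsl:text> are not dropped when it is loaded.
    xslDoc.async = False
    xslDoc.preserveWhiteSpace = True
    If Not xslDoc.loadXML(strXsl) Then
        Err.Raise vbObjectError + 1, "ExtractFields", xslDoc.parseError.reason
    End If

    xmlDoc.async = False
    xmlDoc.validateOnParse = False
    xmlDoc.resolveExternals = False
    If Not xmlDoc.loadXML(strData) Then
        Err.Raise vbObjectError + 2, "ExtractFields", xmlDoc.parseError.reason
    End If

    ' Missing elements (for example an <Inner> without Field1)
    ' simply produce an empty column.
    ExtractFields = xmlDoc.transformNode(xslDoc)
End Function
```

The result can then be split with Split(result, vbLf) and Split(line, vbTab). For many documents it would make sense to load the stylesheet once and reuse it rather than rebuilding it on every call as this sketch does.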

Let us know a bit more about your problem.
Dang123

Learning
paulott

ASKER

> Firstly, if your situation allows you to use MSXML4 instead of 3, then do so - it is significantly faster. < 

Yes, I can use MSXML4, and I'll give it a try. Thanks.


> For the details of your problem, perhaps you could say more about your desired result. For example, if you just want to process the data to extract some part of it or transform it into some other kind of data, then you might be able to achieve this with a simple XSLT. The performance of XSLT is sometimes astonishing.  <

"extract some part of it" is exactly what I want to do. Previously I had put together a [poorly made] parser of my own to simply pull out the information. It was fast, but lacked robustness and was difficult to debug. We're going to start using the application more and more, so I figured it would be worthwhile to shift it to a 'real' parser instead of my hack. The only problem is that the style of code in the example above runs about 10% slower than what I was using previously.

If I can't find a relatively easy solution, I'll probably go back and try to rewrite what I was using originally.

I did a little browsing online for topics about XSLT.  Would I be correct in assuming it would mostly be used to speed up the .selectNodes() and .selectSingleNode() parts of the code above?  I did some quick tests and it would seem that the articleDoc.loadXML() statement is by far the slowest. . .
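
For reference, a quick test of this kind could be sketched as below: it times the parse and the node selection in separate loops. It assumes the Win32 GetTickCount function, whose resolution is only around 10 to 16 ms, so the loops need enough runs to give a meaningful total. The name TimeParseVsSelect and the lngRuns parameter are hypothetical.

```vb
Option Explicit

' Win32 millisecond tick counter (coarse resolution).
Private Declare Function GetTickCount Lib "kernel32" () As Long

' Hypothetical helper: prints the total time spent in loadXML and in
' selectNodes over lngRuns repetitions of the same document.
Public Sub TimeParseVsSelect(ByVal strData As String, ByVal lngRuns As Long)
    Dim doc As New MSXML2.DOMDocument
    Dim nodes As MSXML2.IXMLDOMNodeList
    Dim i As Long
    Dim lngStart As Long
    Dim lngCount As Long

    doc.async = False
    doc.preserveWhiteSpace = False
    doc.validateOnParse = False
    doc.resolveExternals = False

    ' Time the parse on its own.
    lngStart = GetTickCount()
    For i = 1 To lngRuns
        doc.loadXML strData
    Next i
    Debug.Print "loadXML:     " & (GetTickCount() - lngStart) & " ms"

    ' Time the selection on the already-parsed document.
    lngStart = GetTickCount()
    For i = 1 To lngRuns
        Set nodes = doc.selectNodes("Outside/Outer/Middle/Inner")
        ' Reading length makes sure the list is actually evaluated.
        lngCount = nodes.length
    Next i
    Debug.Print "selectNodes: " & (GetTickCount() - lngStart) & " ms"
End Sub
```

If the parse dominates, as the quick tests suggest, then changes to the selection code alone can only recover a small share of the total time.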

I don't know. I'm not sure what to do. That's why I'm posting here.

Thanks.

- Paul
ASKER CERTIFIED SOLUTION
DominicCronin
This solution is only available to members.