Fast XML parsing with VB 6.0

Hello.

I'm using Visual Basic 6.0, with a project reference to MS XML v3.0.

I'm new to using VB's parser with XML and would appreciate some help speeding up the following code.

The XML data is roughly in this format:

______________________________

<?xml version="1.0" encoding="UTF-8" ?>
<Outside>
<Outer>
<Title>Information</Title>
<Title2>Galore</Title2>
<Middle>
<Inner>
<Field1>Some</Field1>
<Field2>Info</Field2>
<Field3>Here</Field3>
</Inner>
<Inner>
<Field1>And</Field1>
<Field3>More</Field3>
</Inner>
</Middle>
</Outer>
<Outer>
<Title>Part II</Title>
<Middle>
<Inner>
<Field2>Data</Field2>
<Field4>All</Field4>
</Inner>
<Inner>
<Field2>Over</Field2>
<Field3>The Place</Field3>
</Inner>
</Middle>
</Outer>
</Outside>
______________________________

Currently, my program is as follows:
______________________________

Dim articleDoc As New DOMDocument, outerNode As IXMLDOMNode
Dim innerNode As IXMLDOMNode, tempNode As IXMLDOMNode

articleDoc.preserveWhiteSpace = False
articleDoc.validateOnParse = False
articleDoc.resolveExternals = False

'Load the data into the XML parser
articleDoc.loadXML strData

For Each outerNode In articleDoc.selectNodes("Outside/Outer")
    Set tempNode = outerNode.selectSingleNode("Title")
    If Not (tempNode Is Nothing) Then
        strTitle = tempNode.Text
    Else
        strTitle = ""
    End If

    'We iterate through each Inner
    For Each innerNode In outerNode.selectNodes("Middle/Inner")
        'Pull out the information
        Set tempNode = innerNode.selectSingleNode("Field1")
        If Not (tempNode Is Nothing) Then
            strField1 = "" & tempNode.Text
        Else
            strField1 = ""
        End If

        Set tempNode = innerNode.selectSingleNode("Field2")
        If Not (tempNode Is Nothing) Then
            strField2 = "" & tempNode.Text
        Else
            strField2 = ""
        End If
    Next innerNode

    'Do work

Next outerNode

______________________________

I would appreciate it if someone could help me speed this up. The real data has about ten <inner> per <outer>, and about ten <outer> per XML document. I'm going to be doing about ten thousand of these XML documents at a time.

Thanks in advance.

- Paul

DominicCronin

Firstly, if your situation allows you to use MSXML4 instead of 3, then do so - it is significantly faster.
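
As a rough sketch of what that switch might look like, assuming the project reference is changed to "Microsoft XML, v4.0" and the version-specific DOMDocument40 class is used. ParseArticle is only an illustrative name, not something from your program:

```vb
' Sketch only: assumes a project reference to "Microsoft XML, v4.0".
' ParseArticle is a hypothetical helper name.
Public Function ParseArticle(ByVal strData As String) As MSXML2.DOMDocument40
    Dim articleDoc As MSXML2.DOMDocument40
    Set articleDoc = New MSXML2.DOMDocument40

    ' Same settings as in the original code
    articleDoc.preserveWhiteSpace = False
    articleDoc.validateOnParse = False
    articleDoc.resolveExternals = False

    ' loadXML returns False when parsing fails; parseError holds the details
    If Not articleDoc.loadXML(strData) Then
        Err.Raise vbObjectError + 513, "ParseArticle", articleDoc.parseError.reason
    End If

    Set ParseArticle = articleDoc
End Function
```

The selectNodes and selectSingleNode calls should work unchanged on the returned document, so the two versions can be timed against the same data.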

For the details of your problem, perhaps you could say more about your desired result. For example, if you just want to process the data to extract some part of it or transform it into some other kind of data, then you might be able to achieve this with a simple XSLT. The performance of XSLT is sometimes astonishing.
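
To give an idea of what I mean, here is a minimal sketch that assumes you only need Title, Field1 and Field2 for each Inner element. It uses the DOMDocument class you already have referenced. ExtractFields and the tab-separated output layout are made up for the example, so adjust them to whatever your "do work" step expects:

```vb
' Sketch only: ExtractFields and the output layout are hypothetical.
' Returns one line per <Inner>: Title <tab> Field1 <tab> Field2
Public Function ExtractFields(ByVal strData As String) As String
    Dim xmlDoc As New DOMDocument
    Dim xslDoc As New DOMDocument
    Dim strXsl As String

    ' The separators are written as character references inside the
    ' select attribute so they do not depend on whitespace handling.
    strXsl = "<xsl:stylesheet version=""1.0""" & _
             " xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">" & _
             "<xsl:output method=""text""/>" & _
             "<xsl:template match=""/"">" & _
             "<xsl:for-each select=""Outside/Outer/Middle/Inner"">" & _
             "<xsl:value-of select=""concat(../../Title, '&#9;'," & _
             " Field1, '&#9;', Field2, '&#10;')""/>" & _
             "</xsl:for-each>" & _
             "</xsl:template>" & _
             "</xsl:stylesheet>"

    If Not xslDoc.loadXML(strXsl) Then
        Err.Raise vbObjectError + 513, "ExtractFields", xslDoc.parseError.reason
    End If

    xmlDoc.validateOnParse = False
    xmlDoc.resolveExternals = False
    If Not xmlDoc.loadXML(strData) Then
        Err.Raise vbObjectError + 514, "ExtractFields", xmlDoc.parseError.reason
    End If

    ' transformNode applies the stylesheet and returns the result as a string
    ExtractFields = xmlDoc.transformNode(xslDoc)
End Function
```

Missing elements simply come out as empty strings, which matches what your If/Else blocks do. If you process thousands of documents, you would probably want to load the stylesheet once and reuse it rather than rebuild it on every call as this sketch does.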

Let us know a bit more about your problem.

Dang123

Learning

paulott

ASKER

> Firstly, if your situation allows you to use MSXML4 instead of 3, then do so - it is significantly faster. <

Yes, I can use MSXML4, and I'll give it a try. Thanks.

> For the details of your problem, perhaps you could say more about your desired result. For example, if you just want to process the data to extract some part of it or transform it into some other kind of data, then you might be able to achieve this with a simple XSLT. The performance of XSLT is sometimes astonishing. <

"extract some part of it" is exactly what I want to do. Previously I had put together a [poorly made] parser of my own to simply pull out the information. It was fast, but lacked robustness and was difficult to debug. We're going to start using the application more and more, so I figured it would be worthwhile to shift it to a 'real' parser instead of my hack. Only problem is, using the style of code in the example above caused a ~10% difference in performance compared to what I was using previously.

If I can't find a relatively easy solution, I'll probably go back and try to rewrite what I was using originally.

I did a little browsing online for topics about XSLT. Would I be correct in assuming it would mostly be used to speed up the .selectNodes() and .selectSingleNode() parts of the code above? I did some quick tests and it would seem that the articleDoc.loadXML() statement is by far the slowest. . .
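
For reference, the quick test was along these lines. This is a reconstruction rather than the exact code, with TimeParseVersusQuery and lngRuns as placeholder names, and it relies on VB's Timer function, which is coarse and wraps at midnight, so it only makes sense over many runs:

```vb
' Sketch only: TimeParseVersusQuery and lngRuns are hypothetical names.
' Times repeated parsing separately from repeated querying.
Public Sub TimeParseVersusQuery(ByVal strData As String, ByVal lngRuns As Long)
    Dim articleDoc As DOMDocument
    Dim innerNode As IXMLDOMNode
    Dim tempNode As IXMLDOMNode
    Dim sngStart As Single
    Dim i As Long

    ' Phase 1: parse the same string lngRuns times
    sngStart = Timer
    For i = 1 To lngRuns
        Set articleDoc = New DOMDocument
        articleDoc.preserveWhiteSpace = False
        articleDoc.validateOnParse = False
        articleDoc.resolveExternals = False
        articleDoc.loadXML strData
    Next i
    Debug.Print "loadXML:     " & Format$(Timer - sngStart, "0.000") & " s"

    ' Phase 2: query the last parsed document lngRuns times
    sngStart = Timer
    For i = 1 To lngRuns
        For Each innerNode In articleDoc.selectNodes("Outside/Outer/Middle/Inner")
            Set tempNode = innerNode.selectSingleNode("Field1")
            Set tempNode = innerNode.selectSingleNode("Field2")
        Next innerNode
    Next i
    Debug.Print "selectNodes: " & Format$(Timer - sngStart, "0.000") & " s"
End Sub
```

Calling it with one representative document and a few thousand runs should show which phase dominates.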

I don't know. I'm not sure what to do. That's why I'm posting here.

Thanks.

- Paul

ASKER CERTIFIED SOLUTION

DominicCronin

This solution is only available to members.
