Dale Fye
asked on
Merging XML documents
I've got a client who has a legacy application which downloads XML files from an FTP site, merges those documents into a single file, and then uploads the data to an Access database. The general format of these downloaded XML documents looks like the following, with a node for filename and then one or more orders.
The code for that operation currently looks like:
Any ideas for how to modify this process to get the filename associated with each file embedded in each order?
The code for that operation currently looks like:
Public Sub MergeOrders()
Dim oFileSys
Dim oFolder
Dim aFiles
Dim file
Dim xmlPath1
Dim xmlPath2
Dim xmldoc1: Set xmldoc1 = CreateObject("MSXML2.DomDocument")
Dim xmldoc2: Set xmldoc2 = CreateObject("MSXML2.DomDocument")
Dim Orders
Dim Order
Dim NodeToRemove
Dim strZeroLenFiles
Set NodeToRemove = xmldoc1.selectNodes("/HaworthOrders/FileName")
Set oFileSys = WScript.CreateObject("Scripting.FileSystemObject")
Set oFolder = oFileSys.GetFolder(ftpLocalDir)
Set aFiles = oFolder.Files
strZeroLenFiles = ""
For Each file In aFiles
If file.size = 0 then
if strzerolenfiles <> "" then strZeroLenFiles = strZeroLenFiles & vbCrLf
strZeroLenFiles = strZeroLenFiles & File.Name
Else
If len(xmlPath1) = 0 then
xmlPath1 = oFolder +"\" + file.name
xmldoc1.async = False
xmldoc1.load xmlpath1
xmldoc1.setProperty "SelectionLanguage", "XPath"
Else
xmlPath2 = oFolder +"\" + file.name
End If
If len(xmlPath1) > 0 And len(xmlPath2) > 0 Then
xmldoc2.async = False
xmldoc2.load xmlpath2
set Orders = xmldoc2.selectNodes("/HaworthOrders/Order")
For Each Order In Orders
Dim D2: Set D2 = Order.cloneNode(True)
xmldoc1.documentElement.appendChild D2
Next
End If
End If 'zero len file
Next
For Each node in NodeToRemove
node.parentNode.removeChild(node)
Next
xmldoc1.save mergedFileLocation
End if
End Sub
But this code drops the Filename from the merged document. Unfortunately, the client now needs to have the FileName from each of the xml documents include in the database, so I need to keep this field in the document, and actually need to have the filename associated with each order. When I remarked out lines 55-57, and ran the merge process against two XML documents, each with a single order, I only get a single Filename node, like below.Any ideas for how to modify this process to get the filename associated with each file embedded in each order?
Post two sample files
In the loop where you get the Orders, you want to add a new node or element to those Orders before saving them to the new XML document. Something like below might be a starting point, based on the existing code, assuming there is always only one Filename node in the file. If you want to upload a sample of the actual XML input file I could do more testing here to refine if needed.
»bp
For Each Order In Orders
Dim D2: Set D2 = Order.cloneNode(True)
D2.documentElement.appendChild NodeToRemove
xmldoc1.documentElement.appendChild D2
Next
»bp
ASKER
ASKER
Here are some modified copies of two files:
0000013261_ClientName_126886072_2016.xml
0000013261_ClientName_126886084_2016.xml
0000013261_ClientName_126886072_2016.xml
0000013261_ClientName_126886084_2016.xml
I may just be slow this morning, but isn't there an extra END IF at the bottom of the procedure as well, I'm having trouble lining that up with an IF? But you're not getting an error about that there, which is odd...
»bp
»bp
You will have to change the schema and move filename into the order
ASKER
Bill,
Yes, there is an extra "End If" at the end of the code.
I had an If statement just above that which was just displaying a message if a particular condition was met.
I deleted it and somehow failed to remove the "End If"
Unfortunately, the cut and paste operation from Notepad ++ didn't work as cleanly (formatting wise) as I had hoped.
Dale
Yes, there is an extra "End If" at the end of the code.
I had an If statement just above that which was just displaying a message if a particular condition was met.
I deleted it and somehow failed to remove the "End If"
Unfortunately, the cut and paste operation from Notepad ++ didn't work as cleanly (formatting wise) as I had hoped.
Dale
ASKER
@Shaun,
Don't know how to do that. Was hoping that Bills suggestion would work to do just that.
Don't know how to do that. Was hoping that Bills suggestion would work to do just that.
Thanks for the test files Dale, I'll do some testing here now...
»bp
»bp
Dale,
Can you clarify the intent just a little bit. I see different root nodes in the two XML files, one "<PurchaserOrders>" and the other "<HaworthOrders>". I think you are trying to append the second to the Orders in the first, yes? But in the loop that processes the files in the folder, how do you know you will hit the assumed "<PurchaserOrders>" XML file as the first file Windows presents from that folder?
Am I correct that ftpLocalDir and mergedFileLocation are set in the main program before this routine is called?
»bp
Can you clarify the intent just a little bit. I see different root nodes in the two XML files, one "<PurchaserOrders>" and the other "<HaworthOrders>". I think you are trying to append the second to the Orders in the first, yes? But in the loop that processes the files in the folder, how do you know you will hit the assumed "<PurchaserOrders>" XML file as the first file Windows presents from that folder?
Am I correct that ftpLocalDir and mergedFileLocation are set in the main program before this routine is called?
»bp
ASKER
Bill, I meant to clean those both up so that they are the same, but I also notices that I forgot to change the code to reflect "/PurchaseOrders/FileName" on lines 15 and 45 in my original code (I was trying to obscure the originator of the files as much as possible).
I inherited this application from another developer several years ago, and since I'm not a vbScript guy (Access all the way!), I've avoided making any more changes than I've had to.
yes, those two varaibles are set at the beginning of the script file. The application currently connects to the ftp site of one of my clients purchasers. If there are files in the \Out\ folder, it downloads those and then the merge code merges all of the orders from each of those files into a single xml document, which is then parsed into individual orders and written to an Access table. But as you see when you look at those files, the "Filename" is not an attribute of the order, it is an element of the root node. And I am unsure how to change the structure of the XML file that the data is being merged into to make the filename an element of each order.
Thanks for looking at this.
Dale
I inherited this application from another developer several years ago, and since I'm not a vbScript guy (Access all the way!), I've avoided making any more changes than I've had to.
yes, those two varaibles are set at the beginning of the script file. The application currently connects to the ftp site of one of my clients purchasers. If there are files in the \Out\ folder, it downloads those and then the merge code merges all of the orders from each of those files into a single xml document, which is then parsed into individual orders and written to an Access table. But as you see when you look at those files, the "Filename" is not an attribute of the order, it is an element of the root node. And I am unsure how to change the structure of the XML file that the data is being merged into to make the filename an element of each order.
Thanks for looking at this.
Dale
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Thanks, Bill.
I'll take a look this afternoon.
I'm certain there is a more efficient way to move forward, but cannot get the client to commit to the time for a major rewrite.
Dale
I'll take a look this afternoon.
I'm certain there is a more efficient way to move forward, but cannot get the client to commit to the time for a major rewrite.
Dale
ASKER
That worked great, Bill.
Thanks for the assist!
Thanks for the assist!
Welcome Dale, glad that helped.
»bp
»bp
ASKER
Hey, guys, there is a follow on question here