keschuster
asked on
Parsing XML in Access
I have a field in Access that contains
<data-xml><asset-columns misc-10-enabled="yes" misc-10-name="PC" misc-1-enabled="yes" misc-1-name="Make" misc-2-enabled="yes" misc-2-name="Model" misc-3-enabled="yes" misc-3-name="Serial" misc-4-enabled="no" misc-5-enabled="no" misc-6-enabled="no" misc-7-enabled="no" misc-8-enabled="yes" misc-8-name="Status Code" misc-9-enabled="yes" misc-9-name="Original Company" /><location-columns misc-10-enabled="yes" misc-10-name="PC" misc-1-enabled="no" misc-2-enabled="no" misc-3-enabled="no" misc-4-enabled="no" misc-5-enabled="no" misc-6-enabled="no" misc-7-enabled="no" misc-8-enabled="no" misc-9-enabled="yes" misc-9-name="Original Company" /></data-xml>
Is there a way to parse this out so it is usable - perhaps into another table?
<data-xml><asset-columns misc-10-enabled="yes" misc-10-name="PC" misc-1-enabled="yes" misc-1-name="Make" misc-2-enabled="yes" misc-2-name="Model" misc-3-enabled="yes" misc-3-name="Serial" misc-4-enabled="no" misc-5-enabled="no" misc-6-enabled="no" misc-7-enabled="no" misc-8-enabled="yes" misc-8-name="Status Code" misc-9-enabled="yes" misc-9-name="Original Company" /><location-columns misc-10-enabled="yes" misc-10-name="PC" misc-1-enabled="no" misc-2-enabled="no" misc-3-enabled="no" misc-4-enabled="no" misc-5-enabled="no" misc-6-enabled="no" misc-7-enabled="no" misc-8-enabled="no" misc-9-enabled="yes" misc-9-name="Original Company" /></data-xml>
Is there a way to parse this out so it is usable - perhaps into another table?
what do you mean by usable?
What is your definition of 'usable' ?
mx
mx
ASKER
name value pairs in two columns
misc-10-name="PC" should output to misc-1 and PC
misc-2-name="Model" => misc-2 and Model
there will always be the two groups in the XML "asset columns" and "location columns"
Within each there will always be misc-1 thru misc-10
The values assinged to misc1 - 10 will always change
misc-10-name="PC" should output to misc-1 and PC
misc-2-name="Model" => misc-2 and Model
there will always be the two groups in the XML "asset columns" and "location columns"
Within each there will always be misc-1 thru misc-10
The values assinged to misc1 - 10 will always change
do you have a table to hold the values?
post the table structure here.
post the table structure here.
you have to use VBA and open the table as recordset to do this.
how is your VBA?
how is your VBA?
ASKER
I don't but it would simply be
id - autoincrement
field - text
value - text
id - autoincrement
field - text
value - text
i don't know what you are thinking...
post sample data you want to see from the table after parsing the XML field.
post sample data you want to see from the table after parsing the XML field.
See this {http://#a21435668}
ASKER
This xml is comes in one column in an access table just as it appears in my initial post. From this I would like to see
type field value
asset misc-10 PC
asset misc-1 Make
....
location misc-10 PC
location misc-1 null
location misc-2 null
....
location misc-9 Original Company
type field value
asset misc-10 PC
asset misc-1 Make
....
location misc-10 PC
location misc-1 null
location misc-2 null
....
location misc-9 Original Company
<data-xml>
<asset-columns misc-10-enabled="yes" misc-10-name="PC" misc-1-enabled="yes" misc-1-name="Make" misc-2-enabled="yes" misc-2-name="Model" misc-3-enabled="yes" misc-3-name="Serial" misc-4-enabled="no" misc-5-enabled="no" misc-6-enabled="no" misc-7-enabled="no" misc-8-enabled="yes" misc-8-name="Status Code" misc-9-enabled="yes" misc-9-name="Original Company" />
<location-columns misc-10-enabled="yes" misc-10-name="PC" misc-1-enabled="no" misc-2-enabled="no" misc-3-enabled="no" misc-4-enabled="no" misc-5-enabled="no" misc-6-enabled="no" misc-7-enabled="no" misc-8-enabled="no" misc-9-enabled="yes" misc-9-name="Original Company" />
</data-xml>
ASKER
Bad link
"Bad link"
Sorry ... it was meant to jump to where you showed the Name/Value pair examples.
mx
Sorry ... it was meant to jump to where you showed the Name/Value pair examples.
mx
This XML data is not well formed. The data is presented as attributes instead of fields. There is no way Access will understand that (it normally skips attributes). If your XML was well formed, you could save it as a file and import it into a table.
As it stands, the only way I see would be to write your own parser. For example: find the string "<asset-columns" and from that point the next ">". Store that portion as string strAsset. Then look in turn for "misc-1-name" to "misc-10-name", and then extract whatever is between the next pair of double quotes. Note that you can neglect the "misc-?-enabled" attributes: they simply indicate if there is a value for the matching "name" attribute.
This can all be done with the functions InStr() and Mid(), but it's an actual development project. Small, but still a VB project in its own right.
Are you certain you cannot obtain the data in a more straightforward format?
(°v°)
As it stands, the only way I see would be to write your own parser. For example: find the string "<asset-columns" and from that point the next ">". Store that portion as string strAsset. Then look in turn for "misc-1-name" to "misc-10-name", and then extract whatever is between the next pair of double quotes. Note that you can neglect the "misc-?-enabled" attributes: they simply indicate if there is a value for the matching "name" attribute.
This can all be done with the functions InStr() and Mid(), but it's an actual development project. Small, but still a VB project in its own right.
Are you certain you cannot obtain the data in a more straightforward format?
(°v°)
ASKER
thats how the data is coming. It's stored in the db like that. SO I would need to build a parser to move through the xml, grab the values, then insert each name/value into a table when a pair is found.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Nice work harfang..right on the money. I just went down this road myself not too long ago.
It can be a little tricky walking the XML file. Best approach is to consider it a typical Tree structure. I found this:
http://www.w3schools.com/xml/default.asp
which is an excellent step by step on understanding XML.
I also found this:
http://support.microsoft.com/kb/285329
Which seems to be a generic way of handling this problem, but haven't had a chance to try it out yet. I went the route you did.
HTH,
JimD
ASKER
harfang this looks good. It should work as an access sub right?
FWIW, I posted the procedure I came up with to parse an XML file which contained time card data in a denormalized fashion (there was Emp1, Emp2, Emp3, ....Emp7 fields for each section of the time card - weird setup).
I'm posting this because:
1. It shows how to load a XML file directly from disk.
2. Shows both loop and fixed fetch logic.
Hopefully this will be of benefit to someone.
JimD.
PS - NOT expecting any points!
I'm posting this because:
1. It shows how to load a XML file directly from disk.
2. Shows both loop and fixed fetch logic.
Hopefully this will be of benefit to someone.
JimD.
PS - NOT expecting any points!
Public Function ImportXMLFile(ByVal strFile As String) As Boolean
' OCS - JRD - 04/05/08 - Initial write.
' This function will process a single "time form" XML file
' and makes fundamental assumptions about the format of the XML file.
' If the XML file format is changed in any way, this code will fail.
' Each file is expected to have the following data:
'
' SmartformInstance (multiple per file)
' Form (one set per form instance)
' JobInfo (one set per form instance)
' TravelStart (one set per form instance)
' TravelEnd (one set per form instance)
' ClockIn (one set per form instance)
' StartBreak (one set per form instance)
' EndBreak (one set per form instance)
' ClockOut (one set per form instance)
'
' For the sections (Travel Start through Clock Out), seven (7)
' employee fields are expected.
Dim objDomDoc As New DOMDocument
Dim objNodeSmartForm As IXMLDOMNode
Dim objNode_Sections As IXMLDOMNode
Dim objNodeData As IXMLDOMNode
Dim objNodeItem As IXMLDOMNode
Dim dbs As DAO.Database
Dim rst As DAO.Recordset
Dim strSQL As String
Dim lngForm As Long
Dim intSection As Integer
Dim intEmployee As Integer
Dim strInstanceID As String
Dim strInstanceDate As String
Dim strInstanceType As String
Dim strFormName As String
Dim strJobNumber As String
Dim strPerDiem As String
Dim strValue As String
10 On Error GoTo ImportXMLFile_Error
20 ImportXMLFile = False
' Open a DAO Database connection
30 Set dbs = CurrentDb()
' Verify that such a file exists. If not, then exit
40 If Dir(strFile) <> "" Then
' Open DOM object by loading the xml file.
50 objDomDoc.async = False
60 objDomDoc.Load strFile
' Loop through the form nodes.
70 For Each objNodeSmartForm In objDomDoc.documentElement.childNodes
' Get the header info
80 strInstanceID = Nz(objNodeSmartForm.childNodes(0).Text, "")
90 strInstanceDate = Nz(objNodeSmartForm.childNodes(1).Text, "")
100 strInstanceType = Nz(objNodeSmartForm.childNodes(2).Text, "")
110 strFormName = Nz(objNodeSmartForm.childNodes(3).childNodes(0).Text, "")
120 strJobNumber = Nz(objNodeSmartForm.childNodes(3).childNodes(2).childNodes(1).childNodes(2).Text, "")
130 strPerDiem = Nz(objNodeSmartForm.childNodes(3).childNodes(2).childNodes(2).childNodes(2).Text, "")
' Save header info
140 strSQL = "SELECT * FROM tblSmartFormInstance WHERE 1=0"
150 Set rst = dbs.OpenRecordset(strSQL, dbOpenDynaset)
160 rst.AddNew
170 rst![FormInstanceID] = strInstanceID
180 rst![FormInstanceDate] = CDate(Mid(strInstanceDate, 6, 2) & "/" & Mid(strInstanceDate, 9, 2) & "/" & Mid(strInstanceDate, 1, 4) & " " & Mid(strInstanceDate, 12, 8))
190 rst![TypeCode] = strInstanceType
200 rst![FormName] = strFormName
210 rst![JobNumber] = strJobNumber
220 rst![PerDiem] = strPerDiem
230 rst.Update
240 rst.Close
' Now get the details for the form
' Open detail table
250 strSQL = "SELECT * FROM tblSmartFormInstanceDetail WHERE 1=0"
260 Set rst = dbs.OpenRecordset(strSQL, dbOpenDynaset)
' Process the sections. The code below depends on the same sections
' being there each time, having the same number of elements,
' and being in the same order.
' A more flexible approach would be to check the element data,
' then do a lookup on the section and field names and map to the tables.
' However we are doing only one form at the moment and the data will
' always be provided in the current format.
270 Set objNode_Sections = objNodeSmartForm.childNodes(3)
280 For intSection = 3 To 8
290 Set objNodeData = objNode_Sections.childNodes(intSection)
' objNodeData.childNodes(0).text is the section name
300 For intEmployee = 1 To 7
310 Set objNodeItem = objNodeData.childNodes(intEmployee)
320 If Nz(objNodeItem.childNodes(2).Text, "") <> "" Then
330 strValue = Nz(objNodeItem.childNodes(2).Text, "")
340 If strValue <> "" Then
350 rst.AddNew
360 rst![FormInstanceID] = strInstanceID
370 rst![SectionTypeID] = intSection - 2
380 rst![EmployeeListOrder] = intEmployee
390 rst![EmployeeName] = strValue
400 rst.Update
410 End If
420 End If
430 Set objNodeItem = Nothing
440 Next intEmployee
450 Set objNodeData = Nothing
460 Next intSection
470 rst.Close
480 Next
490 ImportXMLFile = True
500 End If
ImportXMLFile_Exit:
510 On Error Resume Next
' Clean up
520 Set rst = Nothing
530 Set dbs = Nothing
540 Set objNodeItem = Nothing
550 Set objNodeData = Nothing
560 Set objNode_Sections = Nothing
570 Set objNodeSmartForm = Nothing
580 Set objDomDoc = Nothing
590 Exit Function
ImportXMLFile_Error:
600 MsgBox "Unexpected Error"
610 Resume ImportXMLFile_Exit
End Function
ASKER
Good stuff - I'll try to work with these today
> It should work as an access sub right?
Let's say it's a template. The sub ParseXML accepts a string containing XML data in a format similar to your sample, and prints out all items in the first "root" element, the single <data-xml> in this case, along with their attributes.
You probably don't want to print them to the Immediate pane. You will have to adapt the code to do something useful with the data, for example storing it in a table. But the way to do this is certainly related to other fields of the table, so I left that up to you: do you want to update the same table? create a new table? if so what would be the key? how are you going to use the data? does it need to be normalized as properties or as fields? ...
@JimD: "I'm posting this because..."
You also demonstrate how to store the retrieved data in a table, and I'm sure this will be useful. Btw, I haven't seen numbered lines in over 20 years... I wonder why you did that.
Cheers!
(°v°)
Let's say it's a template. The sub ParseXML accepts a string containing XML data in a format similar to your sample, and prints out all items in the first "root" element, the single <data-xml> in this case, along with their attributes.
You probably don't want to print them to the Immediate pane. You will have to adapt the code to do something useful with the data, for example storing it in a table. But the way to do this is certainly related to other fields of the table, so I left that up to you: do you want to update the same table? create a new table? if so what would be the key? how are you going to use the data? does it need to be normalized as properties or as fields? ...
@JimD: "I'm posting this because..."
You also demonstrate how to store the retrieved data in a table, and I'm sure this will be useful. Btw, I haven't seen numbered lines in over 20 years... I wonder why you did that.
Cheers!
(°v°)
<<You also demonstrate how to store the retrieved data in a table, and I'm sure this will be useful. Btw, I haven't seen numbered lines in over 20 years... I wonder why you did that.>>
I use the undocumented vba.erl to return the line number in my error handlers. Speeds up troubleshooting.
JimD.
I use the undocumented vba.erl to return the line number in my error handlers. Speeds up troubleshooting.
JimD.
> the undocumented vba.erl
I see, thanks for the info.
(°v°)
I see, thanks for the info.
(°v°)
I should explain a bit more. The code above was done for someone else and out of habit, I added the line numbers. But my normal error handler looks like this:
410 UnexpectedError ModuleName, RoutineName, Version, Err.Number, Err.Description, Err.Source, VBA.Erl
So I get feedback on the module, routine, and exact line number where an error occured, which speeds up the trouble shooting process. The UnexpectedError procedure writes an entry to a log file and optionally e-mails me with an error.
JimD.
That makes perfect sense. I know of many instances where this would have saved me hours! Too bad the simple "source text" line number isn't available... but then again writing a code numbering tool is quite simple as well. I should seriously consider this!
(°v°)
(°v°)
<<That makes perfect sense. I know of many instances where this would have saved me hours! Too bad the simple "source text" line number isn't available... but then again writing a code numbering tool is quite simple as well. I should seriously consider this!>>
Check out MZ Tools:
http://www.mztools.com/v3/mztools3.aspx
It has the line numbering built-in along with a host of other features. Best part; it's free<g>.
JimD.
Check out MZ Tools:
http://www.mztools.com/v3/mztools3.aspx
It has the line numbering built-in along with a host of other features. Best part; it's free<g>.
JimD.