Link to home
Create AccountLog in
Avatar of Christian Knell
Christian KnellFlag for Germany

asked on

Importing XML data into Microsoft Access table (1:n)

We import a XML document into a Microsoft Access table. The XML document derives from a huge Word document, which was XML tagged accordingly, using the attached XSD code. Tagging within word, saving as XML and importing into Access works fine. The result is a table with three columns: department, manager, representative.

Now we have some departments with more than one representative (1 department : n representatives). Is it possible to import a XML file with 1:n structure into Access? How would the XSD code have to look like?

Thanks!
Chris


<?xml version="1.0" encoding="UTF-8"?>

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:od="urn:schemas-microsoft-com:officedata">

    <xsd:element name="dataroot">
    <xsd:complexType>
        <xsd:sequence>
            <xsd:element ref="tbl_departments" minOccurs="0" maxOccurs="unbounded"/>
        </xsd:sequence>
    </xsd:complexType>
    </xsd:element>

    <xsd:element name="tbl_departments">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="Department" minOccurs="0" type="xsd:string"/>
                <xsd:element name="Manager" minOccurs="0" type="xsd:string"/>
                <xsd:element name="Representative" minOccurs="0" type="xsd:string"/>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>

</xsd:schema>

Open in new window

Avatar of Jeffrey Coachman
Jeffrey Coachman
Flag of United States of America image


You can import any table you like.
;-)

If your question is "Can Access Normalize my single table data into seperate related tables during the import?"
Then the short answer is: ...No
Not with you only importing one un-normlized table (no matter what the format)

From what I know, Access can import a related table "Structure" via XML, but here the tables will have already been normalized. (into the separate related tables)


So it appears that you will have to either normalize this data in Word, somehow?...
:-O
,... or import the un-normalized data into Access and normalize it there.

Are you familiar with the Principles of Normalization (3NF, Keys, ...etc)?
If not, you can investigate using the Table Analyzer in MS Access, or post another new question here asking how to normalize your imported data


JeffCoachman
You could go for an entirely bespoke solution too in VBA, by using the "Microsoft XML" library.

If you post a small sample XML file and a sample database showing the tables you're trying to populate, I should be able to give you a starting point if you're interested?
Avatar of Christian Knell

ASKER

Thanks  a lot! I could import un-normalized data into Access and normalize it in a second step. I'm famliar with normalization, so this would be no problem.

Using XML import/export features within Word and Access or using the Microsoft XML library via VBA would both be ok for me.

What would be the easier way to achieve an unnormalized table like
   department...manager...representative
   IT...Smith...Brown#Miller
   Finance...Bush...Black#Rice
whereas n representatives per 1 department could possibly be separated values within a single text field (in this example separated by #-chars)?

I added a Word file containing the data, the XSD file (which would have to be corrected) and the MDB file.

Thanks and best regards!
Chris
departments.xsd
Sorry - now all three files in one comment.
departments.mdb
departments.doc
departments.xsd
I'm a bit confused ... where's the XML you want to import?

Apologies if I've got the wrong end of the stick!  :-)
Hi, sorry about the confusion I caused. Attached you'll find the original Word document in XML format and the XML export from Word using the function File/Save As/File Format: XML, Option "Data only".

The latter we'd like to import in unnormalized form into a single access table (if possible). Another way could be the Microsoft XML library you mentioned (that's why I attached the original Word document).

Best regards!
Chris
departments-data-only.xml
departments.xml
Here's how to do what you want using the MSXML library:
Option Explicit
Option Compare Database

Public Sub ImportData(pstrXMLFile As String)

    ' Declare variables
    Dim docDocument As MSXML2.DOMDocument  ' Document
    Dim nodLevel1 As MSXML2.IXMLDOMNode    ' Level 1 node
    Dim nodLevel2 As MSXML2.IXMLDOMNode    ' Level 2 node
    Dim nodLevel3 As MSXML2.IXMLDOMNode    ' Level 3 node
    Dim rstDepartments As DAO.Recordset    ' Departments record set

    ' Open departments record set
    On Error GoTo ImportError
    Set rstDepartments = CurrentDb.OpenRecordset("departments", dbOpenTable)

    ' Open XML file
    Set docDocument = New MSXML2.DOMDocument
    docDocument.async = False
    If docDocument.Load(pstrXMLFile) Then

        ' Parse XML file
        For Each nodLevel1 In docDocument.childNodes
            If nodLevel1.nodeName = "dataroot" Then
                For Each nodLevel2 In nodLevel1.childNodes
                    If nodLevel2.nodeName = "tbl_departments" Then
                        With rstDepartments
                            .AddNew
                            For Each nodLevel3 In nodLevel2.childNodes
                                Select Case nodLevel3.nodeName
                                    Case "Department"
                                        !Department = !Department & nodLevel3.Text
                                    Case "Manager"
                                        !Manager = !Manager & nodLevel3.Text
                                    Case "Representative"
                                        !Representatives = !Representatives & nodLevel3.Text & "#"
                                End Select
                            Next

                            ' Remove trailing # character
                            If Right(!Representatives, 1) = "#" Then
                                !Representatives = Left(!Representatives, Len(!Representatives) - 1)
                            End If
                            .Update
                        End With
                    End If
                Next
            End If
        Next
    Else
        MsgBox "Cannot parse file." & vbCr & _
               docDocument.parseError.reason, _
               vbExclamation + vbOKOnly + vbDefaultButton1
    End If


    ' Close departments record set
TidyUp:
    On Error Resume Next  ' Ignore all errors
    rstDepartments.Close
    Set rstDepartments = Nothing

    ' Close XML file
    Set docDocument = Nothing

    ' Exit
    Exit Sub

' Handle import error
ImportError:

    ' Report error
    MsgBox "Cannot parse file." & vbCr & _
           Err.Description, _
           vbExclamation + vbOKOnly + vbDefaultButton1
    Resume TidyUp

End Sub

Open in new window

ASKER CERTIFIED SOLUTION
Avatar of Jez Walters
Jez Walters
Flag of United Kingdom of Great Britain and Northern Ireland image

Link to home
membership
Create a free account to see this answer
Signing up is free and takes 30 seconds. No credit card required.
See answer
Thank you very much! I have to leave the office now. I'll try tomorrow morning and let you know.

Thanks and best regards!
Chris
That's absolutely great - thank you very much for your help!!!

I just added some code for normalization.

Best regards!
Chris
departments.mdb
departments-data-only.xml
Brilliant - thank you very much!
Looking at your database design, I see that the "department" and "manager" fields in your "departments" table are both currently defined as not required and also allow zero length.

You should make these fields as required and to not allow zero length - in other words all departments must have a name and a manager.  Similarly, the "ID department" and "representative" fields in your "representatives" table should also be required and not allow zero length.
Thanks Jez for this information. I will modify it.

Best regards!
Chris
Once you've done that however, you'll find your current import algorithm fails where you add a new department without setting the name/mananger.  You can correct this by changing your alorithm to be like this:
Option Explicit      ' Force explicit variable declarations
Option Compare Text  ' Use case-insensitive comparisons
Option Base 0        ' Start arrays at index 0

Public Sub ImportData(pstrXMLFile As String)

    ' Declare variables
    Dim astrRepresentatives() As String      ' Representatives
    Dim docDocument As MSXML2.DOMDocument    ' Document
    Dim lngRepresentative As Long            ' Representative number
    Dim nodLevel1 As MSXML2.IXMLDOMNode      ' Level 1 node
    Dim nodLevel2 As MSXML2.IXMLDOMNode      ' Level 2 node
    Dim nodLevel3 As MSXML2.IXMLDOMNode      ' Level 3 node
    Dim rstDepartments As DAO.Recordset      ' Departments record set
    Dim rstRepresentatives As DAO.Recordset  ' Representatives record set
    Dim strRepresentatives As String         ' Representatives

    ' Open record sets
    On Error GoTo ImportError
    Set rstDepartments = CurrentDb.OpenRecordset("departments", dbOpenTable)
    Set rstRepresentatives = CurrentDb.OpenRecordset("representatives", dbOpenTable)

    ' Open XML file
    Set docDocument = New MSXML2.DOMDocument
    docDocument.async = False
    If docDocument.Load(pstrXMLFile) Then

        ' Parse XML file
        For Each nodLevel1 In docDocument.childNodes
            If nodLevel1.nodeName = "dataroot" Then
                For Each nodLevel2 In nodLevel1.childNodes
                    If nodLevel2.nodeName = "tbl_departments" Then
                        strRepresentatives = ""
                        With rstDepartments
                            .AddNew
                            For Each nodLevel3 In nodLevel2.childNodes
                                Select Case nodLevel3.nodeName
                                    Case "Department"
                                        !department = !department & nodLevel3.Text
                                    Case "Manager"
                                        !manager = !manager & nodLevel3.Text
                                    Case "Representative"
                                        strRepresentatives = strRepresentatives & nodLevel3.Text & "#"
                                End Select
                            Next
                            .Update
                            .Bookmark = .LastModified
                        End With
                        astrRepresentatives = Split(strRepresentatives, "#")
                        For lngRepresentative = 0 To UBound(astrRepresentatives) - 1  ' Skip last #
                            With rstRepresentatives
                                .AddNew
                                ![ID department] = rstDepartments!id
                                !representative = astrRepresentatives(lngRepresentative)
                                .Update
                            End With
                        Next
                    End If
                Next
            End If
        Next
    Else
        MsgBox "Cannot parse file." & vbCr & _
               docDocument.parseError.reason, _
               vbExclamation + vbOKOnly + vbDefaultButton1
    End If


    ' Close record sets
TidyUp:
    On Error Resume Next  ' Ignore all errors
    rstDepartments.Close
    Set rstDepartments = Nothing
    rstRepresentatives.Close
    Set rstRepresentatives = Nothing

    ' Close XML file
    Set docDocument = Nothing

    ' Exit
    Exit Sub

' Handle import error
ImportError:

    ' Report error
    MsgBox "Cannot parse file." & vbCr & _
           Err.Description, _
           vbExclamation + vbOKOnly + vbDefaultButton1
    Resume TidyUp

End Sub

Open in new window

... and here's one I made earlier!  :-)
departments.mdb
That's absolutely great - thank you very much!

Best regards
Chris