Link to home
Start Free TrialLog in
Avatar of Frosty555
Frosty555Flag for Canada

asked on

Making XML file using Excel macro

I'm making an XML file from some data in an excel spreadsheet. I'm using a VBA macro to do it. I basically build a big string called XMLFileText by tacking on the various values of different cells. I use the [CDATA ... ]] operator in XML so that I don't have to escape some the text, as you'll see in the code snippet.

Then I write it out to an XML file using FileSystemObject.TextStream.Write().

But my problem is that the excel file sometimes contains french characters, and that little << >> french style quote (I think it's called a "guillemets"), and other weird characters. When there are non-ascii characters in the string, The TextStream.Write() call fails with "Invalid Procedure Call Or Arguement".

I also know XML itself doesn't just handle unicode text. You need to excape it somehow, and I don't know if the [[CDATA ]] construct does it for me or not.

The code below isn't all my code, just an example. So can someone help me format the excel cells in a way that I can encode it into a standard ascii xml file using file system object?

Dim ERsheet As Worksheet
    Set ERsheet = ActiveWorkbook.Worksheets("DATA")
 
    Dim XMLFileText as String
    XMLFileText = ""
    XMLFileText = XMLFileText & "<?xml version=""1.0"" encoding=""ISO8859-1"" ?>" & vbNewLine
    XMLFileText = XMLFileText & "<!DOCTYPE EXAM SYSTEM ""enhexam.dtd"">" & vbNewLine
    XMLFileText = XMLFileText & tabs(0) & UCase("<EXAM>") & vbNewLine
    ....
                             XMLFileText = XMLFileText & tabs(3) & UCase("<question><![CDATA[") & Trim(.Range(newcolname(2)).Value) & UCase("]]></question>") & vbNewLine
                            XMLFileText = XMLFileText & tabs(3) & UCase("<response1><![CDATA[") & Trim(.Range(newcolname(3)).Value) & UCase("]]></response1>") & vbNewLine
                            XMLFileText = XMLFileText & tabs(3) & UCase("<message1><![CDATA[") & Trim(.Range(newcolname(4)).Value) & UCase("]]></message1>") & vbNewLine
                            XMLFileText = XMLFileText & tabs(3) & UCase("<response2><![CDATA[") & Trim(.Range(newcolname(5)).Value) & UCase("]]></response2>") & vbNewLine
                            XMLFileText = XMLFileText & tabs(3) & UCase("<message2><![CDATA[") & Trim(.Range(newcolname(6)).Value) & UCase("]]></message2>") & vbNewLine
                            XMLFileText = XMLFileText & tabs(3) & UCase("<response3><![CDATA[") & Trim(.Range(newcolname(7)).Value) & UCase("]]></response3>") & vbNewLine
                            XMLFileText = XMLFileText & tabs(3) & UCase("<message3><![CDATA[") & Trim(.Range(newcolname(8)).Value) & UCase("]]></message3>") & vbNewLine
 
 
 
    If createfileflag = True Then
        Dim FSO As Object
        Dim newFile As Object
        Set FSO = CreateObject("Scripting.FileSystemObject")
        Set newFile = FSO.CreateTextFile(filepath)
        newFile.Write (XMLFileText)
        newFile.Close
        Set newFile = Nothing
        Set FSO = Nothing
   End If

Open in new window

SOLUTION
Avatar of kawas
kawas
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of Frosty555

ASKER

We've already got the whole thing built unfortunately.... I'd rather not overhaul the whole project.

Can you let me know what the correct encoding should be in the header for it to be UTF-8? And how to escape unicode characters?
And... if you know, what the escape code rules are when it is inside a [CDATA ]] operation? I don't know if escape codes inside of CDATA will still be honored?
Okay, so refresh my own memory, VB keeps all of its strings in Unicode format, internally.

Upon calling TextStream.write(), when I pass it a unicode string that has non-ascii characters in it, it fails.

The particular line of the offending text was:

"Désolé, c'est inexact. Les projecteurs ?CL utilisent la technologie analogique pour produire des images de grande qualité."

The "?" character in the middle turned out to be unicode 1040, but in excel it looks like a capital "A".

So how do I encode that in my XML file? If I change the encoding to UTF-8, what do I write in the xml file?
ASKER CERTIFIED SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial