• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 1094
  • Last Modified:

Encoding ASCII characters 128-255 as character references

How can I use the XMLDocument object to encode characters in the ascii range 128-255 as character references (such as ¿) rather than the two byte representation of the character?
0
PLavelle
Asked:
PLavelle
  • 7
  • 4
  • 2
1 Solution
 
rdcproCommented:
You can't.  In fact, if you start out with character references in an XML object, they will be parsed into their characters, and  unless it's a markup character, or a character that should be escaped (according to HTML), it will be output as the character itself.  For example, the non-breaking space character, &#160; is parsed, and output as the actual character...it's not re-escaped.  In an XSLT transformation, some characters will be escaped, depending on whether you set the <xsl:output method="html" />

It sounds to me like you're experiencing an encoding problem, which can come from a number of different places.  If you're seeing two-byte characters, then I'm betting you're doing something where the XML (or the output of the transform) gets converted to a string.  This will force UTF-16, which is a switch in encoding, and can cause the wrong characters to be rendered. What exactly is going wrong?  

Regards,
Mike Sharp
0
 
PLavelleAuthor Commented:
I am using an XMLDocument object in .NET to write out the XML from a .NET application. When I read the XML into a VB6 application, the string contains two characters for every character over 127 in the ascii table. For instance, ¿ is replaced by ¿ .
0
 
PLavelleAuthor Commented:
Also, the XML is read into VB6 as a string, not using the DOM.
0
Nothing ever in the clear!

This technical paper will help you implement VMware’s VM encryption as well as implement Veeam encryption which together will achieve the nothing ever in the clear goal. If a bad guy steals VMs, backups or traffic they get nothing.

 
rdcproCommented:
Strings are always UTF-16.  This is probably the issue you're having.  VB6 thinks the xml is a different encoding because of the xml declaration...or some other similar reason.   If you're going to use strings at any point, you need to make sure the file is written out as UTF-16.  In an XSLT, you could do this with:

<xsl:output method="xml" encoding="utf-16" />

Can you post some of the code that you use to write out the XML in .NET, as well as a snippet of the XML (including the declaration, if present)?  


Regards,
Mike Sharp
0
 
PLavelleAuthor Commented:
I changed it to UTF-16 encoding and it still writes out two byte characters. VB6 interprets these two byte characters as two characters instead of just one.
0
 
Yury_DelendikCommented:
Try this

Dim sw As New System.IO.StreamWriter("test.xml", System.Text.Encoding.ASCII)
xmldoc.Save(sw)
sw.Close()
0
 
PLavelleAuthor Commented:
Yury, that just replaces those characters with question marks.
0
 
rdcproCommented:
Yury:  The problem with that approach is that it will simply strip the high order bytes, losing the characters.  The characters will be replaced by the ASCII codepoint 63 which is a "?".   It would be better to save it in UTF-16, if it's subsequently going to be used in a string by VB6.  For Intel based systems:

Dim sw As New System.IO.StreamWriter("test.xml", System.Text.Encoding.Unicode)

PLavelle:  How did you change the encoding to UTF-16?  You can't do this simply by changing the declaration, you must explicitly change the encoding when it's being output.  Post your code, so I can see what's going on.

Regards,
Mike Sharp
0
 
PLavelleAuthor Commented:
       Dim iRow As Int32
        Dim iCol As Int32
        Dim sValue As String
        Dim xmlDoc As New XmlDocument
        Dim oElement As XmlElement
        Dim oAttribute As XmlAttribute
        Dim xmlWriter As New XmlTextWriter("c:\test2.xml", System.Text.Encoding.Unicode)

        oElement = xmlDoc.CreateElement("Table")
        oAttribute = xmlDoc.CreateAttribute("Company")
        oAttribute.Value = Constants.Company.Name
        oElement.Attributes.Append(oAttribute)
        oAttribute = xmlDoc.CreateAttribute("Version")
        oAttribute.Value = Constants.Versions.FundTable
        oElement.Attributes.Append(oAttribute)

        xmlDoc.AppendChild(oElement)
        For iRow = 1 To grd.Rows.Count - 1
            oElement = xmlDoc.CreateElement("DocElement")
            For iCol = 0 To grd.Cols.Count - 1
                sValue = Trim(grd(iRow, iCol))
                If Trim(sValue) <> "" Then
                    oAttribute = xmlDoc.CreateAttribute(grd.Cols(iCol).Name)
                    oAttribute.Value = sValue
                    oElement.Attributes.Append(oAttribute)
                End If
            Next
            xmlDoc.DocumentElement.AppendChild(oElement)
        Next

        xmlDoc.Save(xmlWriter)
0
 
PLavelleAuthor Commented:
I had to change some names, but there is the code that writes out the XML from the grid.
0
 
rdcproCommented:
Hmmm... I don't see anything wrong there.  I'd say the data is messed up either before this (when the grd is filled out) or during the load in VB6.  You could try ISO-8859-1 for the output encoding...but I'd first try to specify

Dim xmlWriter As New XmlTextWriter("c:\test2.xml", System.Text.Encoding.Default)

This will cause the encoding to be whatever your systems default codepage is, and hopefully the VB6 app will figure it out when it reads the file.

Let me know how this works!

Regards,
Mike Sharp
0
 
Yury_DelendikCommented:
What are you using to read data in VB6? FileSystem Object ot standard VB statements?
0
 
PLavelleAuthor Commented:
The ISO-8859-1 encoding seems to write out the correct character (1 byte). Thanks for all of your help.
0

Featured Post

Nothing ever in the clear!

This technical paper will help you implement VMware’s VM encryption as well as implement Veeam encryption which together will achieve the nothing ever in the clear goal. If a bad guy steals VMs, backups or traffic they get nothing.

  • 7
  • 4
  • 2
Tackle projects and never again get stuck behind a technical roadblock.
Join Now