Solved

Encoding ASCII characters 128-255 as character references

Posted on 2004-04-23
Last Modified: 2006-11-17
How can I use the XMLDocument object to encode characters in the ASCII range 128-255 as character references (such as &#191;) rather than the two-byte representation of the character?
Question by:PLavelle
13 Comments
 
LVL 26

Expert Comment

by:rdcpro
ID: 10902547
You can't. In fact, if you start out with character references in an XML object, they will be parsed into their characters, and unless a character is a markup character, or one that should be escaped (according to HTML), it will be output as the character itself. For example, the non-breaking space character, &#160;, is parsed and output as the actual character; it's not re-escaped. In an XSLT transformation, some characters will be escaped, depending on whether you set <xsl:output method="html" />.

It sounds to me like you're experiencing an encoding problem, which can come from a number of different places.  If you're seeing two-byte characters, then I'm betting you're doing something where the XML (or the output of the transform) gets converted to a string.  This will force UTF-16, which is a switch in encoding, and can cause the wrong characters to be rendered. What exactly is going wrong?  

Regards,
Mike Sharp
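
The answer above matches XmlTextWriter as it behaved at the time of this thread. As a hedged aside: the XmlWriter.Create factory added in .NET 2.0 is, to my knowledge, able to write characters that the target encoding cannot represent as numeric character references when it writes to a stream. The sketch below assumes .NET 2.0 or later; the module name and the "Sample" attribute are made up for illustration, so verify the behaviour on your own runtime.

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml
Imports Microsoft.VisualBasic

Module CharRefDemo
    Sub Main()
        ' Build a tiny document containing U+00BF (inverted question mark).
        Dim xmlDoc As New XmlDocument()
        Dim root As XmlElement = xmlDoc.CreateElement("Table")
        root.SetAttribute("Sample", ChrW(&HBF)) ' hypothetical attribute
        xmlDoc.AppendChild(root)

        ' Ask the writer for plain ASCII output.
        Dim settings As New XmlWriterSettings()
        settings.Encoding = Encoding.ASCII

        ' Write to a stream so the writer itself performs the encoding.
        Dim output As New MemoryStream()
        Using writer As XmlWriter = XmlWriter.Create(output, settings)
            xmlDoc.Save(writer)
        End Using

        ' Inspect the result: look at how the attribute value was written.
        Console.WriteLine(Encoding.ASCII.GetString(output.ToArray()))
    End Sub
End Module
```

Note that passing a StreamWriter created with Encoding.ASCII to the document is a different code path: there the StreamWriter does the encoding.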
 

Author Comment

by:PLavelle
ID: 10902619
I am using an XMLDocument object in .NET to write out the XML from a .NET application. When I read the XML into a VB6 application, the string contains two characters for every character over 127 in the ASCII table. For instance, ¿ is replaced by Â¿.
 

Author Comment

by:PLavelle
ID: 10902656
Also, the XML is read into VB6 as a string, not using the DOM.
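
One way to see where an extra character could come from is sketched below. It assumes the file was written as UTF-8 (what XmlTextWriter uses when no encoding is specified) and was then read as single-byte text. The thread does not confirm this, and the module name is made up for illustration.

```vbnet
Imports System
Imports System.Text
Imports Microsoft.VisualBasic

Module Utf8Demo
    Sub Main()
        ' U+00BF, the inverted question mark from the example above.
        Dim original As String = ChrW(&HBF)

        ' UTF-8 uses two bytes for every code point from 128 to 2047.
        Dim utf8Bytes As Byte() = Encoding.UTF8.GetBytes(original)
        Console.WriteLine(BitConverter.ToString(utf8Bytes)) ' prints C2-BF

        ' Decoding those bytes with a one-byte-per-character encoding
        ' turns the single character into two.
        Dim misread As String = Encoding.GetEncoding("iso-8859-1").GetString(utf8Bytes)
        Console.WriteLine(misread.Length) ' prints 2
    End Sub
End Module
```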
 
LVL 26

Expert Comment

by:rdcpro
ID: 10902779
Strings are always UTF-16 in memory. This is probably the issue you're having: VB6 thinks the XML is in a different encoding because of the XML declaration, or for some similar reason. If you're going to use strings at any point, you need to make sure the file is written out as UTF-16. In an XSLT, you could do this with:

<xsl:output method="xml" encoding="utf-16" />

Can you post some of the code that you use to write out the XML in .NET, as well as a snippet of the XML (including the declaration, if present)?  


Regards,
Mike Sharp
 

Author Comment

by:PLavelle
ID: 10902896
I changed it to UTF-16 encoding and it still writes out two-byte characters. VB6 interprets each of these two-byte characters as two characters instead of just one.
 
LVL 10

Expert Comment

by:Yury_Delendik
ID: 10902960
Try this

Dim sw As New System.IO.StreamWriter("test.xml", System.Text.Encoding.ASCII)
xmldoc.Save(sw)
sw.Close()
 

Author Comment

by:PLavelle
ID: 10903036
Yury, that just replaces those characters with question marks.
 
LVL 26

Expert Comment

by:rdcpro
ID: 10903113
Yury: The problem with that approach is that ASCII cannot represent those characters, so they are lost: each one is replaced by ASCII code point 63, which is a "?". It would be better to save the file as UTF-16 if it's subsequently going to be used in a string by VB6. For Intel-based (little-endian) systems:

Dim sw As New System.IO.StreamWriter("test.xml", System.Text.Encoding.Unicode)

PLavelle: How did you change the encoding to UTF-16? You can't do this simply by changing the declaration; you must explicitly change the encoding when the XML is being output. Post your code, so I can see what's going on.

Regards,
Mike Sharp
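
A minimal sketch of the difference described above, assuming a current .NET runtime; the module name is made up for illustration. It encodes the same two-character string with Encoding.ASCII and with Encoding.Unicode (little-endian UTF-16) and prints the resulting bytes.

```vbnet
Imports System
Imports System.Text
Imports Microsoft.VisualBasic

Module AsciiLossDemo
    Sub Main()
        ' "a" followed by U+00BF (inverted question mark).
        Dim original As String = "a" & ChrW(&HBF)

        ' ASCII has no slot for U+00BF, so it becomes 3F, the "?" character.
        Dim asciiBytes As Byte() = Encoding.ASCII.GetBytes(original)
        Console.WriteLine(BitConverter.ToString(asciiBytes)) ' prints 61-3F

        ' Little-endian UTF-16 keeps the character: two bytes per code unit.
        Dim utf16Bytes As Byte() = Encoding.Unicode.GetBytes(original)
        Console.WriteLine(BitConverter.ToString(utf16Bytes)) ' prints 61-00-BF-00
    End Sub
End Module
```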
 

Author Comment

by:PLavelle
ID: 10903154
        Dim iRow As Int32
        Dim iCol As Int32
        Dim sValue As String
        Dim xmlDoc As New XmlDocument
        Dim oElement As XmlElement
        Dim oAttribute As XmlAttribute
        ' Encoding.Unicode writes the file as little-endian UTF-16.
        Dim xmlWriter As New XmlTextWriter("c:\test2.xml", System.Text.Encoding.Unicode)

        ' Root element carrying the company and version attributes.
        oElement = xmlDoc.CreateElement("Table")
        oAttribute = xmlDoc.CreateAttribute("Company")
        oAttribute.Value = Constants.Company.Name
        oElement.Attributes.Append(oAttribute)
        oAttribute = xmlDoc.CreateAttribute("Version")
        oAttribute.Value = Constants.Versions.FundTable
        oElement.Attributes.Append(oAttribute)

        xmlDoc.AppendChild(oElement)
        ' One DocElement per grid row; each non-empty cell becomes an attribute.
        For iRow = 1 To grd.Rows.Count - 1
            oElement = xmlDoc.CreateElement("DocElement")
            For iCol = 0 To grd.Cols.Count - 1
                sValue = Trim(grd(iRow, iCol))
                If sValue <> "" Then
                    oAttribute = xmlDoc.CreateAttribute(grd.Cols(iCol).Name)
                    oAttribute.Value = sValue
                    oElement.Attributes.Append(oAttribute)
                End If
            Next
            xmlDoc.DocumentElement.AppendChild(oElement)
        Next

        xmlDoc.Save(xmlWriter)
        ' Close the writer so the file handle is released.
        xmlWriter.Close()
 

Author Comment

by:PLavelle
ID: 10903161
I had to change some names, but there is the code that writes out the XML from the grid.
 
LVL 26

Accepted Solution

by:
rdcpro earned 500 total points
ID: 10903297
Hmmm... I don't see anything wrong there. I'd say the data is getting messed up either before this (when the grid is filled) or during the load in VB6. You could try ISO-8859-1 for the output encoding, but I'd first try specifying:

Dim xmlWriter As New XmlTextWriter("c:\test2.xml", System.Text.Encoding.Default)

This will set the encoding to whatever your system's default code page is, and hopefully the VB6 app will figure it out when it reads the file.

Let me know how this works!

Regards,
Mike Sharp
 
LVL 10

Expert Comment

by:Yury_Delendik
ID: 10907704
What are you using to read the data in VB6? FileSystemObject or standard VB statements?
 

Author Comment

by:PLavelle
ID: 10929366
The ISO-8859-1 encoding seems to write out the correct character (1 byte). Thanks for all of your help.
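
For anyone landing here later, a minimal sketch of the ISO-8859-1 variant that worked. The exact line the asker used was not posted, so the temp-file path, the module name and the "Sample" attribute are assumptions for illustration, and File.ReadAllBytes needs .NET 2.0 or later.

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml
Imports Microsoft.VisualBasic

Module Latin1SaveDemo
    Sub Main()
        ' Build a tiny document containing U+00BF (inverted question mark).
        Dim xmlDoc As New XmlDocument()
        Dim root As XmlElement = xmlDoc.CreateElement("Table")
        root.SetAttribute("Sample", ChrW(&HBF)) ' hypothetical attribute
        xmlDoc.AppendChild(root)

        ' Hypothetical output location for the demo.
        Dim filePath As String = Path.Combine(Path.GetTempPath(), "latin1-demo.xml")
        Dim latin1 As Encoding = Encoding.GetEncoding("iso-8859-1")

        ' Same pattern as the code in the thread, with a different encoding.
        Dim writer As New XmlTextWriter(filePath, latin1)
        Try
            xmlDoc.Save(writer)
        Finally
            writer.Close()
        End Try

        ' True if the character was stored as the single byte &HBF.
        Dim raw As Byte() = File.ReadAllBytes(filePath)
        Console.WriteLine(Array.IndexOf(raw, CByte(&HBF)) >= 0)
    End Sub
End Module
```

ISO-8859-1 maps code points 0-255 directly to single bytes, which fits a reader that treats the file as one byte per character. Characters above 255 cannot be stored this way.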

Question has a verified solution.