Solved

Encoding ASCII characters 128-255 as character references

Posted on 2004-04-23
Last Modified: 2006-11-17
How can I use the XMLDocument object to encode characters in the ASCII range 128-255 as character references (such as &#191; for ¿) rather than the two-byte representation of the character?
Question by:PLavelle
13 Comments
 
LVL 26

Expert Comment

by:rdcpro
ID: 10902547
You can't.  In fact, if you start out with character references in an XML object, they will be parsed into their characters, and unless a character is a markup character, or one that should be escaped (according to HTML), it will be output as the character itself.  For example, the non-breaking space character, &#160;, is parsed and output as the actual character; it's not re-escaped.  In an XSLT transformation, some characters will be escaped, depending on whether you set <xsl:output method="html" />.

It sounds to me like you're experiencing an encoding problem, which can come from a number of different places.  If you're seeing two-byte characters, then I'm betting you're doing something where the XML (or the output of the transform) gets converted to a string.  This will force UTF-16, which is a switch in encoding, and can cause the wrong characters to be rendered. What exactly is going wrong?  

Regards,
Mike Sharp
 

Author Comment

by:PLavelle
ID: 10902619
I am using an XMLDocument object in .NET to write out the XML from a .NET application. When I read the XML into a VB6 application, the string contains two characters for every character over 127 in the ASCII table. For instance, ¿ is replaced by Â¿.
 

Author Comment

by:PLavelle
ID: 10902656
Also, the XML is read into VB6 as a string, not using the DOM.
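The symptom described above is what you would expect if the file is written as UTF-8 (the default when no encoding is given) and then read one byte per character. A minimal VB.NET sketch of that effect follows; the module name is made up for illustration, and the use of ISO-8859-1 to stand in for however VB6 decodes the file is an assumption.

```vbnet
Imports System
Imports System.Text

Module TwoCharacterDemo
    Sub Main()
        ' U+00BF (inverted question mark) is one character, but UTF-8 stores it as two bytes.
        Dim original As String = ChrW(&HBF).ToString()
        Dim utf8Bytes As Byte() = Encoding.UTF8.GetBytes(original)
        Console.WriteLine(BitConverter.ToString(utf8Bytes))   ' prints C2-BF

        ' Assumption: the reader decodes one byte per character (Latin-1 here).
        Dim misread As String = Encoding.GetEncoding("iso-8859-1").GetString(utf8Bytes)
        Console.WriteLine(misread.Length)                      ' prints 2
    End Sub
End Module
```

The two bytes C2 and BF come back as two separate characters, which matches one character turning into two on the VB6 side.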
 
LVL 26

Expert Comment

by:rdcpro
ID: 10902779
Strings are always UTF-16 (in both VB6 and .NET).  This is probably the issue you're having: VB6 thinks the XML is in a different encoding because of the XML declaration, or for some similar reason.  If you're going to use strings at any point, you need to make sure the file is written out as UTF-16.  In an XSLT, you could do this with:

<xsl:output method="xml" encoding="utf-16" />

Can you post some of the code that you use to write out the XML in .NET, as well as a snippet of the XML (including the declaration, if present)?  


Regards,
Mike Sharp
 

Author Comment

by:PLavelle
ID: 10902896
I changed it to UTF-16 encoding and it still writes out two byte characters. VB6 interprets these two byte characters as two characters instead of just one.
 
LVL 10

Expert Comment

by:Yury_Delendik
ID: 10902960
Try this

' False = overwrite the file rather than append to it
Dim sw As New System.IO.StreamWriter("test.xml", False, System.Text.Encoding.ASCII)
xmldoc.Save(sw)
sw.Close()
 

Author Comment

by:PLavelle
ID: 10903036
Yury, that just replaces those characters with question marks.
 
LVL 26

Expert Comment

by:rdcpro
ID: 10903113
Yury:  The problem with that approach is that ASCII cannot represent anything above code point 127, so those characters are lost: each one is replaced by ASCII code point 63, which is a "?".   It would be better to save it in UTF-16, if it's subsequently going to be used in a string by VB6.  For Intel-based (little-endian) systems:

Dim sw As New System.IO.StreamWriter("test.xml", False, System.Text.Encoding.Unicode)

PLavelle:  How did you change the encoding to UTF-16?  You can't do this simply by changing the declaration; you must explicitly change the encoding when it's being output.  Post your code, so I can see what's going on.

Regards,
Mike Sharp
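To make the difference between the encodings discussed so far concrete, here is a small VB.NET sketch that prints the bytes each one produces for the same character. The module and variable names are illustrative only, not taken from the thread.

```vbnet
Imports System
Imports System.Text

Module EncodingComparison
    Sub Main()
        ' One character above 127: U+00BF (inverted question mark).
        Dim sample As String = ChrW(&HBF).ToString()
        Dim names As String() = {"us-ascii", "utf-8", "utf-16", "iso-8859-1"}

        For Each name As String In names
            Dim enc As Encoding = Encoding.GetEncoding(name)
            ' GetBytes returns only the encoded character, without any byte order mark.
            Console.WriteLine(name & ": " & BitConverter.ToString(enc.GetBytes(sample)))
        Next
        ' prints:
        ' us-ascii: 3F
        ' utf-8: C2-BF
        ' utf-16: BF-00
        ' iso-8859-1: BF
    End Sub
End Module
```

ASCII substitutes 3F ("?"), UTF-8 and UTF-16 each use two bytes, and only ISO-8859-1 stores the character in a single byte.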
 

Author Comment

by:PLavelle
ID: 10903154
        Dim iRow As Int32
        Dim iCol As Int32
        Dim sValue As String
        Dim xmlDoc As New XmlDocument
        Dim oElement As XmlElement
        Dim oAttribute As XmlAttribute
        Dim xmlWriter As New XmlTextWriter("c:\test2.xml", System.Text.Encoding.Unicode)

        oElement = xmlDoc.CreateElement("Table")
        oAttribute = xmlDoc.CreateAttribute("Company")
        oAttribute.Value = Constants.Company.Name
        oElement.Attributes.Append(oAttribute)
        oAttribute = xmlDoc.CreateAttribute("Version")
        oAttribute.Value = Constants.Versions.FundTable
        oElement.Attributes.Append(oAttribute)

        xmlDoc.AppendChild(oElement)
        For iRow = 1 To grd.Rows.Count - 1
            oElement = xmlDoc.CreateElement("DocElement")
            For iCol = 0 To grd.Cols.Count - 1
                sValue = Trim(grd(iRow, iCol))
                If Trim(sValue) <> "" Then
                    oAttribute = xmlDoc.CreateAttribute(grd.Cols(iCol).Name)
                    oAttribute.Value = sValue
                    oElement.Attributes.Append(oAttribute)
                End If
            Next
            xmlDoc.DocumentElement.AppendChild(oElement)
        Next

        xmlDoc.Save(xmlWriter)
        ' Close the writer so buffered output is flushed and the file is released
        xmlWriter.Close()
 

Author Comment

by:PLavelle
ID: 10903161
I had to change some names, but there is the code that writes out the XML from the grid.
 
LVL 26

Accepted Solution

by:
rdcpro earned 2000 total points
ID: 10903297
Hmmm... I don't see anything wrong there.  I'd say the data is messed up either before this (when the grd is filled out) or during the load in VB6.  You could try ISO-8859-1 for the output encoding...but I'd first try specifying

Dim xmlWriter As New XmlTextWriter("c:\test2.xml", System.Text.Encoding.Default)

This will cause the encoding to be whatever your system's default code page is, and hopefully the VB6 app will figure it out when it reads the file.

Let me know how this works!

Regards,
Mike Sharp
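The ISO-8859-1 suggestion is not shown in code in the thread, so here is one way it might look. This is a sketch under assumptions: the temp-file name, the sample attribute value, and the module name are all hypothetical, and File.ReadAllBytes requires .NET 2.0 or later.

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml

Module Latin1WriterDemo
    Sub Main()
        ' Hypothetical output location; the thread writes to c:\test2.xml instead.
        Dim outputFile As String = System.IO.Path.Combine(System.IO.Path.GetTempPath(), "latin1-demo.xml")

        Dim doc As New XmlDocument()
        Dim root As XmlElement = doc.CreateElement("Table")
        root.SetAttribute("Company", "Espa" & ChrW(&HF1) & "a")   ' contains U+00F1
        doc.AppendChild(root)

        ' Ask for ISO-8859-1 explicitly instead of relying on the system code page.
        Dim latin1 As Encoding = Encoding.GetEncoding("iso-8859-1")
        Dim writer As New XmlTextWriter(outputFile, latin1)
        Try
            doc.Save(writer)
        Finally
            writer.Close()
        End Try

        ' U+00F1 should appear in the file as the single byte F1.
        Dim bytes As Byte() = File.ReadAllBytes(outputFile)
        Console.WriteLine(Array.IndexOf(bytes, CByte(&HF1)) >= 0)   ' expected: True
    End Sub
End Module
```

Unlike Encoding.Default, this does not depend on the code page of the machine that writes the file, though the reader still has to decode the bytes as ISO-8859-1 or a compatible code page.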
 
LVL 10

Expert Comment

by:Yury_Delendik
ID: 10907704
What are you using to read data in VB6? The FileSystemObject or standard VB statements?
 

Author Comment

by:PLavelle
ID: 10929366
The ISO-8859-1 encoding seems to write out the correct character (1 byte). Thanks for all of your help.
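For readers who do need character references rather than a different encoding, the following sketch may be worth trying. It assumes .NET 2.0 or later, which postdates this thread: a writer created with XmlWriter.Create should escape characters that its target encoding cannot represent as character references instead of replacing them with "?". The module name and sample value are illustrative.

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml

Module CharacterReferenceDemo
    Sub Main()
        Dim doc As New XmlDocument()
        Dim root As XmlElement = doc.CreateElement("Table")
        ' Sample value containing U+00BF and U+00E9, both above 127.
        root.SetAttribute("Company", ChrW(&HBF) & "Qu" & ChrW(&HE9) & "?")
        doc.AppendChild(root)

        ' Target plain ASCII so that anything above 127 cannot be written directly.
        Dim settings As New XmlWriterSettings()
        settings.Encoding = Encoding.ASCII

        Dim output As New MemoryStream()
        Dim writer As XmlWriter = XmlWriter.Create(output, settings)
        doc.Save(writer)
        writer.Close()

        ' Inspect the result; the non-ASCII characters should appear as &#...; references.
        Console.WriteLine(Encoding.ASCII.GetString(output.ToArray()))
    End Sub
End Module
```

Because the output is pure ASCII, a reader that treats the file as a plain string sees one byte per character, and any XML parser turns the references back into the original characters.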


Question has a verified solution.
