Solved

Encoding ASCII characters 128-255 as character references

Posted on 2004-04-23
Last Modified: 2006-11-17
How can I use the XMLDocument object to encode characters in the ASCII range 128-255 as character references (such as &#191; for ¿) rather than the two-byte representation of the character?
Question by:PLavelle
13 Comments
 
LVL 26

Expert Comment

by:rdcpro
ID: 10902547
You can't. In fact, if you start out with character references in an XML object, they will be parsed into their characters, and unless a character is a markup character, or one that should be escaped (according to HTML), it will be output as the character itself. For example, the non-breaking space character, &#160;, is parsed and output as the actual character... it's not re-escaped. In an XSLT transformation, some characters will be escaped, depending on whether you set <xsl:output method="html" />.

It sounds to me like you're experiencing an encoding problem, which can come from a number of different places.  If you're seeing two-byte characters, then I'm betting you're doing something where the XML (or the output of the transform) gets converted to a string.  This will force UTF-16, which is a switch in encoding, and can cause the wrong characters to be rendered. What exactly is going wrong?  

Regards,
Mike Sharp
 

Author Comment

by:PLavelle
ID: 10902619
I am using an XMLDocument object in .NET to write out the XML from a .NET application. When I read the XML into a VB6 application, the string contains two characters for every character over 127 in the ASCII table. For instance, ¿ is replaced by Â¿.
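
The symptom described here is what you would expect if the file holds UTF-8 bytes and the reader treats each byte as one character. Below is a minimal sketch of that effect. It assumes the file was saved as UTF-8 (which is what XmlDocument.Save produces when no other encoding is declared or requested) and that the VB6 side reads it through a single-byte code page. ISO-8859-1 stands in for that code page here, and the module name is made up for illustration:

```vbnet
Imports System
Imports System.Text

Module MojibakeDemo
    Sub Main()
        ' The character from the question: inverted question mark, code point 191.
        Dim original As String = ChrW(&HBF)

        ' Encoded as UTF-8, this one character becomes two bytes.
        Dim utf8Bytes As Byte() = Encoding.UTF8.GetBytes(original)
        Console.WriteLine(BitConverter.ToString(utf8Bytes))   ' C2-BF

        ' Assumption: the reader decodes with a single-byte code page,
        ' so every byte is turned into its own character.
        Dim singleByte As Encoding = Encoding.GetEncoding("iso-8859-1")
        Dim misread As String = singleByte.GetString(utf8Bytes)
        Console.WriteLine(misread.Length)                     ' 2

        ' Code points of the misread string: 194 (Â) followed by 191 (¿).
        For Each c As Char In misread
            Console.WriteLine(AscW(c))
        Next
    End Sub
End Module
```

If this is what is happening, the file itself is not corrupt; the writer and the reader simply disagree about the encoding.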
 

Author Comment

by:PLavelle
ID: 10902656
Also, the XML is read into VB6 as a string, not using the DOM.
 
LVL 26

Expert Comment

by:rdcpro
ID: 10902779
In-memory strings are always UTF-16, in both .NET and VB6. This is probably the issue you're having. VB6 thinks the XML is in a different encoding because of the XML declaration... or for some similar reason. If you're going to use strings at any point, you need to make sure the file is written out as UTF-16. In an XSLT, you could do this with:

<xsl:output method="xml" encoding="utf-16" />

Can you post some of the code that you use to write out the XML in .NET, as well as a snippet of the XML (including the declaration, if present)?  


Regards,
Mike Sharp
 

Author Comment

by:PLavelle
ID: 10902896
I changed it to UTF-16 encoding and it still writes out two byte characters. VB6 interprets these two byte characters as two characters instead of just one.
 
LVL 10

Expert Comment

by:Yury_Delendik
ID: 10902960
Try this

Dim sw As New System.IO.StreamWriter("test.xml", System.Text.Encoding.ASCII)
xmldoc.Save(sw)
sw.Close()
 

Author Comment

by:PLavelle
ID: 10903036
Yury, that just replaces those characters with question marks.
 
LVL 26

Expert Comment

by:rdcpro
ID: 10903113
Yury: The problem with that approach is that ASCII is a 7-bit encoding, so any character above 127 cannot be represented and is lost. Each such character is replaced by ASCII code point 63, which is a "?". It would be better to save the file in UTF-16 if it's subsequently going to be used in a string by VB6. For Intel-based (little-endian) systems:

Dim sw As New System.IO.StreamWriter("test.xml", System.Text.Encoding.Unicode)

PLavelle: How did you change the encoding to UTF-16? You can't do this simply by changing the declaration; you must explicitly change the encoding when the document is being output. Post your code so I can see what's going on.

Regards,
Mike Sharp
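
A note for later readers: the question marks come from handing the document a plain StreamWriter, which knows nothing about XML escaping. The sketch below shows one way to get numeric character references instead. It assumes .NET Framework 2.0 or later, which had not been released when this thread was written, and relies on XmlWriter.Create with XmlWriterSettings.Encoding set to ASCII. To my understanding, a writer created this way escapes characters the target encoding cannot represent rather than dropping them. The element name, attribute value, and module name are illustrative only:

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml

Module CharRefDemo
    Sub Main()
        ' Build a tiny document containing one character above 127.
        Dim doc As New XmlDocument()
        Dim root As XmlElement = doc.CreateElement("Table")
        root.SetAttribute("Company", "A" & ChrW(&HBF) & "B")
        doc.AppendChild(root)

        ' Ask the XML writer itself (not a StreamWriter) to target ASCII.
        Dim settings As New XmlWriterSettings()
        settings.Encoding = Encoding.ASCII

        Using buffer As New MemoryStream()
            Using writer As XmlWriter = XmlWriter.Create(buffer, settings)
                doc.Save(writer)
            End Using

            ' Every byte is plain ASCII, so decoding as ASCII is lossless.
            ' The attribute should show a character reference for code
            ' point 191 (hexadecimal BF) rather than a "?".
            Console.WriteLine(Encoding.ASCII.GetString(buffer.ToArray()))
        End Using
    End Sub
End Module
```

Because the output is pure ASCII, a reader that takes the file in as a raw string sees the references literally; they only turn back into characters when an XML parser loads the file.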
 

Author Comment

by:PLavelle
ID: 10903154
        ' Requires "Imports System.Xml" at the top of the file.
        ' grd is the grid control whose contents are being exported.
        Dim iRow As Int32
        Dim iCol As Int32
        Dim sValue As String
        Dim xmlDoc As New XmlDocument
        Dim oElement As XmlElement
        Dim oAttribute As XmlAttribute
        ' Encoding.Unicode is little-endian UTF-16.
        Dim xmlWriter As New XmlTextWriter("c:\test2.xml", System.Text.Encoding.Unicode)

        ' Root element with the company and version attributes.
        oElement = xmlDoc.CreateElement("Table")
        oAttribute = xmlDoc.CreateAttribute("Company")
        oAttribute.Value = Constants.Company.Name
        oElement.Attributes.Append(oAttribute)
        oAttribute = xmlDoc.CreateAttribute("Version")
        oAttribute.Value = Constants.Versions.FundTable
        oElement.Attributes.Append(oAttribute)

        xmlDoc.AppendChild(oElement)

        ' One DocElement per grid row (starting at row 1), one attribute per non-empty cell.
        For iRow = 1 To grd.Rows.Count - 1
            oElement = xmlDoc.CreateElement("DocElement")
            For iCol = 0 To grd.Cols.Count - 1
                sValue = Trim(grd(iRow, iCol))
                If sValue <> "" Then
                    oAttribute = xmlDoc.CreateAttribute(grd.Cols(iCol).Name)
                    oAttribute.Value = sValue
                    oElement.Attributes.Append(oAttribute)
                End If
            Next
            xmlDoc.DocumentElement.AppendChild(oElement)
        Next

        xmlDoc.Save(xmlWriter)
        ' Close the writer so the file handle is released.
        xmlWriter.Close()
 

Author Comment

by:PLavelle
ID: 10903161
I had to change some names, but there is the code that writes out the XML from the grid.
 
LVL 26

Accepted Solution

by:
rdcpro earned 2000 total points
ID: 10903297
Hmmm... I don't see anything wrong there.  I'd say the data is messed up either before this (when the grd is filled out) or during the load in VB6.  You could try ISO-8859-1 for the output encoding...but I'd first try to specify

Dim xmlWriter As New XmlTextWriter("c:\test2.xml", System.Text.Encoding.Default)

This will cause the encoding to be whatever your system's default code page is, and hopefully the VB6 app will figure it out when it reads the file.

Let me know how this works!

Regards,
Mike Sharp
 
LVL 10

Expert Comment

by:Yury_Delendik
ID: 10907704
What are you using to read the data in VB6? The FileSystemObject or standard VB statements?
 

Author Comment

by:PLavelle
ID: 10929366
The ISO-8859-1 encoding seems to write out the correct character (1 byte). Thanks for all of your help.
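
For anyone wanting to reproduce the fix that worked, here is a minimal sketch of writing the document as ISO-8859-1 and checking the bytes on disk. It assumes .NET Framework 2.0 or later for File.ReadAllBytes; the temp-file name, element name, and module name are placeholders rather than anything from the original application:

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml

Module Latin1Demo
    Sub Main()
        ' A tiny document containing one character above 127.
        Dim doc As New XmlDocument()
        Dim root As XmlElement = doc.CreateElement("Table")
        root.SetAttribute("Company", ChrW(&HBF))
        doc.AppendChild(root)

        ' Hypothetical output location, used only for this demonstration.
        Dim fileName As String = Path.Combine(Path.GetTempPath(), "latin1-demo.xml")

        ' Same constructor as in the thread, with ISO-8859-1 as the encoding.
        Dim writer As New XmlTextWriter(fileName, Encoding.GetEncoding("iso-8859-1"))
        doc.Save(writer)
        writer.Close()

        Dim bytes As Byte() = File.ReadAllBytes(fileName)

        ' True if the character was stored as the single byte BF.
        Console.WriteLine(Array.IndexOf(bytes, CByte(&HBF)) >= 0)

        ' True only if a UTF-8 lead byte C2 is present, which would mean
        ' the character had been written as two bytes.
        Console.WriteLine(Array.IndexOf(bytes, CByte(&HC2)) >= 0)
    End Sub
End Module
```

ISO-8859-1 covers exactly code points 0-255 with one byte each, which is why it suits a reader that treats the file as single-byte text. Characters above 255 would not survive this encoding.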