Problems with encoding with the MSXML2.XMLHTTP object?
We have an .htm page that gets its data via VB business components. The data is bound to XML data islands in the HTML page.
When users add a record or edit an existing record with special characters, e.g. ®"£, these are getting lost or converted
to weird characters. E.g. £ gets converted to B#.
Environment: Windows, IIS, VB 6.0. We are using the send method of XMLHTTP to send the XML data islands via POST
to the VB object. The data in the VB object contains these weird characters. I am guessing it is to do with encoding.
Any ideas as to how to fix this would really be helpful!
What is the encoding of the web page?
ASKER
There is no encoding declared in the web page, not even in a meta tag. Do we have to declare one? And if we do, will it solve this problem?
At this point I'm trying to get a handle on what's happening to see if I can recreate it.
>£ gets converted to B#.
That's a very strange translation that I wouldn't expect from an encoding error. Can you post the code used to send the XML data?
ASKER
We have this application running in two modes: app mode and web mode. These results are from the business component, not the front-end XML. The front-end XML characters look good; it is in the business component that they get messed up. For app mode we have found a fix that converts the UTF-16 to BSTR encoding. What web mode is doing, I have no clue! Would appending encoding=UTF-8 to the header solve my problem?
In app mode this is the result for £:
<?xml version="1.0"?>
<data><test>NEERAJ MEATLOAF TEST Â£</test></data>
In webmode this is the result
<?xml version="1.0"?>
<data><test>NEERAJ MEATLOAF TEST B#</test></data>
If I understand what you're saying, yes, adding the encoding scheme to the HTML will cause the data to be displayed correctly.
Try adding this tag to the head of your HTML page:
<meta http-equiv="content-type" content="text/html; charset=UTF-8" />
ASKER
Paul, thanks for your post. I am heading for lunch; I am going to try this and give feedback. Just wondering what the implications of adding this would be? FYI, our software could go international. Would this support Chinese characters, etc.?
ASKER
Paul, that did not work. I also added this to my XMLHTTP object: myXMLhttp.setRequestHeader("Content-Type", "text/xml; charset=utf-8"); That did not work either; I get the same conversions as above! Any more ideas?
Not without knowing the structure of the app and seeing some of the code.
ASKER
html page:
----------
<xml id="xmlIngredients"></xml>
<html>
<body>
<input id="inptestName" class="inputBox" datafld="test" maxlength="60" value="" size="65" NAME="inptestName" />
</body>
</html>
JavaScript functions included in the .htm page:
-----------------------------------------------
function save()
{
    var testObj = new testStream();
    var sampleXMLHTTP = new ActiveXObject("MSXML2.XMLHTTP");
    data = xmlTest.xml;
    araObj.addData(data);
    //return;
    sampleXMLHTTP.open("POST", getObjectPath() + "?obj=RCP_PageHelper.RecipeMaintenance&meth=saveRecipe&params=" + RecipeID, false);
    // this was not present before just now
    sampleXMLHTTP.setRequestHeader("Content-Type", "text/xml; charset=utf-8");
    sampleXMLHTTP.send(testObj.testStream());
}
testStream() is our own function, which creates an MSXML2.DOMDocument, adds
certain tags to the XML and then sends the following XML to the VB object. The output
of testObj.testStream() is below:
<?xml version="1.0"?>
<data><test>My TEST £</test></data>
Hope this helps. Please let me know if you need any clarifications.
Where is the translation problem occurring? In the browser or on the server?
If on the server, what codepage does the server use?
>the testStream() object is our own function
The default encoding of an XML document is UTF-8. This explains why you are seeing
<data><test>NEERAJ MEATLOAF TEST Â£</test></data>
This is typical of UTF-8 being transformed to the US code page (ASCII). However, the transformation of the same data to B# indicates a transformation to a different code page. It looks to me like Shift JIS, a Japanese-language code page.
ASKER
Paul, the problem is on the server (business component), which is in VB 6.0. The data looks good before it goes to the server in both app mode and web mode. My guess is that VB 6.0 uses BSTR Unicode encoding; am I right? We have not changed any of the VB defaults. Can you please give some insight as to why £ is being converted to B# in web mode?
ASKER
Just as another example: in web mode é gets converted to C), but in app mode é gets converted to Ã©.
ASKER
I do not know where those junk characters have come from. This is the other example I see:
E.g.: in web mode é gets converted to C) (this is the converted character before it goes to the below function), but in app mode é gets converted to Ã©.
ASKER
Paul, please ignore my last two posts. This is the example I wanted to share:
E.g.: in web mode é gets converted to C), but in app mode é gets converted to Ã©.
ASKER
I checked the code language: it is set to United States (Language).
Regional and Language Options -> Advanced -> United States (Language). Below this there are a couple of check boxes that are checked. At the bottom there is an unchecked check box which pertains to default user account settings. Do I need to change anything here?
Server or Client? Both the same?
ASKER
Sorry, both are the same. I am running my local computer as localhost, which is the server.
Any way you can make a sample app that displays the same problem for upload to ee-stuff.com? Just extract the bare minimum that displays the same problem.
Without being able to replicate the problem on my own machine, I'm just shooting blanks in the dark.
ASKER
Paul, so do you mean you cannot replicate this problem despite using VB and the XMLHTTP object? Do you think there are some default settings on my machine, or some problem with the frameworks that we are using?
ASKER
Paul, OK: there was a controller which had a US-ASCII setting. We changed that to UTF-8 and it works great in web mode. But in app mode £ still gets converted to Â£. We are putting in a temporary fix until we find a permanent solution. Any ideas how we can fix it in app mode?
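A side note on the US-ASCII finding: it fits both symptoms reported in this thread. B# and C) are exactly what the UTF-8 bytes of £ and é become when the top bit of each byte is dropped, as a 7-bit US-ASCII step would do, and the app-mode results are what the same bytes look like when each one is read as a separate single-byte character. The sketch below only reproduces that byte arithmetic. It assumes a modern JavaScript runtime with TextEncoder (not the IE/JScript environment of the original page), and all function names are made up for illustration.

```javascript
// UTF-8 encode a string into an array of byte values (0-255).
function utf8Bytes(text) {
  return Array.from(new TextEncoder().encode(text));
}

// Simulate a 7-bit US-ASCII step: drop the high bit of every byte.
function stripHighBit(bytes) {
  return bytes.map(function (b) { return b & 0x7f; });
}

// Simulate reading every byte as one single-byte character.
// For the bytes used here (0xA0-0xFF) Latin-1 and Windows-1252 agree.
function asSingleByteChars(bytes) {
  return String.fromCharCode.apply(null, bytes);
}

var pound = utf8Bytes("\u00A3");  // £ is the two bytes 0xC2 0xA3
var eAcute = utf8Bytes("\u00E9"); // é is the two bytes 0xC3 0xA9

// Web mode symptom: high bit stripped.
var webModePound = asSingleByteChars(stripHighBit(pound));   // "B#"
var webModeEAcute = asSingleByteChars(stripHighBit(eAcute)); // "C)"

// App mode symptom: UTF-8 bytes read one character per byte.
var appModePound = asSingleByteChars(pound);   // "\u00C2\u00A3", shown as Â£
var appModeEAcute = asSingleByteChars(eAcute); // "\u00C3\u00A9", shown as Ã©
```

The web-mode damage is lossy (the dropped bits cannot be recovered afterwards), which is why it had to be fixed at the US-ASCII setting itself, while the app-mode damage can still be undone by decoding the bytes as UTF-8.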
Try this (note that the declaration of MultiByteToWideChar is important; don't change it):
Option Explicit
' lpWideCharStr is declared ByVal ... As Long so that StrPtr() can pass the buffer address.
Private Declare Function MultiByteToWideChar Lib "kernel32" (ByVal CodePage As Long, ByVal dwFlags As Long, ByRef lpMultiByteStr As Any, ByVal cchMultiByte As Long, ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long
Private Const CP_UTF8 As Long = 65001
Private Const MB_PRECOMPOSED As Long = &H1
Private Sub Command1_Click()
    Dim strUTF8 As String
    ' The UTF-8 bytes of "£¢ab" as they look when misread as ANSI text
    strUTF8 = "Â£Â¢ab"
    MsgBox Utf8ToAscii(strUTF8)
End Sub
' Treats each character of strM as one UTF-8 byte and returns the decoded Unicode string.
Public Function Utf8ToAscii(ByVal strM As String) As String
    Dim lngMSize As Long, lngWSize As Long
    Dim strWide As String, bytUtf8() As Byte
    Dim lngRes As Long
    If LenB(strM) = 0 Then Exit Function
    ' Convert the string back to raw bytes using the system ANSI code page
    bytUtf8 = StrConv(strM, vbFromUnicode)
    lngMSize = UBound(bytUtf8) + 1
    ' Output buffer size in characters; more than the decoded string can need
    lngWSize = lngMSize * 2
    strWide = String$(lngWSize, vbNullChar)
    ' Decode the bytes as UTF-8; the return value is the number of characters written
    lngRes = MultiByteToWideChar(CP_UTF8, 0, bytUtf8(0), lngMSize, StrPtr(strWide), lngWSize)
    Utf8ToAscii = Left$(strWide, lngRes)
End Function
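For anyone who wants to see the idea behind the VB function without a VB 6.0 environment, here is a rough JavaScript sketch of the same repair: turn each character back into one byte, then decode the byte sequence as UTF-8. It is an illustration, not a port. The helper name repairUtf8Mojibake is hypothetical, it assumes a runtime with TextDecoder, and it maps characters to bytes Latin-1 style (character code equals byte value), whereas StrConv with vbFromUnicode uses the system ANSI code page. As a result, input containing Windows-1252 characters above U+00FF (for example the euro sign) is rejected here rather than converted.

```javascript
// Hypothetical helper, not part of the thread's code: undo "UTF-8 read as
// single-byte text" by rebuilding the bytes and decoding them as UTF-8.
function repairUtf8Mojibake(garbled) {
  var bytes = new Uint8Array(garbled.length);
  for (var i = 0; i < garbled.length; i++) {
    var code = garbled.charCodeAt(i);
    if (code > 0xff) {
      // Assumption of this sketch: every character stands for exactly one byte.
      throw new RangeError("character at index " + i + " does not fit in one byte");
    }
    bytes[i] = code;
  }
  // fatal: true makes invalid UTF-8 throw instead of silently yielding U+FFFD.
  return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
}

// Same sample as the VB test button: the bytes C2 A3 C2 A2 61 62.
var repaired = repairUtf8Mojibake("\u00C2\u00A3\u00C2\u00A2ab");
```

Here repaired holds the four characters £¢ab. A string that was never mis-decoded, such as a lone £, is not valid UTF-8 as bytes and makes the helper throw, which is a reminder that this kind of repair should only be applied to data known to be damaged in this particular way.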