BigRat
String to Wide String Problems
I'm not a VB programmer. Somebody has written me a bit of code that returns a VARIANT from a C string as follows :-
ptr = some_c_function_returning_pointer
and then
VBFunctionName = CVar(CopyString(ptr))
Clearly this copies the bytes into the variant.
But I access this function from JavaScript via COM, and COM converts the VB variant into a BSTR. In doing so it executes an ANSI to Unicode conversion which messes up all bytes whose hex values are between 80 and 9F. Worse, the transformation depends on the Windows configuration (i.e. whether the code page is 1250, 1252, etc.).
1) Is there anything I can do to overcome this conversion, so that I get one byte per 16 bits in the JavaScript string WITHOUT altering the program?
and
2) How would you advise the VB programmer to change the function to overcome the problem BUT remain compatible with existing VB code (i.e. calling the function from within VB)?
50 grade A cheese for each question, more if there is a really good solution
thanks.
The best way to return a string from C++ so that VB can use it without problems is as a BSTR.
If this is VB6, add an extra function call before returning the string:
VBFunction = StrConv(strDoubleByte, vbFromUnicode)
This will give you your single-byte character string, but it also means other VB programs will now receive this single-byte character string. Perhaps add an optional parameter to the COM function call that you can set for your needs, leaving other callers unaffected, e.g.:
Function VBCOMFunction([Other parameters], Optional ReturnSingleByte As Boolean = False)
'... other code that fills strVariable
If ReturnSingleByte = True Then
' caller asked for the single-byte form
VBCOMFunction = StrConv(strVariable, vbFromUnicode)
Else
' existing callers keep getting the unchanged string
VBCOMFunction = strVariable
End If
End Function
If this COM object already has binary compatibility set, and the COM developer doesn't want to break it by adding this new optional parameter, he can instead add a property to the object called ReturnSingleByte that defaults to False, and you can set that as needed (since adding a property won't break the compatibility).
ASKER
The actual function is declared as :-
Public Property Get RawData() As Variant
Dim l As Long
Dim s As Long
l = ZOOM_record_get(hZOOM_record, "raw", s)
RawData = CVar(CopyString(l))
End Property
and I suppose that CVar produces a Variant containing an 8-bit string.
I presume that the function "StrConv(strVariable, vbFromUnicode)" is meant to convert the 16-bit Unicode back into 8-bit characters. This transformation won't work, since the conversion from Windows-1252 to Unicode is NOT one-to-one. Characters (decimal) 129, 141, 143, 144 and 157 have no mapping!
The function CopyString looks like :-
Function CopyString(ptr As Long) As String
Dim l As Long
Dim ret As String
If ptr = 0 Then
ret = ""
Else
l = lstrlen(ptr)
ret = String(l, vbNullChar)
l = lstrcpy(ret, ptr)
End If
CopyString = ret
End Function
and I suspect that I will have to have an additional function RawDataW which calls a CopyStringW which gets the 8-bit bytes from the ptr variable into a wide character string WITHOUT any conversion.
Sounds sensible? If so how?
1. What about returning an array of bytes from the VB function?
or
2. Change the last line as follows:
VBFunctionName = CStr(CVar(CopyString(ptr)))
HTH
In JScript you can try the charCodeAt(index) method of the String object to retrieve the character code of each character.
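As a rough illustration of that suggestion, the sketch below uses charCodeAt to look at each 16-bit code unit and, assuming the ANSI to Unicode step used code page 1252, maps the code points that Windows-1252 assigns to bytes 80 to 9F back to the original byte values. The names CP1252_TO_BYTE and unwidenCp1252 are made up for this example, and a machine running another code page (1250, say) would need a different table, so treat it as a workaround sketch rather than a general fix.

```javascript
// Unicode code points that Windows-1252 assigns to bytes 0x80-0x9F.
// Bytes 0x81, 0x8D, 0x8F, 0x90 and 0x9D are undefined in Windows-1252,
// so they have no entry in this table.
var CP1252_TO_BYTE = {
  0x20AC: 0x80, 0x201A: 0x82, 0x0192: 0x83, 0x201E: 0x84, 0x2026: 0x85,
  0x2020: 0x86, 0x2021: 0x87, 0x02C6: 0x88, 0x2030: 0x89, 0x0160: 0x8A,
  0x2039: 0x8B, 0x0152: 0x8C, 0x017D: 0x8E, 0x2018: 0x91, 0x2019: 0x92,
  0x201C: 0x93, 0x201D: 0x94, 0x2022: 0x95, 0x2013: 0x96, 0x2014: 0x97,
  0x02DC: 0x98, 0x2122: 0x99, 0x0161: 0x9A, 0x203A: 0x9B, 0x0153: 0x9C,
  0x017E: 0x9E, 0x0178: 0x9F
};

// Returns an array of byte values (0-255) recovered from a string that is
// ASSUMED to have been produced by a CP1252 -> Unicode conversion.
// Throws if a code unit cannot correspond to any single CP1252 byte.
function unwidenCp1252(str) {
  var bytes = [];
  for (var i = 0; i < str.length; i++) {
    var code = str.charCodeAt(i);
    if (CP1252_TO_BYTE.hasOwnProperty(code)) {
      // One of the 27 remapped characters: restore the original byte.
      bytes.push(CP1252_TO_BYTE[code]);
    } else if (code <= 0xFF) {
      // Everything else below 0x100 passes through with its value unchanged.
      bytes.push(code);
    } else {
      throw new Error("Code unit " + code + " at index " + i +
                      " is not a CP1252 character");
    }
  }
  return bytes;
}
```

For example, unwidenCp1252("\u20AC") returns an array holding the single byte 0x80. Whether the five undefined bytes survive the round trip depends on what the conversion did with them, which this sketch cannot know.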
ASKER
zlatev:
1. How? I'm NOT a VB programmer. And what is the mapping in JScript (in IE)?
2. Does this widen the string WITHOUT doing any character conversions?
PS: I'm going away for three weeks holiday. Back on 23 September.
1. Have no spare time to code it for you now.
2. Yes.
CU l8r
An example of calling the WideCharToMultiByte API from VB:
http://www.vbcity.com/forums/topic.asp?tid=1881
@BigRat, do you still need help with this?
If so, can you provide a working dummy code example (I mean something I can compile and debug to check what happens)?
In my view it could also be a Unicode encoding problem.
Note that there are at least 4 different Unicode encodings:
UCS-2, UCS-4, UTF-8, UTF-16
Some of them, like UCS-2 and UCS-4, have big-endian and little-endian variations.
See: http://czyborra.com/utf/ for more info.
If the problem is a lack of support for a particular Unicode encoding, my suggestion would be to convert to another Unicode format.
Kind Regards,
Zlatin Zlatev
ASKER
I'd completely forgotten to follow this one up, since none of the answers directly helped me.
The situation has however changed. The object in question is the Visual Basic (ActiveX) implementation of the Zoom standard (www.zoom.org). A similar problem occurs in Java when the []byte returned from a Z39.50 server gets converted to String type. Importantly, the data is BYTE and not CHAR, so any conversion into a CHAR-type object must be done without recourse to any translation table, just by simple widening (extending to 16 bits with zeros).
The Zoom standard has been changed to give a []byte interface, and this I now use without any problems.
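For readers who want to see what "simply widening" looks like on the JavaScript side, here is a minimal sketch. It assumes the bytes are already available as an ordinary array of numbers; in JScript under IE, a byte array coming back through COM would probably arrive as a SAFEARRAY and need converting first (for instance with new VBArray(raw).toArray()), which is an assumption about how the interface is exposed. The function name widenBytes is invented for this example.

```javascript
// Widen each byte (0-255) to one 16-bit code unit by zero-extension.
// No code page or translation table is consulted, so byte 0x80 becomes
// code unit 0x0080, byte 0x9F becomes 0x009F, and so on.
function widenBytes(bytes) {
  var chars = [];
  for (var i = 0; i < bytes.length; i++) {
    // Mask to 8 bits in case the source array holds signed values.
    chars.push(String.fromCharCode(bytes[i] & 0xFF));
  }
  return chars.join("");
}

// Usage: every byte can be read back unchanged with charCodeAt.
var widened = widenBytes([0x41, 0x80, 0x9F, 0xFF]);
var secondByte = widened.charCodeAt(1); // 0x80, not a code-page translation
```

The resulting string is only a container for byte values; displaying it as text would still require knowing which character set the server actually used.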
That was one of those silly problems where the standard was incomplete. You know that you're getting a certain type of data back, but you've no idea how it is coded. HTML had this problem at the beginning. Now you know that the document DOM is Unicode and that a text/* MIME type implies ISO-8859-1 unless qualified by a charset= attribute. In spite of this you still see problems (as Roman Czyborra says!) on the Web.
Now my problem is who to give the points to, in particular since I stipulated :-
2. Does this widen the string WITHOUT doing any character conversions?
<recommendation>
?????
</recommendation>
@BigRat, from me, there is no problem if you decide to close this as PAQ with refund.
As for "2. Does this widen the string WITHOUT doing any character conversions?", I cannot be sure until I have tested it. In my view it may or may not do character conversions; what we actually have in memory would need to be checked carefully. A string is not always represented the same way in memory in different languages, language versions or platforms.
ASKER
I'll give the cheese to AzraSDound; your cheese is posted as a new question.