BigRat

String to Wide String Problems

I'm not a VB programmer. Somebody has written me a bit of code that returns a VARIANT from a C string as follows :-

   ptr = some_c_function_returning_pointer
and then
   VBFunctionName = CVar(CopyString(ptr))

Clearly this copies the bytes into the variant.

But I access this function from JavaScript via COM, and COM converts the VB variant into a BSTR. In doing so it performs an ANSI to Unicode conversion, which messes up all bytes whose hex values are between 80 and 9F. Worse, the transformation is Windows version dependent (i.e. whether the code page is 1250, 1252, etc.).
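To make the effect concrete, here is a minimal sketch of the difference between the code page conversion and the plain widening asked for below. It assumes code page 1252 and uses only a handful of its assignments; the names `CP1252_SAMPLE`, `convertedCodeUnit`, `widenedCodeUnit` and `compare` are made up for illustration and are not part of the COM object.

```javascript
// A few Windows-1252 assignments in the 0x80..0x9F range (not the full table).
var CP1252_SAMPLE = { 0x80: 0x20AC, 0x85: 0x2026, 0x92: 0x2019, 0x99: 0x2122 };

// Code unit produced by a code page 1252 conversion, for the bytes sampled above.
// Bytes outside the sample are returned unchanged, which is only accurate
// outside the 0x80..0x9F range.
function convertedCodeUnit(b) {
  return CP1252_SAMPLE.hasOwnProperty(b) ? CP1252_SAMPLE[b] : b;
}

// Code unit the asker wants: the byte value, zero-extended to 16 bits.
function widenedCodeUnit(b) {
  return b;
}

function hex(n) {
  return "0x" + n.toString(16).toUpperCase();
}

// One line of text per byte, showing both results side by side.
function compare(bytes) {
  var lines = [];
  for (var i = 0; i < bytes.length; i++) {
    var b = bytes[i];
    lines.push(hex(b) + ": converted " + hex(convertedCodeUnit(b)) +
               ", widened " + hex(widenedCodeUnit(b)));
  }
  return lines;
}

var report = compare([0x41, 0x80, 0x99]);
for (var k = 0; k < report.length; k++) {
  console.log(report[k]);
}
```

Under code page 1250 several of the same bytes are assigned to different characters, which is why the result changes with the system code page.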

1) Is there anything I can do to overcome this conversion, so that I get one byte per 16 bits in the JavaScript string WITHOUT altering the program?

and

2) How would you advise the VB programmer to change the function to overcome the problem BUT remain compatible with existing VB code (i.e. calling the function from within VB)?

50 grade A cheese for each question, more if there is a really good solution

thanks.
Richie_Simonetti

The best way to return a string from C++ that can be used from VB with no problems is the BSTR type.
If this is VB6, add an additional function call before returning the string:


VBFunction = StrConv(strDoubleByte, vbFromUnicode)


This will give you your single-byte character string, but it also means other VB programs now receive this single-byte character string. Perhaps add an optional parameter to the COM function call that you can set for your needs, leaving other callers unaffected, e.g.:


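Function VBCOMFunction([Other parameters], Optional ReturnSingleByte As Boolean = False)

   '... other code

   If ReturnSingleByte Then
      ' Caller asked for the single-byte form
      VBCOMFunction = StrConv(strVariable, vbFromUnicode)
   Else
      ' Default: existing callers still get the unchanged string
      VBCOMFunction = strVariable
   End If
End Function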


If this COM object already has binary compatibility set, and the COM developer doesn't want to break it by adding this new optional parameter, he can instead add a property to the object called ReturnSingleByte that defaults to False, and you can set it as needed (since adding a property won't break compatibility).
BigRat

ASKER

The actual function is declared as :-

Public Property Get RawData() As Variant
  Dim l As Long
  Dim s As Long

  l = ZOOM_record_get(hZOOM_record, "raw", s)
 
  RawData = CVar(CopyString(l))

End Property

and I suppose that CVar produces a Variant containing an 8-bit string.

I presume that the function "StrConv(strVariable, vbFromUnicode)" is meant to convert the 16-bit Unicode back into 8-bit characters. This transformation won't work, since the conversion from Windows-1252 to Unicode is NOT one-to-one. Characters (decimal) 129, 141, 143, 144 and 157 have no mapping!

The function CopyString looks like :-

Function CopyString(ptr As Long) As String
  Dim l As Long
  Dim ret As String
 
 
  If ptr = 0 Then
    ret = ""
  Else
    l = lstrlen(ptr)
    ret = String(l, vbNullChar)
   
    l = lstrcpy(ret, ptr)
  End If
 
  CopyString = ret
End Function


and I suspect that I will have to have an additional function RawDataW which calls a CopyStringW which gets the 8-bit bytes from the ptr variable into a wide character string WITHOUT any conversion.

Sounds sensible? If so how?
1. What about returning an array of bytes from the VB function?

or

2. Change the last line as follows:

VBFunctionName = CStr(CVar(CopyString(ptr)))

HTH
In JScript you can try the charCodeAt(index) method of the String object to retrieve the character code of each character.
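Building on the charCodeAt suggestion, the sketch below reads each code unit and maps it back to the original byte. It assumes the conversion used code page 1252; the function name `bytesFromCp1252String` is hypothetical, and a different code page (such as 1250) would need a different table. Bytes with no Windows-1252 assignment (the `null` entries) are assumed to arrive unchanged, which would need to be verified on the actual system.

```javascript
// Windows-1252 assignments for bytes 0x80..0x9F (null = no assignment).
var CP1252_HIGH = [
  0x20AC, null,   0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
  0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, null,   0x017D, null,
  null,   0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
  0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, null,   0x017E, 0x0178
];

// Recover the original bytes from a string produced by a cp1252 conversion.
function bytesFromCp1252String(s) {
  // Build the reverse lookup: Unicode code unit -> original byte.
  var reverse = {};
  for (var i = 0; i < CP1252_HIGH.length; i++) {
    if (CP1252_HIGH[i] !== null) {
      reverse[CP1252_HIGH[i]] = 0x80 + i;
    }
  }
  var out = [];
  for (var j = 0; j < s.length; j++) {
    var code = s.charCodeAt(j);
    if (reverse.hasOwnProperty(code)) {
      out.push(reverse[code]);   // remapped byte in 0x80..0x9F
    } else if (code <= 0xFF) {
      out.push(code);            // byte that passed through unchanged
    } else {
      throw new Error("Code unit not produced by a cp1252 conversion: " + code);
    }
  }
  return out;
}

// Euro sign, "A", e-acute, as a cp1252 conversion would deliver them.
console.log(bytesFromCp1252String("\u20ACA\u00E9"));
```

This only works around the conversion on the JavaScript side; it does not remove the dependency on the code page.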
BigRat

ASKER

zlatev:

1. How? I'm NOT a VB programmer. And what is the mapping in JScript (in IE)?

2. Does this widen the string WITHOUT doing any character conversions?

PS: I'm going away for three weeks holiday. Back on 23 September.

1. Have no spare time to code it for you now.
2. Yes.

CU l8r
ASKER CERTIFIED SOLUTION
AzraSound

This solution is only available to members.
An example of calling the WideCharToMultiByte API from VB:
http://www.vbcity.com/forums/topic.asp?tid=1881
@BigRat, do you still need help with this?
If so, can you provide a working dummy code example (I mean something I can compile and debug to check what happens)?

In my opinion it could also be a Unicode encoding problem.
Note that there are at least 4 different Unicode encodings:
UCS-2, UCS-4, UTF-8, UTF-16
Some of them, like UCS-2 and UCS-4, have big-endian and little-endian variations.
See: http://czyborra.com/utf/ for more info.
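As a small illustration of how these encodings differ, the sketch below produces the byte sequences for a single character, the euro sign U+20AC. The helper names are made up for this example, and the functions only handle code points in the Basic Multilingual Plane (up to 0xFFFF, excluding surrogates), where UCS-2 and UTF-16 give the same bytes.

```javascript
// UTF-8 bytes for a code point up to 0xFFFF.
function utf8Bytes(cp) {
  if (cp < 0x80) return [cp];
  if (cp < 0x800) return [0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)];
  return [0xE0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3F), 0x80 | (cp & 0x3F)];
}

// UCS-2 / UTF-16 bytes for a code point up to 0xFFFF, in either byte order.
function utf16Bytes(cp, bigEndian) {
  var hi = (cp >> 8) & 0xFF;
  var lo = cp & 0xFF;
  return bigEndian ? [hi, lo] : [lo, hi];
}

// UCS-4 bytes (four bytes per character), in either byte order.
function ucs4Bytes(cp, bigEndian) {
  var be = [(cp >>> 24) & 0xFF, (cp >>> 16) & 0xFF, (cp >>> 8) & 0xFF, cp & 0xFF];
  return bigEndian ? be : be.reverse();
}

// Format a byte array as space-separated two-digit hex.
function hexList(bytes) {
  var parts = [];
  for (var i = 0; i < bytes.length; i++) {
    var h = bytes[i].toString(16).toUpperCase();
    parts.push(h.length < 2 ? "0" + h : h);
  }
  return parts.join(" ");
}

var euro = 0x20AC;
console.log("UTF-8:     " + hexList(utf8Bytes(euro)));
console.log("UTF-16 LE: " + hexList(utf16Bytes(euro, false)));
console.log("UTF-16 BE: " + hexList(utf16Bytes(euro, true)));
console.log("UCS-4 BE:  " + hexList(ucs4Bytes(euro, true)));
```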

If the problem is a lack of support for a particular Unicode encoding, my suggestion would be to convert to another Unicode format.

Kind Regards,
Zlatin Zlatev
BigRat

ASKER

I'd completely forgotten to follow this one up, since none of the answers directly helped me.

The situation has, however, changed. The object in question is the Visual Basic (ActiveX) implementation of the Zoom standard (www.zoom.org). A similar problem occurs in Java when the []byte returned from a Z39.50 server gets converted to the String type. Importantly, the data is BYTE and not CHAR, so any conversion into a CHAR-type object must be done without recourse to any translation table, by simply widening (extending to 16 bits with zeros).
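The widening described here can be sketched as follows: each byte becomes one 16-bit code unit with the high byte zero, and no translation table is involved. The function names `widenBytes` and `narrowString` are hypothetical and are not part of the Zoom API.

```javascript
// Widen: each byte becomes one 16-bit code unit, zero-extended.
function widenBytes(bytes) {
  var chars = [];
  for (var i = 0; i < bytes.length; i++) {
    chars.push(String.fromCharCode(bytes[i] & 0xFF));
  }
  return chars.join("");
}

// Narrow: the inverse, valid only for strings produced by widenBytes.
function narrowString(s) {
  var bytes = [];
  for (var i = 0; i < s.length; i++) {
    var code = s.charCodeAt(i);
    if (code > 0xFF) {
      throw new Error("Code unit above 0xFF at index " + i);
    }
    bytes.push(code);
  }
  return bytes;
}

// Bytes in the troublesome 0x80..0x9F range survive unchanged.
var widened = widenBytes([0x41, 0x80, 0x9F, 0xFF]);
console.log(narrowString(widened));
```

Because the mapping is the identity on values 0 to 255, every byte value round-trips, including the five that have no Windows-1252 assignment.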

The Zoom standard has been changed to give a []byte interface, and this I now use without any problems.

That was one of those silly problems where the standard was incomplete. You know that you're getting a certain type of data back, but you've no idea how it is coded. HTML had this problem at the beginning. Now you know that the document DOM is Unicode, the text/* mime-type implies iso-8859-1 unless qualified by a charset= attribute. In spite of this you still see problems (as Roman Czyborra says!) on the Web.

Now my problem is who to give the points to, in particular since I stipulated :-

2. Does this widen the string WITHOUT doing any character conversions?


<recommendation>
?????
</recommendation>
@BigRat, there is no problem from my side if you decide to close this as PAQ with refund.
As for "2. Does this widen the string WITHOUT doing any character conversions?", I cannot be sure until I have tested it. In my view it may or may not do character conversions; what we actually have in memory should be checked carefully. A string is not always represented the same way in memory across different languages, language versions and platforms.
BigRat

ASKER

I'll give the cheese to AzraSound; your cheese is posted as a new question.