Solved

Strange behavior with WideString conversion

Posted on 2002-04-15
21
558 Views
Last Modified: 2010-04-04
procedure TForm1.TisButton1Click(Sender: TObject);
  procedure test(const Param1: WideString);
  var
    s,d:string;
    i,len:integer;
  begin
    s:=param1;
    len:=length(s);
    d:=Inttostr(len)+': ';
    for i:=1 to len do
      d:=d+' '+inttostr(ord(s[i]));
    ShowMessage(d);
  end;
var
  s:string;
begin
  s:=#253#253#253#02#25;
  test(s);
end;

After call test, the result is #253#253#63#25, Why?

Thanks.
0
Comment
Question by:HBZhang
  • 11
  • 5
  • 3
  • +2
21 Comments
 
LVL 1

Expert Comment

by:MBo
ID: 6943694
I've tried.
Result- 5:253 253 253 2 25
It looks like localization problem?
Is your Windows English? (My is Russian)
0
 

Author Comment

by:HBZhang
ID: 6943710
No, my is Chinese.
0
 
LVL 1

Expert Comment

by:Alone
ID: 6943941
Hi!

Your error is in this line: s:=param1;

Because Param1 is WideString and S is string that line is equivalent of WideCharToString (or similar) function call. This conversion depends on your system locale settings and behavior differs between Russian and Chinese locales :-))

Try this:

procedure TMainForm.Button1Click(Sender: TObject);

procedure Test(const Param: WideString);
var
  S: string;
  I, Len: Integer;
  P: Pointer;
begin
  Len := Length(Param) * SizeOf(Param[1]); // size in bytes
  S := IntToStr(Len) + ',';
  for I := 0 to Len - 1 do
    S := S + '#' + IntToStr(Byte(AnsiString(Pointer(Param))[I]));
  ShowMessage(S);
end;

begin
  Test(#204#224#236#224);
// shows 8,#0#28#4#48#4#60#4#48  
end;

My system is Russian too :-))))


0
Complete VMware vSphere® ESX(i) & Hyper-V Backup

Capture your entire system, including the host, with patented disk imaging integrated with VMware VADP / Microsoft VSS and RCT. RTOs is as low as 15 seconds with Acronis Active Restore™. You can enjoy unlimited P2V/V2V migrations from any source (even from a different hypervisor)

 

Author Comment

by:HBZhang
ID: 6943996
I'm call a COM object function with a string as parameter. Because COM only support WideString, then problem occurs.

Is there a way to exchange data between String and WideString safely?

Thanks.
0
 
LVL 1

Expert Comment

by:Alone
ID: 6944037
For Chinese? For what purpose?

var
  S1, S2: string;
  W: WideString;
begin
  S1 := 'blabla';  // ANSI 'blabla'
  W := S1;         // ANSI #0'b'#0'l'#0'a'#0'b'#0'l'#0'a'
  S2 := W;         // ANSI 'blabla'
end;

Don't cofuse with ANSI (may be multibyte) and Unicode characters. They are MUST BE different! My previous example shows how Unicode string with Russian characters looks at low-level (bytes chain).

For european languages (and Russian)  ANSI strings are always single-byte. When we assigning them to WideString, they're expanding to double-byte Unicode characters, for my example, 4 bytes (4 single-byte characters) to 8 bytes (4 double-byte characters).

How many Chinese characters are in your sample: #253#253#253#02#25?

Please try current example (blablabla). If strings are not corrupted your system still is working ok. :-))
0
 
LVL 1

Expert Comment

by:Alone
ID: 6944048
For Chinese? For what purpose?

var
  S1, S2: string;
  W: WideString;
begin
  S1 := 'blabla';  // ANSI 'blabla'
  W := S1;         // ANSI #0'b'#0'l'#0'a'#0'b'#0'l'#0'a'
  S2 := W;         // ANSI 'blabla'
end;

Don't cofuse with ANSI (may be multibyte) and Unicode characters. They are MUST BE different! My previous example shows how Unicode string with Russian characters looks at low-level (bytes chain).

For european languages (and Russian)  ANSI strings are always single-byte. When we assigning them to WideString, they're expanding to double-byte Unicode characters, for my example, 4 bytes (4 single-byte characters) to 8 bytes (4 double-byte characters).

How many Chinese characters are in your sample: #253#253#253#02#25?

Please try current example (blablabla). If strings are not corrupted your system still is working ok. :-))
0
 
LVL 1

Expert Comment

by:Alone
ID: 6944050
Sorry my message sent twice :-((
0
 

Author Comment

by:HBZhang
ID: 6944068
Create a variant array with varByte type can avoid this problem, but it's no simple.

I wonder to know is there something can control this conversion? Anyway, i think it as strange behavior.

  WideString := String;
  String := WideString;

Changed? Why?
0
 
LVL 1

Expert Comment

by:Alone
ID: 6944089
For Russian:

 WideString := String;
 String := WideString;

works fine.

But for Azeri no:

W := 'az'#609'ri';
S := W; // looks as 'az?ri';
W := 'S' // looks as 'az?ri'; but in Unicode :-((

This behavior depends on ANSI (single-byte in my case) character set restriction. Some Unicode characters has no equivalent in single-byte and system replace them with '?'

Am I right?
0
 
LVL 1

Expert Comment

by:Alone
ID: 6944125
Yep! When you place strings direclty in your program source Delphi ALWAYS creates ANSI strings.

procedure TMainForm.Button2Click(Sender: TObject);

function ComFunction(const Param: OleVariant): Integer;
begin
// Works with NT/2k/XP only ;-)
  Result := MessageBoxW(Handle, Pointer(WideString(Param)), '', MB_ICONINFORMATION);
end;

var
  S: string;
  W: WideString;
  C: WideChar;

begin
  ComFunction('Direct: az'#$018F'ri'); // Delphi creates an ANSI (?) string 'az'#$8F'ri'
// may be Unicode string BUT depends on system locale (my locale is Russian but string is in Azeri (Azerbaijani))
  C := #$018F;
  ComFunction('WideChar: Az'+C+'ri'); // works fine
end;
0
 

Author Comment

by:HBZhang
ID: 6944165
Thanks. But it's not good enough. I'am waiting...

BTW: In my really work, I donnot "place strings direclty in your program source". It comes from rs232 port.
0
 
LVL 1

Expert Comment

by:Alone
ID: 6944181
Here is your sample like:

var
  W: WideString;

begin
  W := 'az'#$018F'ri'; // Delphi creates #$0041#$007A + #$040F + #0072#0069
// instead of #$0041#$007A + #$018F+ #0072#0069
// third characted replaced with Russian (using system locale)
end;
0
 
LVL 1

Accepted Solution

by:
Alone earned 300 total points
ID: 6944189
If string "comes from RS232" try to receive it into WideString variable and NEVER covert it into single-byte. Always use WideString.
0
 

Author Comment

by:HBZhang
ID: 6944205
Yeah, always use WideString may be a good idea.
0
 
LVL 12

Expert Comment

by:Lee_Nover
ID: 6944714
why not simply use StringToOLEStr :)
0
 
LVL 17

Expert Comment

by:geobul
ID: 6944757
Hi,

procedure test(const Param1: WideString);
var
  s: string;
  wc: PWideChar;
begin
  wc := PWideChar(Param1);
  s := WideCharToString(wc);
  s := IntToStr(Length(s)) + ': ' + s;
  ShowMessage(s);
end;

procedure TForm1.Button1Click(Sender: TObject);
var
  ws: WideString;
begin
  ws := #253#253#253#02#25;
  test(ws);
end;

Regards, Geo
0
 
LVL 17

Expert Comment

by:geobul
ID: 6944820
Or:

procedure TForm1.TisButton1Click(Sender: TObject);
  procedure test(const Param1: WideString);
  var
    s,d: string;
    wc: PWideChar;
    len,i: integer;
  begin
    wc := PWideChar(Param1);
    s := WideCharToString(wc);
    len := length(s);
    d := IntToStr(len) + ': ';
    for i := 1 to len do
      d := d + '#' + inttostr(ord(s[i]));
    ShowMessage(d);
  end;
var
 s:string;
begin
 s:=#253#253#253#02#25;
 test(s);
end;
0
 
LVL 1

Expert Comment

by:Alone
ID: 6944895
All depends on receiving strings original format: ANSI or Unicode. When they're ANSI, may possible to use AnsiString and StringToOleStr or direct StrOleVariant assignment.
But when string is Unicode, converting to ANSI representation may corrupt the data, replacing some characters with '?' or other. Using WideString representation is more flexible because it locale-independent.

2geobul: Have you tested your examples? What result they produce? And what your default system locale?
0
 
LVL 17

Expert Comment

by:geobul
ID: 6944996
Well, what is supposed to be produced? The second one shows:
5: #253#253#253#2#25

English(US)

Regards, Geo
0
 
LVL 1

Expert Comment

by:Alone
ID: 6945015
All depends on receiving strings original format: ANSI or Unicode. When they're ANSI, may possible to use AnsiString and StringToOleStr or direct StrOleVariant assignment.
But when string is Unicode, converting to ANSI representation may corrupt the data, replacing some characters with '?' or other. Using WideString representation is more flexible because it locale-independent.

2geobul: Have you tested your examples? What result they produce? And what your default system locale?
0
 
LVL 1

Expert Comment

by:Alone
ID: 6945055
Damn! My browser automatically resends messages!

When you're using Unicode on locale has FULL ANSI representation - on problem. But when your locale hasn't single-byte equivalent, you'll receive some question marks instead of characters and the data will corrupt.

In my expamples: Russian locale has full single-byte ANSI representation and all work ok. But Azerbaijani is Unicode-only and now we'we a big headache with one (!) letter :-(((
0

Featured Post

ScreenConnect 6.0 Free Trial

Explore all the enhancements in one game-changing release, ScreenConnect 6.0, based on partner feedback. New features include a redesigned UI, app configurations and chat acknowledgement to improve customer engagement!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

This article explains how to create forms/units independent of other forms/units object names in a delphi project. Have you ever created a form for user input in a Delphi project and then had the need to have that same form in a other Delphi proj…
Introduction Raise your hands if you were as upset with FireMonkey as I was when I discovered that there was no TListview.  I use TListView in almost all of my applications I've written, and I was not going to compromise by resorting to TStringGrid…
This Micro Tutorial hows how you can integrate  Mac OSX to a Windows Active Directory Domain. Apple has made it easy to allow users to bind their macs to a windows domain with relative ease. The following video show how to bind OSX Mavericks to …
This Micro Tutorial demonstrates using Microsoft Excel pivot tables, how to reverse engineer competitors' marketing strategies through backlinks.

803 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question