Solved

Strange behavior with WideString conversion

Posted on 2002-04-15
Last Modified: 2010-04-04
procedure TForm1.TisButton1Click(Sender: TObject);
  procedure test(const Param1: WideString);
  var
    s,d:string;
    i,len:integer;
  begin
    s:=param1;
    len:=length(s);
    d:=Inttostr(len)+': ';
    for i:=1 to len do
      d:=d+' '+inttostr(ord(s[i]));
    ShowMessage(d);
  end;
var
  s:string;
begin
  s:=#253#253#253#02#25;
  test(s);
end;

After calling test, the result is #253#253#63#25. Why?

Thanks.
Question by:HBZhang
21 Comments
 
LVL 1

Expert Comment

by:MBo
ID: 6943694
I tried it.
Result: 5:253 253 253 2 25
Could this be a localization problem?
Is your Windows English? (Mine is Russian.)
 

Author Comment

by:HBZhang
ID: 6943710
No, mine is Chinese.
 
LVL 1

Expert Comment

by:Alone
ID: 6943941
Hi!

Your error is in this line: s:=param1;

Because Param1 is a WideString and s is a string, that line is equivalent to a WideCharToString (or similar) function call. This conversion depends on your system locale settings, and its behavior differs between the Russian and Chinese locales :-))

Try this:

procedure TMainForm.Button1Click(Sender: TObject);

  // Dumps the raw bytes of a WideString (UTF-16, low byte first).
  procedure Test(const Param: WideString);
  var
    S: string;
    I, Len: Integer;
  begin
    Len := Length(Param) * SizeOf(WideChar); // size in bytes
    S := IntToStr(Len) + ',';
    // PAnsiChar indexing is 0-based, so I = 0 is the first byte of the data
    for I := 0 to Len - 1 do
      S := S + '#' + IntToStr(Byte(PAnsiChar(Pointer(Param))[I]));
    ShowMessage(S);
  end;

begin
  Test(#204#224#236#224);
  // on a Russian (CP1251) system shows 8,#28#4#48#4#60#4#48#4
end;

My system is Russian too :-))))
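
A minimal sketch, not from the thread, that makes the locale dependence visible on a single Windows machine: it pushes the asker's five bytes through an ANSI to UTF-16 to ANSI round trip using an explicit code page instead of the system default. `RoundTrip` and `Sample` are hypothetical names, and 1251 and 936 are assumed here to stand in for the Russian and Simplified Chinese ANSI code pages; both must be installed for the calls to succeed.

```delphi
program CodePageRoundTrip;
{$APPTYPE CONSOLE}

uses
  Windows;

const
  // The asker's test bytes: #253#253#253#02#25
  Sample: array[1..5] of Byte = (253, 253, 253, 2, 25);

// Hypothetical helper: converts Sample to UTF-16 and back through one
// explicit code page, then prints the resulting byte values.
procedure RoundTrip(CodePage: UINT);
var
  Src, Dst: AnsiString;
  Wide: WideString;
  I, N: Integer;
begin
  SetLength(Src, Length(Sample));
  for I := 1 to Length(Sample) do
    Src[I] := AnsiChar(Sample[I]);

  // ANSI -> UTF-16: first ask for the required length, then convert.
  N := MultiByteToWideChar(CodePage, 0, PAnsiChar(Src), Length(Src), nil, 0);
  if N = 0 then
  begin
    Writeln('CP', CodePage, ': conversion not available');
    Exit;
  end;
  SetLength(Wide, N);
  MultiByteToWideChar(CodePage, 0, PAnsiChar(Src), Length(Src),
    PWideChar(Wide), N);

  // UTF-16 -> ANSI, same two-step pattern.
  N := WideCharToMultiByte(CodePage, 0, PWideChar(Wide), Length(Wide),
    nil, 0, nil, nil);
  SetLength(Dst, N);
  if N > 0 then
    WideCharToMultiByte(CodePage, 0, PWideChar(Wide), Length(Wide),
      PAnsiChar(Dst), N, nil, nil);

  Write('CP', CodePage, ' -> ', Length(Dst), ':');
  for I := 1 to Length(Dst) do
    Write(' ', Ord(Dst[I]));
  Writeln;
end;

begin
  RoundTrip(1251);  // single-byte code page (assumed: Russian)
  RoundTrip(936);   // double-byte code page (assumed: Simplified Chinese)
end.
```

Going by the results reported in this thread, the 1251 run should return the five bytes unchanged while the 936 run should not, but the exact bytes depend on the conversion tables installed on the machine.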


 

Author Comment

by:HBZhang
ID: 6943996
I'm calling a COM object function with a string as a parameter. Because COM only supports WideString, the problem occurs.

Is there a way to exchange data between String and WideString safely?

Thanks.
 
LVL 1

Expert Comment

by:Alone
ID: 6944037
For Chinese? For what purpose?

var
  S1, S2: string;
  W: WideString;
begin
  S1 := 'blabla';  // ANSI 'blabla'
  W := S1;         // Unicode, as bytes: 'b'#0'l'#0'a'#0'b'#0'l'#0'a'#0
  S2 := W;         // ANSI 'blabla'
end;

Don't confuse ANSI characters (which may be multibyte) with Unicode characters. They are necessarily different! My previous example shows how a Unicode string with Russian characters looks at the low level (as a chain of bytes).

For European languages (and Russian), ANSI strings are always single-byte. When we assign them to a WideString, they are expanded to double-byte Unicode characters; in my example, 4 bytes (4 single-byte characters) become 8 bytes (4 double-byte characters).

How many Chinese characters are in your sample: #253#253#253#02#25?

Please try the current example (blabla). If the strings are not corrupted, your system is still working OK. :-))
 

Author Comment

by:HBZhang
ID: 6944068
Creating a variant array of type varByte can avoid this problem, but it's not simple.

I wonder whether there is something that can control this conversion. Anyway, I consider it strange behavior.

  WideString := String;
  String := WideString;

Changed? Why?
 
LVL 1

Expert Comment

by:Alone
ID: 6944089
For Russian:

 WideString := String;
 String := WideString;

works fine.

But not for Azeri:

W := 'az'#609'ri';
S := W; // looks like 'az?ri'
W := S; // looks like 'az?ri', but in Unicode :-((

This behavior comes from the limits of the ANSI character set (single-byte in my case). Some Unicode characters have no single-byte equivalent, and the system replaces them with '?'.

Am I right?
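
One way to check for this kind of substitution before it happens is to ask the Windows API whether a default character was used during the conversion. The following is a minimal sketch under stated assumptions, not something posted in the thread: `FitsInAnsi` and `NoBestFit` are hypothetical names, the flag value is the Windows SDK's `WC_NO_BEST_FIT_CHARS` declared locally in case the Windows unit in use lacks it, and the system ANSI code page is assumed to be a classic single-byte or double-byte one (the call is expected to fail, and the function to return False, if the flag is unsupported or the ANSI code page is UTF-8).

```delphi
program LossyCheckDemo;
{$APPTYPE CONSOLE}

uses
  Windows;

const
  // Assumed value of WC_NO_BEST_FIT_CHARS from the Windows SDK headers.
  NoBestFit = $00000400;

// Hypothetical helper: True when every character of W has an exact
// equivalent in the system ANSI code page (CP_ACP).
function FitsInAnsi(const W: WideString): Boolean;
var
  Buf: AnsiString;
  UsedDefault: BOOL;
  N: Integer;
begin
  Result := True;
  if W = '' then
    Exit;
  // Assumes at most two ANSI bytes per UTF-16 code unit.
  SetLength(Buf, Length(W) * 2);
  UsedDefault := False;
  N := WideCharToMultiByte(CP_ACP, NoBestFit, PWideChar(W), Length(W),
    PAnsiChar(Buf), Length(Buf), nil, @UsedDefault);
  // N = 0 means the call itself failed; treat that as "does not fit".
  Result := (N > 0) and not UsedDefault;
end;

var
  W: WideString;
begin
  W := 'az';
  W := W + WideChar($018F);  // U+018F, the Azeri letter discussed here
  W := W + 'ri';
  Writeln('blabla fits: ', FitsInAnsi('blabla'));
  Writeln('Azeri sample fits: ', FitsInAnsi(W));
end.
```

The answer for the Azeri sample depends on the machine's ANSI code page, which is the point: the same WideString can be safe to narrow on one system and lossy on another.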
 
LVL 1

Expert Comment

by:Alone
ID: 6944125
Yep! When you place strings directly in your program source, Delphi ALWAYS creates ANSI strings.

procedure TMainForm.Button2Click(Sender: TObject);

function ComFunction(const Param: OleVariant): Integer;
begin
// Works with NT/2k/XP only ;-)
  Result := MessageBoxW(Handle, Pointer(WideString(Param)), '', MB_ICONINFORMATION);
end;

var
  S: string;
  W: WideString;
  C: WideChar;

begin
  ComFunction('Direct: az'#$018F'ri'); // Delphi creates an ANSI (?) string 'az'#$8F'ri'
// possibly a Unicode string, BUT that depends on the system locale (my locale is Russian, but the string is in Azeri (Azerbaijani))
  C := #$018F;
  ComFunction('WideChar: Az'+C+'ri'); // works fine
end;
 

Author Comment

by:HBZhang
ID: 6944165
Thanks, but that's not good enough. I'm waiting...

BTW: In my real work, I do not "place strings directly in your program source". The data comes from the RS-232 port.
 
LVL 1

Expert Comment

by:Alone
ID: 6944181
Here is something like your sample:

var
  W: WideString;

begin
  W := 'az'#$018F'ri'; // Delphi creates #$0061#$007A + #$040F + #$0072#$0069
// instead of #$0061#$007A + #$018F + #$0072#$0069
// the third character is replaced with a Russian one (using the system locale)
end;
 
LVL 1

Accepted Solution

by:
Alone earned 600 total points
ID: 6944189
If the string "comes from RS232", try to receive it into a WideString variable and NEVER convert it to single-byte. Always use WideString.
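
If the RS-232 data is really a stream of raw bytes rather than text, one way to follow this advice is to widen each byte to its own WideChar so that no code-page conversion ever runs. A minimal sketch, assuming that is the case: `BytesToWide` and `WideToBytes` are hypothetical helpers, not RTL functions, and the COM object on the other side must be prepared to treat each character as one byte.

```delphi
program RawBytesDemo;
{$APPTYPE CONSOLE}

const
  // The asker's test bytes: #253#253#253#02#25
  Sample: array[1..5] of Byte = (253, 253, 253, 2, 25);

// Hypothetical helper: each byte becomes one WideChar in U+0000..U+00FF.
function BytesToWide(const S: AnsiString): WideString;
var
  I: Integer;
begin
  SetLength(Result, Length(S));
  for I := 1 to Length(S) do
    Result[I] := WideChar(Ord(S[I]));
end;

// Hypothetical helper: the inverse, keeping only the low byte of each WideChar.
function WideToBytes(const W: WideString): AnsiString;
var
  I: Integer;
begin
  SetLength(Result, Length(W));
  for I := 1 to Length(W) do
    Result[I] := AnsiChar(Ord(W[I]) and $FF);
end;

var
  Raw, Back: AnsiString;
  I: Integer;
begin
  // Build the input byte by byte so no string-literal conversion is involved.
  SetLength(Raw, Length(Sample));
  for I := 1 to Length(Sample) do
    Raw[I] := AnsiChar(Sample[I]);

  Back := WideToBytes(BytesToWide(Raw));

  Write(Length(Back), ':');
  for I := 1 to Length(Back) do
    Write(' ', Ord(Back[I]));
  Writeln;  // prints 5: 253 253 253 2 25
end.
```

Because every byte maps to a character below U+0100 and back by plain arithmetic, the round trip does not depend on the locale. One caveat under the same assumptions: a receiver that treats the WideString as null-terminated may stop at the first #0 byte in the data.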
 

Author Comment

by:HBZhang
ID: 6944205
Yeah, always using WideString may be a good idea.
 
LVL 12

Expert Comment

by:Lee_Nover
ID: 6944714
Why not simply use StringToOleStr? :)
 
LVL 17

Expert Comment

by:geobul
ID: 6944757
Hi,

procedure test(const Param1: WideString);
var
  s: string;
  wc: PWideChar;
begin
  wc := PWideChar(Param1);
  s := WideCharToString(wc);
  s := IntToStr(Length(s)) + ': ' + s;
  ShowMessage(s);
end;

procedure TForm1.Button1Click(Sender: TObject);
var
  ws: WideString;
begin
  ws := #253#253#253#02#25;
  test(ws);
end;

Regards, Geo
 
LVL 17

Expert Comment

by:geobul
ID: 6944820
Or:

procedure TForm1.TisButton1Click(Sender: TObject);
  procedure test(const Param1: WideString);
  var
    s,d: string;
    wc: PWideChar;
    len,i: integer;
  begin
    wc := PWideChar(Param1);
    s := WideCharToString(wc);
    len := length(s);
    d := IntToStr(len) + ': ';
    for i := 1 to len do
      d := d + '#' + inttostr(ord(s[i]));
    ShowMessage(d);
  end;
var
 s:string;
begin
 s:=#253#253#253#02#25;
 test(s);
end;
 
LVL 1

Expert Comment

by:Alone
ID: 6944895
It all depends on the original format of the received strings: ANSI or Unicode. When they are ANSI, it may be possible to use AnsiString and StringToOleStr, or a direct string-to-OleVariant assignment.
But when the string is Unicode, converting it to an ANSI representation may corrupt the data, replacing some characters with '?' or something else. Using the WideString representation is more flexible because it is locale-independent.

2geobul: Have you tested your examples? What results do they produce? And what is your default system locale?
 
LVL 17

Expert Comment

by:geobul
ID: 6944996
Well, what is supposed to be produced? The second one shows:
5: #253#253#253#2#25

English(US)

Regards, Geo
 
LVL 1

Expert Comment

by:Alone
ID: 6945055
When you use Unicode on a locale that has a FULL ANSI representation, there is no problem. But when your locale has no single-byte equivalent, you'll receive question marks instead of some characters and the data will be corrupted.

In my examples: the Russian locale has a full single-byte ANSI representation and everything works OK. But Azerbaijani is Unicode-only, and now we have a big headache with one (!) letter :-(((