How do I distinguish between ASCII and Unicode bytes, and combine the bytes into a Unicode string?

Hi all!

I receive some bytes from a device. Some are ASCII and some are Unicode. How do I tell the ASCII bytes from the Unicode ones, and how do I combine the bytes into a Unicode string? I am not familiar with Unicode. I hope someone can tell me how to do it. Thanks all!

Regards,
abdate

andrewjb Commented:
You can't tell. You can guess, perhaps, by assuming they are ASCII and checking whether a sensible-looking string comes back, but neither ASCII nor Unicode data carries any additional information to indicate what it is. They are just raw blocks of binary.

Perhaps you need to give a little more info on what you're doing?
abdate (Author) Commented:
Thanks andrewjb. I get a stream of bytes from the devices. ASCII and Unicode may be combined together like this:
48 65 6C 6C 6F C1 C2 C1 C2
I am trying to show this message as the string 'Hello謝謝'.
My question is: how do I translate these bytes into a string?
andrewjb Commented:
OK. Two problems:

1) How do you know what's ASCII and what's Unicode? Why couldn't the C1 and C2 be the single-byte (extended ASCII) characters "A acute" (Á) and "A circumflex" (Â)? (That is how they would show if you're using Arial with the Western character set.)

2) Even if you do know somehow, the VCL components don't support unicode. Full stop. You need another component set if you want to be able to display them. Have a google. Perhaps http://tnt.ccci.org/delphi_unicode_controls/ , or there's another site which I can't find offhand...

abdate (Author) Commented:
Thanks andrewjb!
I will download and study these Unicode controls later.
For now my only concern is how to move the buffer bytes into a string correctly, without caring which bytes are ASCII and which are Unicode.
I tried the Move function like this:
Move(mBuffer[0], S1[1], 20);      // S1: string, but I got an exception

for i := 0 to 20 do
   S1 := S1 + Chr(tmpBuf[i]);  // the byte contents are not moved to S1 correctly, except for the ASCII codes

Any suggestions for moving the byte buffer into a string correctly?

abdate

 

 
andrewjb Commented:
Maybe call SetLength(S1, len) first,
then do the Move.
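
A minimal sketch of that suggestion, assuming a Delphi version where the buffer is an array of Byte and the target is a single-byte AnsiString. The helper name BytesToAnsiString is made up for illustration. It only copies the raw bytes; it does not decide which bytes are ASCII and which are not.

```delphi
program MoveDemo;
{$APPTYPE CONSOLE}

// Hypothetical helper: copy raw bytes into an AnsiString unchanged.
function BytesToAnsiString(const Buf: array of Byte): AnsiString;
begin
  // Allocate the string before copying. Writing through S[1] of a
  // string that has no length yet is a likely cause of the exception.
  SetLength(Result, Length(Buf));
  if Length(Buf) > 0 then
    Move(Buf[0], Result[1], Length(Buf));
end;

var
  S1: AnsiString;
begin
  S1 := BytesToAnsiString([$48, $65, $6C, $6C, $6F, $C1, $C2, $C1, $C2]);
  Writeln(Length(S1)); // 9: one AnsiChar per input byte
end.
```

The string now holds the same nine bytes as the buffer. How the last four bytes look on screen still depends on the character set used to display them.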
abdate (Author) Commented:
Thanks andrewjb!
SetLength and Move work, but the resulting S1 is not what I want.
I tried the following code. I think it is correct, if you have some function for the non-ASCII bytes.

for i := 0 to 20 do
    begin
     if tmpBuf[i] < 180 then      // if the tmpBuf byte is < 180, I assume it is an ASCII byte
       S1 := S1 + Chr(tmpBuf[i])
     else
       S1 := S1 + .....               // do you have some function for the non-ASCII bytes?
    end;

 
andrewjb Commented:
What's S1? If it's a 'string' then you CANNOT store Unicode in it. It just isn't supported. At all. There's a WideString type which supports (only) Unicode characters. Maybe you want that. Since all the ASCII characters can be represented in Unicode (with a leading byte of zero), they could be stored there. But none of the VCL components support WideString, so you couldn't display anything.
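
As a rough illustration of the "leading byte of zero" point, here is a sketch that widens 7-bit ASCII bytes into a WideString. The function name AsciiBytesToWide and the choice to substitute '?' for bytes of $80 and above are assumptions for this example only; such bytes need a real code page conversion.

```delphi
program WidenAscii;
{$APPTYPE CONSOLE}

// Hypothetical helper: widen 7-bit ASCII bytes to WideChars.
// Each ASCII byte becomes a 16-bit character with a zero high byte.
function AsciiBytesToWide(const Buf: array of Byte): WideString;
var
  i: Integer;
begin
  SetLength(Result, Length(Buf));
  for i := 0 to High(Buf) do
    if Buf[i] < $80 then
      Result[i + 1] := WideChar(Buf[i])
    else
      Result[i + 1] := '?'; // not ASCII: placeholder only (assumption)
end;

begin
  Writeln(AsciiBytesToWide([$48, $65, $6C, $6C, $6F])); // prints Hello
end.
```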
abdate (Author) Commented:
Thanks andrewjb!
Yes, S1 is a WideString. By the way, Japanese and Chinese characters are 2-byte codes and can be displayed in a Delphi string. I don't know how that is done, but I would appreciate it if you could help me finish the following code.

procedure TForm1.Button1Click(Sender: TObject);
var
  S1: WideString;
  tmpBuf: array of Byte;
  i: Integer;
begin
  SetLength(tmpBuf, 20);
  for i := 0 to High(tmpBuf) do
  begin
    if tmpBuf[i] < 180 then      // if the tmpBuf byte is < 180, I assume it is an ASCII byte
      S1 := S1 + Chr(tmpBuf[i])
    else
      S1 := S1 + .....           // do you have some function for the non-ASCII bytes?
  end;
end;

abdate
andrewjb Commented:
"Japanese & Chinese characters are 2 bytes code...."

They're not Unicode. They use a multi-byte character set (MBCS), which IS supported by the VCL components, I believe.


What are you trying to do? Just get the bytes into a WideString? In your original example of :
48 65 6C 6C 6F C1 C2 C1 C2

what should the end result be? A wide string containing the characters:

48 = H
65 = e
6C = l
6C = l
6F = o
C1 C2 = ???
C1 C2 = ???

If you just do
var s: AnsiString;
s := tmpBuf;

where tmpBuf is an array of Char (not Byte)

and then assign this to, say, a label caption, it works fine... IF the label is set to the right character set.
E.g. set it to CHINESEBIG5_CHARSET, and the above string appears as Hello followed by two Chinese characters.
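
If a WideString is wanted rather than a label with the right charset, one possible sketch uses the Win32 MultiByteToWideChar function with code page 950 (Traditional Chinese Big5). This assumes the device really sends Big5, which this thread only infers from CHINESEBIG5_CHARSET displaying the expected text, and that the code page is installed on the machine. Big5ToWide is a hypothetical helper name.

```delphi
program Big5Demo;
{$APPTYPE CONSOLE}

uses
  Windows;

const
  CP_BIG5 = 950; // Windows code page for Big5 (assumed encoding)
  Raw: array[0..8] of Byte =
    ($48, $65, $6C, $6C, $6F, $C1, $C2, $C1, $C2);

// Hypothetical helper: decode a Big5 byte string into a WideString.
function Big5ToWide(const S: AnsiString): WideString;
var
  Len: Integer;
begin
  Result := '';
  if S = '' then
    Exit;
  // First call: ask how many wide characters the result needs.
  Len := MultiByteToWideChar(CP_BIG5, 0, PAnsiChar(S), Length(S), nil, 0);
  if Len = 0 then
    Exit; // conversion failed, e.g. code page not available
  SetLength(Result, Len);
  // Second call: do the actual conversion into the allocated buffer.
  MultiByteToWideChar(CP_BIG5, 0, PAnsiChar(S), Length(S),
    PWideChar(Result), Len);
end;

var
  A: AnsiString;
  W: WideString;
begin
  // Copy the raw device bytes into a string without interpreting them.
  SetLength(A, Length(Raw));
  Move(Raw[0], A[1], Length(Raw));
  W := Big5ToWide(A);
  Writeln(Length(W)); // number of wide characters after decoding
end.
```

ASCII bytes pass through unchanged in this conversion, so there is no need to test each byte by hand. Showing W on screen still needs a control that can display wide strings.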
abdate (Author) Commented:
Thanks andrewjb!

I appreciate your help!