wipnav
asked on
How to display UTF-8 characters from database
Hi,
I am using a TSQLQuery component to query a UTF-8 encoded (Kanji characters) Sybase database. I am using the VISOCO Sybase dbExpress driver.
I assume that I need to do some conversion before I can call TextOut to display the string of Kanji characters. My question is, how do I do it?
Regards,
Bill
I am using a TSQLQuery component to query a UTF-8 encoded (Kanji characters) Sybase database. I am using the VISOCO Sybase dbExpress driver.
I assume that I need to do some conversion before I can call TextOut to display the string of Kanji characters. My question is, how do I do it?
Regards,
Bill
you can call TextOutW(PWideChar(UTF8Dec ode(UTF8St ringFromDB ))...)
ASKER
Lee,
The database field is of type varchar(255). How do I specify it other than with .AsString? That didn't work when I tried:
TextOutW(Canvas.Handle, 10, 10, PWideChar(UTF8Decode(QSQL1 col_name.A sString)), Length(UTF8Decode(QSQL1col _name.AsSt ring)));
All I got was a bunch of '?' characters.
Bill
The database field is of type varchar(255). How do I specify it other than with .AsString? That didn't work when I tried:
TextOutW(Canvas.Handle, 10, 10, PWideChar(UTF8Decode(QSQL1
All I got was a bunch of '?' characters.
Bill
if your strings are UTF8 encoded then it should work ... you might have other issues
can you copy/paste a n UTF8 string from your db so I can try :)
can you copy/paste a n UTF8 string from your db so I can try :)
ASKER
How do you suggest I get the string from the database so that I can copy it for you?
using some dbadmin tool that displays the string as ANSI .. or in delphi from a dbaware control :)
ASKER
This is what I get in a dbaware control: ?????????????????
then it's not UTF-8
example UTF-8 string: 'test čšĹ& #318;ÄR 16;ć ;'
example UTF-8 string: 'test čšĹ&
ASKER
Well, I now know what the problem is, and I have had some success.
The data is UTF-8 encoded, and if I add a conversion to my SQL like:
convert(varbinary(255), a_var_char_column)
I can get the UTF-8 characters with:
MyUTF8String := SQLDataSet.Fields[AField]. AsString;
The "AsString" is taking the binary data and putting it into my string without any conversions.
From that point on it's easy to decode them and display them.
Now, my problem is, how to get UTF-8 encoded characters from a "text" field? Sybase won't allow me to convert text to any sort of binary datatype, so I can't use the same trick.
I've tried to get binary data from a text field, but with no success. I think if I can just get the raw, unconverted bytes, I will be okay.
The data is UTF-8 encoded, and if I add a conversion to my SQL like:
convert(varbinary(255), a_var_char_column)
I can get the UTF-8 characters with:
MyUTF8String := SQLDataSet.Fields[AField].
The "AsString" is taking the binary data and putting it into my string without any conversions.
From that point on it's easy to decode them and display them.
Now, my problem is, how to get UTF-8 encoded characters from a "text" field? Sybase won't allow me to convert text to any sort of binary datatype, so I can't use the same trick.
I've tried to get binary data from a text field, but with no success. I think if I can just get the raw, unconverted bytes, I will be okay.
OK, I might be a late comer and might not understand what you said by "I've tried to get binary data from a text field", does that mean you have tried to get it as BLOB? If you have not, well, it is worth a try.
ASKER
This is what I tried:
procedure TForm1.Button1Click(Sender : TObject);
var
T: TBlobField;
MS: TMemoryStream;
UTF8Str: UTF8String;
UCS2Str: WideString;
begin
T := TBlobField.Create(SQLQuery 1);
T.FieldName := 'my_text_field';
T.Name := SQLQuery1.Name + T.FieldName;
T.Index := SQLQuery1.FieldCount;
T.DataSet := SQLQuery1;
SQLQuery1.FieldDefs.UpDate ;
SQLQuery1.Open;
MS := TMemoryStream.Create;
try
T.SaveToStream(MS);
MS.Seek(0, 0);
SetLength(UTF8Str, MS.Size);
Move(MS.Memory^, UTF8Str[1], MS.Size);
UCS2Str := UTF8Decode(UTF8Str);
TextOutW(Form1.Canvas.Hand le, 10, 10, PWideChar(UCS2Str), Length(UCS2Str));
finally
MS.Free;
end;
end;
I don't get the correct international characters, mainly lots of '?'.
procedure TForm1.Button1Click(Sender
var
T: TBlobField;
MS: TMemoryStream;
UTF8Str: UTF8String;
UCS2Str: WideString;
begin
T := TBlobField.Create(SQLQuery
T.FieldName := 'my_text_field';
T.Name := SQLQuery1.Name + T.FieldName;
T.Index := SQLQuery1.FieldCount;
T.DataSet := SQLQuery1;
SQLQuery1.FieldDefs.UpDate
SQLQuery1.Open;
MS := TMemoryStream.Create;
try
T.SaveToStream(MS);
MS.Seek(0, 0);
SetLength(UTF8Str, MS.Size);
Move(MS.Memory^, UTF8Str[1], MS.Size);
UCS2Str := UTF8Decode(UTF8Str);
TextOutW(Form1.Canvas.Hand
finally
MS.Free;
end;
end;
I don't get the correct international characters, mainly lots of '?'.
Now, I don't understand... what do you mean by
>I can get the UTF-8 characters with:
>MyUTF8String := SQLDataSet.Fields[AField]. AsString;
So, if AsString already works... ?
>I can get the UTF-8 characters with:
>MyUTF8String := SQLDataSet.Fields[AField].
So, if AsString already works... ?
ASKER
AsString only works when I use this conversion in my SQL:
convert(varbinary(255), a_var_char_column)
returning a binary result. It doesn't work for a text field.
convert(varbinary(255), a_var_char_column)
returning a binary result. It doesn't work for a text field.
procedure TForm1.Button1Click(Sender : TObject);
var
MS: TStringStream;
UCS2Str: WideString;
begin
MS := TStringStream.Create;
try
TBlobField(SQLQuery1.Field ByName('my _text_fiel d')).SaveT oStream(MS );
UCS2Str := UTF8Decode(MS.DataString);
TextOutW(Form1.Canvas.Hand le, 10, 10, PWideChar(UCS2Str), Length(UCS2Str));
finally
MS.Free;
end;
end;
or even easier:
UCS2Str := UTF8Decode(SQLQuery1.Field ByName('my _text_fiel d').AsBlob );
TextOutW(Form1.Canvas.Hand le, 10, 10, PWideChar(UCS2Str), Length(UCS2Str));
I must ask .. do you have the necessary fonts to display those chars ? :)
var
MS: TStringStream;
UCS2Str: WideString;
begin
MS := TStringStream.Create;
try
TBlobField(SQLQuery1.Field
UCS2Str := UTF8Decode(MS.DataString);
TextOutW(Form1.Canvas.Hand
finally
MS.Free;
end;
end;
or even easier:
UCS2Str := UTF8Decode(SQLQuery1.Field
TextOutW(Form1.Canvas.Hand
I must ask .. do you have the necessary fonts to display those chars ? :)
ASKER
Your first code returned the same string of '?' as my code.
Your second code didn't compile because AsBlob is undeclared.
Yes, my computer can display the international characters. It works when I have a varchar field that I can convert to varbinary.
Your second code didn't compile because AsBlob is undeclared.
Yes, my computer can display the international characters. It works when I have a varchar field that I can convert to varbinary.
without calling UTF8Decode, can show us a sample of your Blob?
ASKER
I have no way to do that, but the blob consists of UTF-8 encoded unicode characters.
The problem seems to be that dbExpress doesn't provide a way to get the raw data for a text field.
The problem seems to be that dbExpress doesn't provide a way to get the raw data for a text field.
As in, can't you just do a Memo1.Text := MS.DataString; after loading your blob into your string stream? So that we can verify and know what is being retrieved?
ASKER
This is all that I get in the memo:
???????????????24????????? ?????????? ????????
The data has been improperly translated by the time it makes in to the stream.
???????????????24?????????
The data has been improperly translated by the time it makes in to the stream.
Another thought, perhaps the error occured while you were INSERTing the value in the database?
*Yawn*... 6.45am now, gotta go catch some sleep...
*Yawn*... 6.45am now, gotta go catch some sleep...
ASKER
I can get the data out just fine with ADO, but I want to use dbExpress.
Tomorrow is another day. Thanks.
Tomorrow is another day. Thanks.
ASKER
It turns out that I needed a newer version of the Sybase driver from VISOCO. The new version supports an option to not do any character conversions. Now I can get the UTF-8 characters from dbExpress.
ASKER
I answered my own question. I would like a refund.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Have a look at the TNT controls which support Unicode. http://tnt.ccci.org/delphi_unicode_controls/