C# encoding confusion

kellyclu
ERR, foreign characters are not displaying properly. if you go to http://www.starr.net/is/type/kbh.html, those are the foreign chars i'm using to test.

hi all, i'm trying to convert text from one encoding to another so that the Length property of the string class returns the correct number of bytes for foreign characters entered in a text box.

here's an example:

aíóúüñ¡ºªaíóúüñ¡ºªaíóúüñ¡ºªaíóúüñ¡ºª

regular length is 36

our oracle database is using utf8, which translates that to a length of 68.

i found that the Convert method of ASCIIEncoding and UTF8Encoding does the job, except that the foreign characters are replaced with question marks. but Length does show 68.

how can i keep the foreign characters visible without losing the correct count?

textbox1 contents: aíóúüñ¡ºªaíóúüñ¡ºªaíóúüñ¡ºªaíóúüñ¡ºª

byte[] test = UTF8Encoding.Convert(
    Encoding.ASCII, Encoding.UTF8,
    ASCIIEncoding.UTF8.GetBytes(textBox1.Text.ToCharArray()));

textBox7.Text = ASCIIEncoding.UTF8.GetString(test);   // shows a????????????????a????????????????a????????????????a????????????????
textBox8.Text = textBox7.Text.Length.ToString();   // 68 correct
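For anyone reading along, here is a minimal console sketch of what seems to be happening in the snippet above. The text is first encoded to 68 UTF-8 bytes, then Convert decodes those bytes as ASCII, and every byte above 0x7F comes back as a question mark. That is why the result has 68 characters but none of the accents. The sketch also shows Encoding.UTF8.GetByteCount, which returns the byte count without changing the string. The class name is made up for illustration, and it assumes the characters are entered in precomposed form (one code point per accented letter).

```csharp
using System;
using System.Text;

class Utf8CountDemo // hypothetical name, for illustration only
{
    static void Main()
    {
        // Escapes for aíóúüñ¡ºª, so the encoding of the source file does not matter.
        string unit = "a\u00ED\u00F3\u00FA\u00FC\u00F1\u00A1\u00BA\u00AA";
        string text = unit + unit + unit + unit;

        Console.WriteLine(text.Length);                      // 36 (UTF-16 code units)
        Console.WriteLine(Encoding.UTF8.GetByteCount(text)); // 68 (bytes once encoded as UTF-8)

        // Roughly what the original snippet does: UTF-8 bytes decoded as ASCII.
        byte[] utf8 = Encoding.UTF8.GetBytes(text);
        string mangled = Encoding.ASCII.GetString(utf8);
        Console.WriteLine(mangled.Length);                   // 68
        Console.WriteLine(mangled.Substring(0, 17));         // a????????????????
    }
}
```

So the 68 in textBox8 is really the length of the mangled string, which happens to equal the UTF-8 byte count of the original text.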
Commented:
Not sure if I understand you correctly, but you might want to try adding the following globalization element to the <system.web> section of your web.config.

<globalization requestEncoding="utf-8" responseEncoding="utf-8" />

Good luck,
CJ.

Author

Commented:
no help, that's the default anyway.  all i'm trying to do is have the string Length property return the number of utf8 bytes for a textbox entry.  what i get by default is a count of one per character.

in utf8 some foreign chars take 2 bytes, some one.  I'm overriding the textbox Text property to encode the text so that the built-in Length will count the bytes properly.  I would adjust Length itself, but the problem is that string is sealed, and it owns the Length property.

so i figure ok, encode the text property so the length property returns the same count as oracle using utf 8.
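One possible alternative to re-encoding the Text property, sketched here under assumptions: leave the text untouched and ask for the UTF-8 byte count separately, for example through an extension method on string. The names Utf8Length, Utf8ByteCount and FitsInBytes are hypothetical, and the sketch assumes the database counts the same bytes as .NET's UTF-8 encoder, which is worth verifying for characters outside the Basic Multilingual Plane.

```csharp
using System;
using System.Text;

// Hypothetical helper; the names are illustrative and not from this thread.
static class Utf8Length
{
    // Number of bytes the text occupies once encoded as UTF-8.
    public static int Utf8ByteCount(this string text)
    {
        return Encoding.UTF8.GetByteCount(text ?? string.Empty);
    }

    // True when the text fits in a column limited to maxBytes bytes.
    public static bool FitsInBytes(this string text, int maxBytes)
    {
        return text.Utf8ByteCount() <= maxBytes;
    }
}

class Program
{
    static void Main()
    {
        string entry = "a\u00ED\u00F3"; // "aíó", standing in for textBox1.Text

        Console.WriteLine(entry.Length);          // 3
        Console.WriteLine(entry.Utf8ByteCount()); // 5
        Console.WriteLine(entry.FitsInBytes(4));  // False
    }
}
```

This keeps the foreign characters visible, because the string shown in the textbox is never converted. Only the count comes from the encoder.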

Commented:
You should have asked to have this question deleted. I have no other options left, nor the time to help you work out why this happens and how to solve your issue...

Just my 2 cents....
