
C# String With Cyrillic Digits To Byte Array

OK guys, I have a string that contains Cyrillic characters, digits, and Latin characters. I want to convert this string to a byte array, converting it to Unicode following this table.
http://www.ibm.com/developerworks/linux/library/l-u-cyr/table4.jpg
For example, if I have a Cyrillic "A", the byte value should be 0xC0.
Don't tell me to use System.Text.UTF8Encoding.UTF8.GetBytes(string str), as it returns ... stupid things :).

SkydiverFL

Won't ToCharArray() return the character array? If so, can you not just convert the individual characters to the equiv bytes?

IncognitoMan

ASKER

OK, I'll try converting it to a byte array and then converting it to byte equivalents. Then I'll tell you the result. :)

Ravi Vaddadi

Try this
UnicodeEncoding unicode = new UnicodeEncoding();
Byte[] encodedBytes = unicode.GetBytes(unicodeString);
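For reference, a minimal sketch of what this suggestion produces. UnicodeEncoding with its default constructor is UTF-16 little-endian, so a Cyrillic "A" (U+0410, decimal 1040) comes out as two bytes, 0x10 then 0x04, not the single 0xC0 the table asks for. The class name Utf16Sketch is made up for illustration.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the original thread.
public static class Utf16Sketch
{
    // Encodes the text as UTF-16 little-endian, the default for UnicodeEncoding.
    // GetBytes does not emit a byte order mark.
    public static byte[] Encode(string text)
    {
        return new UnicodeEncoding().GetBytes(text);
    }
}
```

Calling Utf16Sketch.Encode("\u0410") gives a two-byte array: the low byte of the code unit first, then the high byte.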

IncognitoMan

ASKER

Nope, the byte array returns stupid things. For example, it returns values like 1040 for "¿". Maybe it's UTF-16. But how do I make the string UTF-8?

IncognitoMan

ASKER

OK SriVaddadi, I'll try it.
By the way, the character in the post above was a Cyrillic "A".

IncognitoMan

ASKER

This returns two bytes for an "A": 0x16 and 0x04. Any other ideas? :)

Ravi Vaddadi

How about
ASCIIEncoding ascii = new ASCIIEncoding();
Byte[] encodedBytes = ascii.GetBytes(unicodeString);
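A short sketch of why an ASCII encoder is unlikely to help here. ASCII only covers U+0000 to U+007F, so with the default replacement fallback every Cyrillic letter becomes "?" (0x3F), while digits and Latin letters pass through unchanged. The class name AsciiSketch is made up for illustration.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the original thread.
public static class AsciiSketch
{
    // Encodes the text as 7-bit ASCII. Characters outside ASCII are
    // replaced with '?' (0x3F) by the encoder's default fallback.
    public static byte[] Encode(string text)
    {
        return Encoding.ASCII.GetBytes(text);
    }
}
```

For the input "\u0410" + "1z" this yields three bytes: 0x3F for the Cyrillic letter, then 0x31 and 0x7A.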

Ravi Vaddadi

This should work

int pageCode = 1251;
Encoding encoding = Encoding.GetEncoding(pageCode);
Byte[] encodedBytes = encoding.GetBytes(unicodeString);
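A self-contained sketch of this approach, under two assumptions. First, that the table in the question is Windows-1251, which is consistent with 0xC0 for a Cyrillic "A". Second, that the code runs on .NET Core 3.0 or later (including .NET 5+), where code page encodings have to be registered before GetEncoding(1251) will find them; on .NET Framework the registration line should not be needed and can be dropped. The class name Cp1251Sketch is made up for illustration, and the test string uses a \u escape so the result does not depend on how the source file itself is saved.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the original thread.
public static class Cp1251Sketch
{
    static Cp1251Sketch()
    {
        // Assumption: .NET Core 3.0+ / .NET 5+. Makes legacy code pages
        // such as 1251 available to Encoding.GetEncoding.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Encodes the text with code page 1251: one byte per character.
    public static byte[] Encode(string text)
    {
        Encoding cp1251 = Encoding.GetEncoding(1251);
        return cp1251.GetBytes(text);
    }
}
```

With the input "\u0410" + "1z" the result should be 0xC0, 0x31, 0x7A: one byte for the Cyrillic letter, and the digit and Latin letter unchanged.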

IncognitoMan

ASKER

It again returns two bytes, 208 and 144. Maybe I'll try a switch-case statement :), but that's not a solution.
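The values 208 and 144 are 0xD0 and 0x90, which is exactly the UTF-8 encoding of U+0410. One possible explanation, offered only as a guess, is that the bytes being inspected still came from a UTF-8 encoder rather than from the 1251 encoding. A small sketch to compare against (the class name Utf8Sketch is made up):

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the original thread.
public static class Utf8Sketch
{
    // Encodes the text as UTF-8. Cyrillic letters in the basic
    // U+0400..U+04FF block take two bytes each.
    public static byte[] Encode(string text)
    {
        return Encoding.UTF8.GetBytes(text);
    }
}
```

If the output for "\u0410" matches 208 and 144 here, it is worth checking which Encoding instance the failing code actually calls GetBytes on.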

Ravi Vaddadi

int pageCode = 1251;
Encoding encoding = Encoding.GetEncoding(pageCode);
Byte[] encodedBytes = encoding.GetBytes(unicodeString);

This should work. If it does not, then the code page mentioned at the URL you posted is incorrect.

Ravi Vaddadi

Did you try it?

Ravi Vaddadi

Encoding en = Encoding.GetEncoding(1251);
MessageBox.Show(en.EncodingName);

This correctly gives the encoding name as Cyrillic. If this is not working for you, then the issue might be something else.

IncognitoMan

ASKER

I've searched all over the net and tried all of the above, as it appeared on other sites. Nothing worked for me. I guess I'll be writing a switch-case statement with over a hundred cases. :)

ASKER CERTIFIED SOLUTION

Ravi Vaddadi

This solution is only available to members.

IncognitoMan

ASKER

It works like a charm with a switch-case statement, but it's 350 lines.
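If a hand-written mapping is preferred over the built-in encoding, it does not need hundreds of cases. This is a sketch, not the asker's code, and it assumes the target table is Windows-1251. In that code page the Russian letters from U+0410 to U+044F sit in one contiguous run from 0xC0 to 0xFF, so a single subtraction covers them; the letters at U+0401 and U+0451 map to 0xA8 and 0xB8. Other Cyrillic letters in the table (Ukrainian, Serbian and so on) are not handled here and fall back to "?". The class name ManualCp1251Sketch is made up.

```csharp
using System;

// Hypothetical helper, not part of the original thread.
public static class ManualCp1251Sketch
{
    // Maps each UTF-16 code unit to one Windows-1251 byte using range
    // arithmetic instead of a per-character switch.
    public static byte[] Encode(string text)
    {
        byte[] result = new byte[text.Length];
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (c < 0x80)
                result[i] = (byte)c;                     // ASCII: digits, Latin letters
            else if (c >= '\u0410' && c <= '\u044F')
                result[i] = (byte)(c - 0x0410 + 0xC0);   // contiguous block 0xC0..0xFF
            else if (c == '\u0401')
                result[i] = 0xA8;                        // capital IO
            else if (c == '\u0451')
                result[i] = 0xB8;                        // small io
            else
                result[i] = (byte)'?';                   // not covered by this sketch
        }
        return result;
    }
}
```

The range check replaces 64 individual cases, which is where most of a long switch statement would otherwise go.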

IncognitoMan

ASKER

Just because you searched all over the net for me, I will give you the points ;).