IncognitoMan


C# String With Cyrillic Digits To Byte Array

OK guys, I have a string that contains Cyrillic characters, digits, and Latin characters. I want to convert this string to a byte array, mapping each character to a single byte according to this table:
http://www.ibm.com/developerworks/linux/library/l-u-cyr/table4.jpg
For example, if I have a Cyrillic "A", the byte value should be 0xC0.
Don't tell me to use System.Text.UTF8Encoding.UTF8.GetBytes(string str), as it returns ... stupid things :).
SkydiverFL

Won't ToCharArray() return the character array? If so, can you not just convert the individual characters to the equivalent bytes?
IncognitoMan

ASKER

OK, I'll try converting it to a char array and then converting the characters to their byte equivalents. Then I'll tell you the result. :)
Ravi Vaddadi
Try this
UnicodeEncoding unicode = new UnicodeEncoding();
Byte[] encodedBytes = unicode.GetBytes(unicodeString);
Nope, the array returns stupid things. For example, it returns values like 1040 for "¿". Maybe it's UTF-16. But how do I make the string UTF-8?
OK SriVaddadi, I'll try it.
By the way, the character in the post above was a Cyrillic "A".
This returns two bytes for an "A": 0x10 and 0x04. Any other ideas? :)
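To make the numbers in the posts above easier to follow, here is a minimal sketch that prints the bytes the two Unicode encodings produce for a Cyrillic "А" (U+0410, decimal 1040). The class and method names (EncodingComparison, ToHex) are made up for this illustration and are not from the thread. Neither encoding can give the single byte 0xC0 the asker wants, because both are Unicode transformation formats rather than a single-byte code page.

```csharp
using System;
using System.Text;

public static class EncodingComparison
{
    // Hypothetical helper: encodes text and returns the bytes as hex, e.g. "10-04".
    public static string ToHex(Encoding encoding, string text)
    {
        return BitConverter.ToString(encoding.GetBytes(text));
    }

    public static void Main()
    {
        string cyrillicA = "\u0410"; // Cyrillic capital A, code point 1040

        // UTF-16 little-endian (what UnicodeEncoding uses by default): 10-04
        Console.WriteLine(ToHex(Encoding.Unicode, cyrillicA));

        // UTF-8: D0-90, i.e. decimal 208 and 144
        Console.WriteLine(ToHex(Encoding.UTF8, cyrillicA));
    }
}
```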
How about
ASCIIEncoding ascii = new ASCIIEncoding();
Byte[] encodedBytes = ascii.GetBytes(unicodeString);
This should work

int pageCode = 1251;
Encoding encoding = Encoding.GetEncoding(pageCode);
Byte[] encodedBytes = encoding.GetBytes(unicodeString);
It again returns two bytes, 208 and 144. Maybe I'll try a switch-case statement :), but that's not a solution.
int pageCode = 1251;
Encoding encoding = Encoding.GetEncoding(pageCode);
Byte[] encodedBytes = encoding.GetBytes(unicodeString);

This should work. If it is not working, then the code page mentioned at the URL you posted is incorrect.
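A self-contained sketch of the code page 1251 suggestion follows. The class and method names (Cp1251Demo, ToWindows1251) are invented for this example. One assumption to flag: on .NET Core and .NET 5 or later, code page 1251 is not available until the code pages provider is registered, so the sketch registers it first. On the classic .NET Framework the encoding is built in, and that line can simply be removed.

```csharp
using System;
using System.Text;

public static class Cp1251Demo
{
    // Hypothetical helper: encodes text as Windows-1251, one byte per character.
    public static byte[] ToWindows1251(string text)
    {
        // Assumption: running on .NET Core / .NET 5+, where legacy code pages
        // must be registered before use. Not needed on .NET Framework.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        Encoding cp1251 = Encoding.GetEncoding(1251);
        return cp1251.GetBytes(text);
    }

    public static void Main()
    {
        // Cyrillic "А" followed by a digit and a Latin letter.
        byte[] bytes = ToWindows1251("\u04101z");
        Console.WriteLine(BitConverter.ToString(bytes));
    }
}
```

If this still yields two bytes per Cyrillic letter, it may be worth checking that GetBytes is really being called on the 1251 encoding object and not on a UTF-8 one, since 208 and 144 are the UTF-8 bytes for "А".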
Did you try it?
Encoding en = Encoding.GetEncoding(1251);
MessageBox.Show(en.EncodingName);

This gives the encoding name as Cyrillic, correctly. If this is not working for you, then the issue might be something else.
I've searched all over the net and tried all of the above, as it was in other sites. Nothing worked for me. I guess I'll be writing a switch case statement with over a hundred cases. :)
ASKER CERTIFIED SOLUTION
Ravi Vaddadi
(The accepted solution is visible only to Experts Exchange members and is not reproduced here.)
It works like a charm with the switch-case statement, but it's 350 lines.
Just because you searched all over the net for me, I will give you the points ;).
So you could resolve the issue with a switch statement?
Yes, but it's a lot of coding for such a little problem, and it sounds stupid. It's like going from Russia to Germany and taking the shortcut through the USA :D.
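For anyone who does end up mapping characters by hand, the switch statement can probably be shortened a great deal. In Windows-1251 the letters А through я (U+0410 to U+044F) occupy the contiguous byte range 0xC0 to 0xFF, so simple arithmetic covers them. The sketch below is not the asker's code, and the name ManualCp1251 is made up. It handles only ASCII, the basic Russian alphabet, and Ё/ё, and replaces everything else with "?", which is an assumption about what is acceptable.

```csharp
using System;

public static class ManualCp1251
{
    // Hypothetical helper: maps a subset of Unicode to Windows-1251 by arithmetic.
    public static byte[] Encode(string text)
    {
        var result = new byte[text.Length];
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (c < 0x80)
                result[i] = (byte)c;                    // ASCII: digits, Latin letters
            else if (c >= '\u0410' && c <= '\u044F')
                result[i] = (byte)(c - 0x0410 + 0xC0);  // А..я -> 0xC0..0xFF
            else if (c == '\u0401')
                result[i] = 0xA8;                       // Ё
            else if (c == '\u0451')
                result[i] = 0xB8;                       // ё
            else
                result[i] = (byte)'?';                  // assumption: unmapped -> '?'
        }
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(BitConverter.ToString(Encode("\u0410\u044F1z")));
    }
}
```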