• Status: Solved

C# String With Cyrillic Digits To Byte Array

OK guys, I have a string that contains Cyrillic characters, digits, and Latin characters. I want to convert this string to a byte array, converting it to Unicode following this table.
http://www.ibm.com/developerworks/linux/library/l-u-cyr/table4.jpg
For example, if I have a Cyrillic "A", the byte value should be 0xC0.
Don't tell me to use System.Text.UTF8Encoding.UTF8.GetBytes(string str), as it returns ... stupid things :).
Asked:
IncognitoMan
1 Solution
 
SkydiverFL Commented:
Won't ToCharArray() return the character array? If so, can you not just convert the individual characters to the equivalent bytes?
 
IncognitoMan (Author) Commented:
OK, I'll try converting it to a char array and then converting the characters to their byte equivalents. Then I'll tell you the result. :)
 
SriVaddadi Commented:
Try this
UnicodeEncoding unicode = new UnicodeEncoding();
Byte[] encodedBytes = unicode.GetBytes(unicodeString);
 
IncognitoMan (Author) Commented:
Nope, the byte array returns stupid things. For example, it returns values like 1040 for "¿". Maybe it's UTF-16. But how do I make the string UTF-8?
 
IncognitoMan (Author) Commented:
OK SriVaddadi, I'll try it.
By the way, the character in the post above was a Cyrillic "A".
 
IncognitoMan (Author) Commented:
This returns two bytes for an "A": 0x16 and 0x04. Any other ideas? :)
 
SriVaddadi Commented:
How about
ASCIIEncoding ascii = new ASCIIEncoding();
Byte[] encodedBytes = ascii.GetBytes(unicodeString);
 
SriVaddadi Commented:
This should work

int pageCode = 1251;
Encoding encoding = Encoding.GetEncoding(pageCode);
Byte[] encodedBytes = encoding.GetBytes(unicodeString);
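As a minimal sketch of the code page 1251 approach, the following complete program encodes a short mixed string. It assumes .NET Core 3.0 or later, where code page encodings have to be registered through CodePagesEncodingProvider before GetEncoding(1251) will succeed; on the classic .NET Framework the registration line is normally unnecessary and may not compile without an extra package. The class name Cp1251Demo and the method ToCp1251 are made up for illustration.

```csharp
using System;
using System.Text;

public static class Cp1251Demo
{
    // Hypothetical helper: encodes text as Windows-1251, one byte per character.
    public static byte[] ToCp1251(string text)
    {
        // Assumption: running on .NET Core 3.0+ / .NET 5+, where code page 1251
        // is only available after the code pages provider is registered.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(1251).GetBytes(text);
    }

    public static void Main()
    {
        // "\u0410" is Cyrillic capital A, "A" is Latin capital A, "1" is a digit.
        byte[] bytes = ToCp1251("\u0410A1");
        Console.WriteLine(BitConverter.ToString(bytes)); // C0-41-31
    }
}
```

Writing the Cyrillic letter as the escape "\u0410" keeps the example independent of the source file's own encoding. Note that the Latin "A" and the Cyrillic "А" look identical on screen but encode to 0x41 and 0xC0 respectively.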
 
IncognitoMan (Author) Commented:
It again returns two bytes, 208 and 144. Maybe I'll try a switch-case statement :), but that's not a solution.
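The numbers reported so far each correspond to a different encoding of the same character: 1040 is the UTF-16 code unit of Cyrillic "А" (U+0410), and 208 and 144 are the two bytes UTF-8 produces for it. The sketch below prints the character under each encoding side by side. It assumes .NET Core 3.0 or later for the CodePagesEncodingProvider registration; EncodingComparison and Hex are illustrative names only.

```csharp
using System;
using System.Text;

public static class EncodingComparison
{
    // Hypothetical helper: returns the encoded bytes as a hex string like "D0-90".
    public static string Hex(Encoding encoding, string text)
    {
        return BitConverter.ToString(encoding.GetBytes(text));
    }

    public static void Main()
    {
        // Assumption: .NET Core 3.0+ / .NET 5+, where code page 1251 must be registered.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        string a = "\u0410"; // Cyrillic capital A

        Console.WriteLine((int)a[0]);                          // 1040 (UTF-16 code unit)
        Console.WriteLine(Hex(Encoding.Unicode, a));           // 10-04 (UTF-16 little-endian)
        Console.WriteLine(Hex(Encoding.UTF8, a));              // D0-90 (208 and 144 in decimal)
        Console.WriteLine(Hex(Encoding.GetEncoding(1251), a)); // C0
    }
}
```

If a call still yields 208 and 144, it is worth checking that the bytes really come from the encoding object returned by GetEncoding(1251) and not from a UTF-8 encoder elsewhere in the code.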
 
SriVaddadi Commented:
int pageCode = 1251;
Encoding encoding = Encoding.GetEncoding(pageCode);
Byte[] encodedBytes = encoding.GetBytes(unicodeString);

This should work. If it is not working, then the code page mentioned at the URL you posted is incorrect.
 
SriVaddadi Commented:
Did you try it?
 
SriVaddadi Commented:
Encoding en = Encoding.GetEncoding(1251);
MessageBox.Show(en.EncodingName);

This gives the encoding name as Cyrillic correctly. If this is not working for you, then the issue might be something else.
 
IncognitoMan (Author) Commented:
I've searched all over the net and tried all of the above, as suggested on other sites. Nothing worked for me. I guess I'll be writing a switch-case statement with over a hundred cases. :)
 
SriVaddadi Commented:
You do not need a switch statement. You need the correct Windows code page. 1251 is the Windows code page for Cyrillic. If that is not what you are looking for, then the language might be a different one with a different code page. Once you have the code page, the code snippet in my last post should work.
 
IncognitoMan (Author) Commented:
It works like a charm with a switch-case statement, but it's 350 lines.
 
IncognitoMan (Author) Commented:
Just because you searched all over the net for me, I will give you the points ;).
 
SriVaddadi Commented:
So you could resolve the issue with a switch statement?
 
IncognitoMan (Author) Commented:
Yes, but it's a lot of coding for this little problem, and it sounds stupid. It's like going from Russia to Germany and taking the shortcut through the USA :D.
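If a hand-written mapping is preferred over an Encoding object, most of a long switch can be replaced by arithmetic, because the basic Russian letters А..я (U+0410..U+044F) sit in the same order in Windows-1251 at 0xC0..0xFF, a constant offset of 0x350. The following is only a sketch under that assumption: ManualCp1251, ToByte, and ToBytes are hypothetical names, and characters outside ASCII, А..я, Ё, and ё are replaced with '?' rather than mapped.

```csharp
using System;

public static class ManualCp1251
{
    // Hypothetical helper: maps one char to its Windows-1251 byte.
    public static byte ToByte(char c)
    {
        if (c < 0x80) return (byte)c;               // ASCII: digits, Latin letters, punctuation
        if (c >= '\u0410' && c <= '\u044F')
            return (byte)(c - 0x350);               // Cyrillic А..я -> 0xC0..0xFF
        if (c == '\u0401') return 0xA8;             // Ё
        if (c == '\u0451') return 0xB8;             // ё
        return (byte)'?';                           // not handled in this sketch
    }

    // Converts a whole string, one byte per UTF-16 code unit.
    public static byte[] ToBytes(string text)
    {
        byte[] result = new byte[text.Length];
        for (int i = 0; i < text.Length; i++)
        {
            result[i] = ToByte(text[i]);
        }
        return result;
    }

    public static void Main()
    {
        // Cyrillic А, Cyrillic я, digit 7.
        Console.WriteLine(BitConverter.ToString(ToBytes("\u0410\u044F7"))); // C0-FF-37
    }
}
```

This avoids any dependency on installed code pages, at the cost of covering only the characters listed above.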