  • Status: Solved

How do I convert an ANSI number to a Unicode number in C#, for output in XSL:FO?

This is my first question on the site.

An application that I am writing reads an RTF document that I supply, but I have no control over the RTF's content. The RTF data contains special characters such as the Euro symbol (€, ANSI 128).

My application parses the RTF and outputs it as XSL:FO. A later step renders the XSL:FO into a PDF document. From what I've found, the special characters need to be converted to their Unicode equivalents to appear correctly in the PDF. Based on the table at http://www.alanwood.net/demos/ansi.html, the Unicode number for the Euro (ANSI 128) is 8364.

In C#, how do I convert an ANSI number to a Unicode number? I've tried the attached code, but it only changes the encoding of the numbers and does not actually convert them.

Failing that, is there another way to output characters in FO so that I can use their ANSI number instead of their Unicode number? Here is how I output a ® in FO (its ANSI and Unicode numbers are both 174):

<fo:inline font-size="0.50em" baseline-shift="super" font-family="Times New Roman" color="#000000">&#174;</fo:inline>

Attached is a .fo file containing several special characters to show what I am trying to do, saved as a .txt file because of this site's upload restrictions. I am using RenderX's XEP rendering engine to render the XSL:FO to PDF, which is where I can tell the characters aren't translating correctly.

Thanks!
System.Text.Encoding ansi = System.Text.Encoding.Default;
System.Text.Encoding unicode = System.Text.Encoding.UTF8;

byte[] ansibytes = ansi.GetBytes(MyAnsiCharacterCode);

byte[] unicodebytes = System.Text.Encoding.Convert(ansi, unicode, ansibytes);
char[] unicodechars = new char[unicode.GetCharCount(unicodebytes, 0, unicodebytes.Length)];
unicode.GetChars(unicodebytes, 0, unicodebytes.Length, unicodechars, 0);

string unicodestring = new string(unicodechars);


character-example.txt
Asked by: ntasker02
 
dkloeck Commented:
 
ntasker02 (Author) Commented:
The example code I posted is a slight variation of the code found on that page.

The full link is http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpref/html/frlrfsystemtextencodingclasstopic.asp

Unfortunately, that won't work for what I am trying to do. I need either a way to get the Unicode number equivalent of an ANSI code, or a way for XSL:FO to accept ANSI codes as special characters to output.
 
ntasker02 (Author) Commented:
As a follow-up to my comment above: is there a way to get the Unicode number for an ANSI number without a manually created character map table or class? With a map I can easily say "if I see ANSI 128, output Unicode character 8364", but I'd like an automated lookup that doesn't require hardcoding characters.
 
ntasker02 (Author) Commented:
I have solved the problem myself. The code is attached for anyone who might come across this same problem.

I build the string "s" from my input ANSI number, such as 128 for the Euro symbol (using code page 1252). I convert that number to hex, so '128' becomes '80', prefix it with "0x", and convert the result to a byte. I then pass that byte to cp1252enc, a System.Text.Encoding object I made that represents code page 1252, to get the string equivalent of the byte. My string now contains the € (Euro) symbol.

From there I use a different encoding object, in this case utf8enc, to convert my string to a byte array in UTF-8 encoding and back to characters. Casting the first character to an int gives the Unicode number (8364 for the Euro).
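// Encoding objects for code page 1252 and for UTF-8.
System.Text.Encoding cp1252enc = System.Text.Encoding.GetEncoding(1252);
System.Text.Encoding utf8enc = System.Text.Encoding.UTF8;

// Format the ANSI number as hex with a "0x" prefix, e.g. 128 becomes "0x80".
string ParameterAsHex = "0x" + tok.Parameter.ToString("X");

// Parse the hex string as one byte and decode it with code page 1252.
string s = cp1252enc.GetString(new byte[] { Convert.ToByte(ParameterAsHex, 16) });

// Round-trip the string through UTF-8.
byte[] utf8bytes = utf8enc.GetBytes(s);
char[] utf8chars = utf8enc.GetChars(utf8bytes);

// The first character's numeric value is the Unicode number (8364 for the Euro).
int utf8int = (int)utf8chars[0];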

 
ntasker02 (Author) Commented:
There was a silly bit of code in there. Here is the updated code; the solution stands.
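// Encoding objects for code page 1252 and for UTF-8.
System.Text.Encoding cp1252enc = System.Text.Encoding.GetEncoding(1252);
System.Text.Encoding utf8enc = System.Text.Encoding.UTF8;

// Parse the ANSI number (e.g. 128) directly as a byte and decode it with code page 1252.
string s = cp1252enc.GetString(new byte[] { Convert.ToByte(tok.Parameter.ToString(), 10) });

// Round-trip the string through UTF-8.
byte[] utf8bytes = utf8enc.GetBytes(s);
char[] utf8chars = utf8enc.GetChars(utf8bytes);

// The first character's numeric value is the Unicode number (8364 for the Euro).
int utf8int = (int)utf8chars[0];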
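
The lookup above can be wrapped in a small helper that also builds the numeric character reference used in the FO output (such as &#174; for ®). This is only a minimal sketch, not code from the application: the class name Cp1252Lookup and both method names are hypothetical, and it assumes the input is always a single code page 1252 byte value (0 to 255). It decodes straight to a character and skips the UTF-8 round trip, since the character's numeric value is already the Unicode number. It also assumes Encoding.GetEncoding(1252) is available, as it is on the .NET Framework; on .NET Core and later, code page encodings have to be registered first, as noted in the comment.

```csharp
using System;
using System.Text;

// Hypothetical helper, named here for illustration only.
public static class Cp1252Lookup
{
    // Returns the Unicode number for a single code page 1252 byte value.
    public static int ToCodePoint(int ansiCode)
    {
        if (ansiCode < 0 || ansiCode > 255)
            throw new ArgumentOutOfRangeException(nameof(ansiCode));

        // On .NET Core and later, first reference the System.Text.Encoding.CodePages
        // package and call:
        //   Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding cp1252 = Encoding.GetEncoding(1252);

        // Decode the single byte; the resulting char's value is the Unicode number.
        string s = cp1252.GetString(new byte[] { (byte)ansiCode });
        return s[0];
    }

    // Builds an XML numeric character reference, e.g. "&#8364;" for ANSI 128.
    public static string ToCharacterReference(int ansiCode)
    {
        return "&#" + ToCodePoint(ansiCode) + ";";
    }
}
```

Under those assumptions, Cp1252Lookup.ToCharacterReference(128) gives &#8364; for the Euro, and Cp1252Lookup.ToCharacterReference(174) gives &#174;, matching the ® example in the question.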

 
Computer101 Commented:
Closed, 200 points refunded.
Computer101
EE Admin
