Solved

byte[] number = new UTF8Encoding(true).GetBytes(i.ToString());  trying to get an explanation what this line of code do...

Posted on 2014-10-01
5
249 Views
Last Modified: 2014-10-02
trying to get an explanation what this line of code do...

byte[] number = new UTF8Encoding(true).GetBytes(i.ToString());
0
Comment
Question by:yguyon28
5 Comments
 
LVL 33

Assisted Solution

by:it_saige
it_saige earned 167 total points
ID: 40355736
You are taking an object [i] and calling the ToString method for this object.  Once the string representation of the object [i] is returned, you encode the string of characters into a sequence of bytes.

From the MSDN Article:
using System;
using System.Text;

public class SamplesEncoding  {

   public static void Main()  {

      // The characters to encode: 
      //    Latin Small Letter Z (U+007A) 
      //    Latin Small Letter A (U+0061) 
      //    Combining Breve (U+0306) 
      //    Latin Small Letter AE With Acute (U+01FD) 
      //    Greek Small Letter Beta (U+03B2) 
      //    a high-surrogate value (U+D8FF) 
      //    a low-surrogate value (U+DCFF)
      String myStr = "za\u0306\u01FD\u03B2\uD8FF\uDCFF";

      // Get different encodings.
      Encoding  u7    = Encoding.UTF7;
      Encoding  u8    = Encoding.UTF8;
      Encoding  u16LE = Encoding.Unicode;
      Encoding  u16BE = Encoding.BigEndianUnicode;
      Encoding  u32   = Encoding.UTF32;

      // Encode the entire string, and print out the counts and the resulting bytes.
      Console.WriteLine( "Encoding the entire string:" );
      PrintCountsAndBytes( myStr, u7 );
      PrintCountsAndBytes( myStr, u8 );
      PrintCountsAndBytes( myStr, u16LE );
      PrintCountsAndBytes( myStr, u16BE );
      PrintCountsAndBytes( myStr, u32 );

      Console.WriteLine();

      // Encode three characters starting at index 4, and print out the counts and the resulting bytes.
      Console.WriteLine( "Encoding the characters from index 4 through 6:" );
      PrintCountsAndBytes( myStr, 4, 3, u7 );
      PrintCountsAndBytes( myStr, 4, 3, u8 );
      PrintCountsAndBytes( myStr, 4, 3, u16LE );
      PrintCountsAndBytes( myStr, 4, 3, u16BE );
      PrintCountsAndBytes( myStr, 4, 3, u32 );

   }


   public static void PrintCountsAndBytes( String s, Encoding enc )  {

      // Display the name of the encoding used.
      Console.Write( "{0,-30} :", enc.ToString() );

      // Display the exact byte count. 
      int iBC  = enc.GetByteCount( s );
      Console.Write( " {0,-3}", iBC );

      // Display the maximum byte count. 
      int iMBC = enc.GetMaxByteCount( s.Length );
      Console.Write( " {0,-3} :", iMBC );

      // Encode the entire string. 
      byte[] bytes = enc.GetBytes( s );

      // Display all the encoded bytes.
      PrintHexBytes( bytes );

   }

   public static void PrintCountsAndBytes( String s, int index, int count, Encoding enc )  {

      // Display the name of the encoding used.
      Console.Write( "{0,-30} :", enc.ToString() );

      // Display the exact byte count. 
      int iBC  = enc.GetByteCount( s.ToCharArray(), index, count );
      Console.Write( " {0,-3}", iBC );

      // Display the maximum byte count. 
      int iMBC = enc.GetMaxByteCount( count );
      Console.Write( " {0,-3} :", iMBC );

      // Encode a range of characters in the string. 
      byte[] bytes = new byte[iBC];
      enc.GetBytes( s, index, count, bytes, bytes.GetLowerBound(0) );

      // Display all the encoded bytes.
      PrintHexBytes( bytes );

   }


   public static void PrintHexBytes( byte[] bytes )  {

      if (( bytes == null ) || ( bytes.Length == 0 ))
         Console.WriteLine( "<none>" );
      else  {
         for ( int i = 0; i < bytes.Length; i++ )
            Console.Write( "{0:X2} ", bytes[i] );
         Console.WriteLine();
      }

   }

}


/* 
This code produces the following output.

Encoding the entire string:
System.Text.UTF7Encoding       : 18  23  :7A 61 2B 41 77 59 42 2F 51 4F 79 32 50 2F 63 2F 77 2D
System.Text.UTF8Encoding       : 12  24  :7A 61 CC 86 C7 BD CE B2 F1 8F B3 BF
System.Text.UnicodeEncoding    : 14  16  :7A 00 61 00 06 03 FD 01 B2 03 FF D8 FF DC
System.Text.UnicodeEncoding    : 14  16  :00 7A 00 61 03 06 01 FD 03 B2 D8 FF DC FF
System.Text.UTF32Encoding      : 24  32  :7A 00 00 00 61 00 00 00 06 03 00 00 FD 01 00 00 B2 03 00 00 FF FC 04 00

Encoding the characters from index 4 through 6:
System.Text.UTF7Encoding       : 10  11  :2B 41 37 4C 59 2F 39 7A 2F 2D
System.Text.UTF8Encoding       : 6   12  :CE B2 F1 8F B3 BF
System.Text.UnicodeEncoding    : 6   8   :B2 03 FF D8 FF DC
System.Text.UnicodeEncoding    : 6   8   :03 B2 D8 FF DC FF
System.Text.UTF32Encoding      : 8   16  :B2 03 00 00 FF FC 04 00

*/

Open in new window


http://msdn.microsoft.com/en-us/library/ds4kkd55(v=vs.110).aspx

-saige-
0
 
LVL 62

Assisted Solution

by:Fernando Soto
Fernando Soto earned 166 total points
ID: 40356252
Hi yguyon28;

The explanation the the statement:

byte[] number = new UTF8Encoding(true).GetBytes(i.ToString());

The UTF8Encoding object creates an object that converts one type of encoding to another. The true parameter that is passed in tells the class to provide a BOM, Byte Oder Mark, at the beginning of the byte of characters. The BOM tells the object that receives it how the bytes are constructed whether they are in big-endian or little-endian order, there are other meaning if the BOM is found in the middle of the byte array, see the link for BOM for more info. The GetBytes method takes the string passed in as a parameter and returns an array of bytes.  This is done so that the system receiving it knows how to correctly interpret the byte array. Therefore the number variable is a byte array with a byte order mark at the beginning followed by the bytes.
0
 
LVL 33

Accepted Solution

by:
sarabande earned 167 total points
ID: 40356781
to add to above comments:

if i is an integer, say 12345, it would be turned to a byte array <BOM>+"12345", what is

EF BB BF 31 32 33 34 35

in hex digits. the 3-byte BOM says (among other things, see also comment of Fernando) that the following is UTF-8 encoded. windows text files which begin with a BOM would be correctly encoded (displayed) by most editors and text processing programs. if using a dump editor, the BOM and multi-byte UTF-8 characters would show non-printable or wrong characters. the BOM is only needed at the begin of a file, so I would assume that the statement was to create the start or header sequence of a text file.

note, integer digits are ascii codes where UTF-8 has same coding. that could be different if 'i' is not an integer but for example a currency object where the currency sign is a non-ascii (for example £ or €).

Sara
0
 
LVL 33

Expert Comment

by:sarabande
ID: 40356975
yguon28, before closing a question with a 'B' grade, you may consider to ask for anything which has been not answered or was unclear. I think it is fair to give us volunteers a chance to go for an 'A'. thanks.

Sara
0
 

Author Comment

by:yguyon28
ID: 40357063
Will do Sara
0

Featured Post

Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Many of us here at EE write code. Many of us write exceptional code; just as many of us write exception-prone code. As we all should know, exceptions are a mechanism for handling errors which are typically out of our control. From database errors, t…
Real-time is more about the business, not the technology. In day-to-day life, to make real-time decisions like buying or investing, business needs the latest information(e.g. Gold Rate/Stock Rate). Unlike traditional days, you need not wait for a fe…
This demo shows you how to set up the containerized NetScaler CPX with NetScaler Management and Analytics System in a non-routable Mesos/Marathon environment for use with Micro-Services applications.
This video demonstrates how to create an example email signature rule for a department in a company using CodeTwo Exchange Rules. The signature will be inserted beneath users' latest emails in conversations and will be displayed in users' Sent Items…

932 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

16 Experts available now in Live!

Get 1:1 Help Now