Solved

byte[] number = new UTF8Encoding(true).GetBytes(i.ToString());  Looking for an explanation of what this line of code does

Posted on 2014-10-01
Last Modified: 2014-10-02
I am trying to get an explanation of what this line of code does:

byte[] number = new UTF8Encoding(true).GetBytes(i.ToString());
Question by:yguyon28

Assisted Solution

by:it_saige
it_saige earned 167 total points
ID: 40355736
You are taking an object [i] and calling its ToString method.  Once the string representation of [i] is returned, GetBytes encodes that string of characters into a sequence of UTF-8 bytes.

From the MSDN Article:
using System;
using System.Text;

public class SamplesEncoding  {

   public static void Main()  {

      // The characters to encode: 
      //    Latin Small Letter Z (U+007A) 
      //    Latin Small Letter A (U+0061) 
      //    Combining Breve (U+0306) 
      //    Latin Small Letter AE With Acute (U+01FD) 
      //    Greek Small Letter Beta (U+03B2) 
      //    a high-surrogate value (U+D8FF) 
      //    a low-surrogate value (U+DCFF)
      String myStr = "za\u0306\u01FD\u03B2\uD8FF\uDCFF";

      // Get different encodings.
      Encoding  u7    = Encoding.UTF7;
      Encoding  u8    = Encoding.UTF8;
      Encoding  u16LE = Encoding.Unicode;
      Encoding  u16BE = Encoding.BigEndianUnicode;
      Encoding  u32   = Encoding.UTF32;

      // Encode the entire string, and print out the counts and the resulting bytes.
      Console.WriteLine( "Encoding the entire string:" );
      PrintCountsAndBytes( myStr, u7 );
      PrintCountsAndBytes( myStr, u8 );
      PrintCountsAndBytes( myStr, u16LE );
      PrintCountsAndBytes( myStr, u16BE );
      PrintCountsAndBytes( myStr, u32 );

      Console.WriteLine();

      // Encode three characters starting at index 4, and print out the counts and the resulting bytes.
      Console.WriteLine( "Encoding the characters from index 4 through 6:" );
      PrintCountsAndBytes( myStr, 4, 3, u7 );
      PrintCountsAndBytes( myStr, 4, 3, u8 );
      PrintCountsAndBytes( myStr, 4, 3, u16LE );
      PrintCountsAndBytes( myStr, 4, 3, u16BE );
      PrintCountsAndBytes( myStr, 4, 3, u32 );

   }


   public static void PrintCountsAndBytes( String s, Encoding enc )  {

      // Display the name of the encoding used.
      Console.Write( "{0,-30} :", enc.ToString() );

      // Display the exact byte count. 
      int iBC  = enc.GetByteCount( s );
      Console.Write( " {0,-3}", iBC );

      // Display the maximum byte count. 
      int iMBC = enc.GetMaxByteCount( s.Length );
      Console.Write( " {0,-3} :", iMBC );

      // Encode the entire string. 
      byte[] bytes = enc.GetBytes( s );

      // Display all the encoded bytes.
      PrintHexBytes( bytes );

   }

   public static void PrintCountsAndBytes( String s, int index, int count, Encoding enc )  {

      // Display the name of the encoding used.
      Console.Write( "{0,-30} :", enc.ToString() );

      // Display the exact byte count. 
      int iBC  = enc.GetByteCount( s.ToCharArray(), index, count );
      Console.Write( " {0,-3}", iBC );

      // Display the maximum byte count. 
      int iMBC = enc.GetMaxByteCount( count );
      Console.Write( " {0,-3} :", iMBC );

      // Encode a range of characters in the string. 
      byte[] bytes = new byte[iBC];
      enc.GetBytes( s, index, count, bytes, bytes.GetLowerBound(0) );

      // Display all the encoded bytes.
      PrintHexBytes( bytes );

   }


   public static void PrintHexBytes( byte[] bytes )  {

      if (( bytes == null ) || ( bytes.Length == 0 ))
         Console.WriteLine( "<none>" );
      else  {
         for ( int i = 0; i < bytes.Length; i++ )
            Console.Write( "{0:X2} ", bytes[i] );
         Console.WriteLine();
      }

   }

}


/* 
This code produces the following output.

Encoding the entire string:
System.Text.UTF7Encoding       : 18  23  :7A 61 2B 41 77 59 42 2F 51 4F 79 32 50 2F 63 2F 77 2D
System.Text.UTF8Encoding       : 12  24  :7A 61 CC 86 C7 BD CE B2 F1 8F B3 BF
System.Text.UnicodeEncoding    : 14  16  :7A 00 61 00 06 03 FD 01 B2 03 FF D8 FF DC
System.Text.UnicodeEncoding    : 14  16  :00 7A 00 61 03 06 01 FD 03 B2 D8 FF DC FF
System.Text.UTF32Encoding      : 24  32  :7A 00 00 00 61 00 00 00 06 03 00 00 FD 01 00 00 B2 03 00 00 FF FC 04 00

Encoding the characters from index 4 through 6:
System.Text.UTF7Encoding       : 10  11  :2B 41 37 4C 59 2F 39 7A 2F 2D
System.Text.UTF8Encoding       : 6   12  :CE B2 F1 8F B3 BF
System.Text.UnicodeEncoding    : 6   8   :B2 03 FF D8 FF DC
System.Text.UnicodeEncoding    : 6   8   :03 B2 D8 FF DC FF
System.Text.UTF32Encoding      : 8   16  :B2 03 00 00 FF FC 04 00

*/


http://msdn.microsoft.com/en-us/library/ds4kkd55(v=vs.110).aspx

-saige-

Assisted Solution

by:Fernando Soto
Fernando Soto earned 166 total points
ID: 40356252
Hi yguyon28;

Here is an explanation of the statement:

byte[] number = new UTF8Encoding(true).GetBytes(i.ToString());

The expression new UTF8Encoding(true) creates an encoding object that converts between .NET strings and UTF-8 byte sequences. The true argument tells the object to provide a BOM, Byte Order Mark, through its GetPreamble method; a writer such as StreamWriter puts that preamble at the beginning of a new file. For UTF-8 the BOM is the three bytes EF BB BF. In UTF-16 and UTF-32 the BOM tells the receiver whether the bytes are in big-endian or little-endian order; UTF-8 has only one byte order, so there the BOM just acts as a signature marking the data as UTF-8. The BOM has other meanings if it is found in the middle of the byte array, see the link for BOM for more info. The GetBytes method takes the string passed in as a parameter and returns an array of bytes. GetBytes itself never prepends the BOM, so the number variable is a byte array holding only the UTF-8 bytes of the string returned by i.ToString(). If the receiving system needs a BOM to interpret the bytes correctly, the caller has to write the result of GetPreamble first.
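A quick way to check what the statement actually produces is to print the bytes. The following is a minimal sketch, assuming i is the int 12345; the class name BomCheck and the helper ToHex are made up for illustration and are not part of the original code.

```csharp
using System;
using System.Text;

public static class BomCheck
{
    // Hypothetical helper: formats a byte array as space-separated hex.
    public static string ToHex(byte[] bytes)
    {
        return BitConverter.ToString(bytes).Replace("-", " ");
    }

    public static void Main()
    {
        int i = 12345; // assumed value, for illustration only
        UTF8Encoding enc = new UTF8Encoding(true);

        // The statement from the question.
        byte[] number = enc.GetBytes(i.ToString());

        // The BOM is exposed separately, through GetPreamble.
        byte[] preamble = enc.GetPreamble();

        Console.WriteLine("GetBytes:    " + ToHex(number));   // 31 32 33 34 35
        Console.WriteLine("GetPreamble: " + ToHex(preamble)); // EF BB BF
    }
}
```

If the bytes are later written to a file with a raw FileStream, the preamble would have to be written explicitly before number for the file to start with a BOM.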

Accepted Solution

by:sarabande
sarabande earned 167 total points
ID: 40356781
To add to the above comments:

If i is an integer, say 12345, GetBytes turns its string form "12345" into the byte array

31 32 33 34 35

in hex digits. GetBytes does not put the BOM in front, even when the encoding was created with true. The 3-byte BOM (EF BB BF) comes only from GetPreamble, and it says (among other things, see also the comment of Fernando) that the data following it is UTF-8 encoded. Windows text files which begin with a BOM are decoded (displayed) correctly by most editors and text-processing programs. In a dump (hex) editor, the BOM and multi-byte UTF-8 characters show up as non-printable or wrong characters. The BOM is only needed at the beginning of a file, so the true argument only matters if the same encoding object is also used to write the start or header sequence of a text file.

Note that integer digits are ASCII characters, for which UTF-8 uses the same single-byte codes. That could be different if 'i' is not an integer but, for example, a currency object whose string form contains a non-ASCII currency sign (for example £ or €).
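The difference between ASCII digits and non-ASCII currency signs can be seen by encoding each one on its own. This is a small sketch under the assumption that the strings below stand in for whatever i.ToString() returns; the class name Utf8Lengths and the helper Utf8Hex are hypothetical names used only here.

```csharp
using System;
using System.Text;

public static class Utf8Lengths
{
    // Hypothetical helper: UTF-8 bytes of a string as space-separated hex.
    public static string Utf8Hex(string s)
    {
        byte[] bytes = new UTF8Encoding(true).GetBytes(s);
        return BitConverter.ToString(bytes).Replace("-", " ");
    }

    public static void Main()
    {
        // ASCII digits: one byte each, same values as in ASCII.
        Console.WriteLine(Utf8Hex("12345"));  // 31 32 33 34 35

        // Pound sign (U+00A3): two bytes in UTF-8.
        Console.WriteLine(Utf8Hex("\u00A3")); // C2 A3

        // Euro sign (U+20AC): three bytes in UTF-8.
        Console.WriteLine(Utf8Hex("\u20AC")); // E2 82 AC
    }
}
```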

Sara

Expert Comment

by:sarabande
ID: 40356975
yguyon28, before closing a question with a 'B' grade, please consider asking about anything that has not been answered or was unclear. I think it is fair to give us volunteers a chance to go for an 'A'. Thanks.

Sara

Author Comment

by:yguyon28
ID: 40357063
Will do Sara