Solved

Java to convert bytes into String

Posted on 2011-02-22
Last Modified: 2012-05-11
Hi,

My Java application has read a Chinese web page into a byte[] and has detected the page's encoding:

byte[] input = readfileintobyte(file);
String enc = detectEncoding(input);

How can I convert the input into a "utf-8" encoded string?

I have tried the following, but it doesn't work:

String newinput = new String(input, "utf-8");

It seems that the enc variable should be used in the conversion to a UTF-8 string. But how?
Question by:wsyy
12 Comments
 
LVL 92

Expert Comment

by:objects
ID: 34958606
String newinput = new String(input, enc);
 
LVL 92

Expert Comment

by:objects
ID: 34958617
> How can i convert the input into a "utf-8" encoded string?

There is no such thing in Java; there is only a UTF-8 encoded byte array:

String newinput = new String(input, enc);
byte[] utf8 = newinput.getBytes("UTF8");
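
The two steps above (decode with the detected encoding, then encode as UTF-8) can be sketched as a small self-contained program. This is only a sketch under some assumptions: the names TranscodeSketch and toUtf8 are made up for illustration, and UTF-16BE stands in for the detected encoding because it is one of the few charsets every JVM is required to support (a real Chinese page is more likely GBK or Big5, which most but not all Java runtimes ship). It also assumes Java 7 or later for StandardCharsets.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class TranscodeSketch {
    // Hypothetical helper: decode with the detected charset, then re-encode as UTF-8.
    static byte[] toUtf8(byte[] input, String enc) {
        // Charset.forName throws an unchecked exception if this JVM does not know enc.
        String text = new String(input, Charset.forName(enc));
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Stand-in for the page bytes: U+4E2D U+6587 encoded as UTF-16BE (2 bytes each).
        byte[] input = {0x4E, 0x2D, 0x65, (byte) 0x87};
        byte[] utf8 = toUtf8(input, "UTF-16BE");
        System.out.println(utf8.length); // 6: each of these characters takes 3 bytes in UTF-8
    }
}
```

Passing a Charset object instead of a charset name also avoids the checked UnsupportedEncodingException that new String(byte[], String) and getBytes(String) declare.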
 

Author Comment

by:wsyy
ID: 34958642
objects: the first solution works.

I just wonder why "utf-8" is not used in the solution?

In addition, what if I want to convert newinput back into the original encoding enc?
 
LVL 92

Expert Comment

by:objects
ID: 34958672
I explained that in my second comment.
Java strings do not really have an encoding; it's the byte array that has a specific encoding,
i.e. the byte array contains the string encoded with a specific charset.
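
One way to see this for yourself, as a rough sketch: encode the same text with two charsets, confirm the byte arrays differ, then decode each with its own charset and confirm the strings are equal. The class and method names below are hypothetical, and UTF-8 and UTF-16BE are used only because every JVM supports them.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class NoEncodingSketch {
    // Hypothetical helper: true if both byte arrays decode to the same text.
    static boolean sameText(byte[] a, Charset charsetA, byte[] b, Charset charsetB) {
        return new String(a, charsetA).equals(new String(b, charsetB));
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587"; // two Chinese characters, written as escapes

        byte[] asUtf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] asUtf16 = text.getBytes(StandardCharsets.UTF_16BE);

        System.out.println(Arrays.equals(asUtf8, asUtf16)); // false: the bytes differ
        System.out.println(sameText(asUtf8, StandardCharsets.UTF_8,
                                    asUtf16, StandardCharsets.UTF_16BE)); // true: same characters
    }
}
```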
 
LVL 92

Expert Comment

by:objects
ID: 34958679
> in addition, what if i want to convert newinput into the original encoding enc?


byte[] original = newinput.getBytes(enc);
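
As a hedged illustration of that round trip: if the input really is valid in enc, decoding and re-encoding with the same charset gives back the original bytes. If the wrong charset is used, malformed sequences are replaced during decoding and the original bytes are lost. The names below are hypothetical, and UTF-16BE again stands in for the detected encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class RoundTripSketch {
    // Hypothetical helper: decode with enc, encode with enc again, compare with the input.
    static boolean survivesRoundTrip(byte[] input, Charset enc) {
        String text = new String(input, enc);
        byte[] back = text.getBytes(enc);
        return Arrays.equals(input, back);
    }

    public static void main(String[] args) {
        // U+4E2D U+6587 encoded as UTF-16BE.
        byte[] input = {0x4E, 0x2D, 0x65, (byte) 0x87};

        // Correct charset: the bytes come back unchanged.
        System.out.println(survivesRoundTrip(input, StandardCharsets.UTF_16BE)); // true

        // Wrong charset: 0x87 is not valid UTF-8 here, so it is replaced while decoding.
        System.out.println(survivesRoundTrip(input, StandardCharsets.UTF_8)); // false
    }
}
```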
 
LVL 86

Expert Comment

by:CEHJ
ID: 34960418
>>
objects: the first solution works.

i just wonder why "utf-8" is not used in the solution?
>>

Because UTF-8 is not the encoding of your input. The encoding to use when decoding is the original encoding.
It's not clear what your goal is, but if it's to get a byte array with UTF-8 encoding, then you need to transcode: decode the bytes with the original encoding, then encode the resulting string as UTF-8.
 

Author Comment

by:wsyy
ID: 34961997
CEHJ, I do want to save the contents in UTF-8. Will the following code be OK?

String newinput = new String(input, enc);
byte[] utf8 = newinput.getBytes("UTF8");
String utf8_newinput = new String(utf8, "UTF8");

If the code is good, can I make it simpler?

 

Author Comment

by:wsyy
ID: 34962129
I just checked, and my code above doesn't work.

CEHJ, how can I do the transcoding?
 
LVL 92

Expert Comment

by:objects
ID: 34964497
> i do want to save the contents in utf-8. will the following code be ok?

No, as I already explained above.
If you want to save the contents in UTF-8, then you need to save the byte array, *not* a string:

String newinput = new String(input, enc);
byte[] utf8 = newinput.getBytes("UTF8");

// save utf8 encoded byte array to a file
 

Author Comment

by:wsyy
ID: 34965697
objects, when I save the UTF-8 encoded byte array to a file, I don't need to specify any encoding, right?

Do you have a quick example of saving a UTF-8 byte array to a file?

In addition, if I want to work with the string that is based on the original byte array (encoded in enc), what do I need to do so that the Chinese characters inside the byte array are handled properly?

Thanks
 
LVL 92

Accepted Solution

by:
objects earned 125 total points
ID: 34965758
> objects, when i save the utf8 encoded byte array to a file, i don't need to specify any encoding, right?

No, you don't. It's just a byte array; its contents are already encoded.

> do you have a quick example about saving utf8 byte array to a file?

FileOutputStream out = new FileOutputStream(outputFile); // outputFile: the File (or path) to write to
out.write(utf8);
out.close();
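
A slightly fuller version of that idea, as a sketch rather than the exact code from this thread: the class name, the save method and the temp-file location are all assumptions for illustration, and try-with-resources assumes Java 7 or later.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class SaveUtf8Sketch {
    // Writes the bytes exactly as given; no charset is involved at this point.
    static void save(byte[] utf8, Path target) throws IOException {
        // try-with-resources closes the stream even if write() throws.
        try (FileOutputStream out = new FileOutputStream(target.toFile())) {
            out.write(utf8);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] utf8 = "\u4e2d\u6587".getBytes(StandardCharsets.UTF_8);
        Path target = Files.createTempFile("page", ".txt"); // hypothetical output location
        save(utf8, target);
        System.out.println(Files.size(target)); // 6: the file holds the same six bytes
    }
}
```

Anything that later reads the file as text has to be told it is UTF-8, because the file itself carries no record of its encoding.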

>  if i want to do something on the string which is based on the original byte array (encoded in enc), how can I do so that the chinese characters inside the byte array can be properly handled.

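There's nothing you need to do. Once the bytes have been decoded with the correct charset, the String holds the Chinese characters as ordinary characters.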
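
For instance, here is a minimal sketch of ordinary String operations on decoded Chinese text. The class name and the count helper are hypothetical, and the sample text uses Unicode escapes so the example does not depend on the source file's own encoding.

```java
import java.nio.charset.StandardCharsets;

class ChineseTextSketch {
    // Hypothetical helper: counts how often one character occurs in decoded text.
    static int count(String text, char target) {
        int n = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == target) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        // Four Chinese characters as UTF-8 bytes (3 bytes each, 12 in total).
        byte[] input = "\u4e2d\u6587\u7f51\u9875".getBytes(StandardCharsets.UTF_8);
        String text = new String(input, StandardCharsets.UTF_8);

        System.out.println(text.length());          // 4: characters, not bytes
        System.out.println(text.indexOf('\u7f51')); // 2
        System.out.println(count(text, '\u4e2d'));  // 1
    }
}
```

One caveat: rare characters outside the Basic Multilingual Plane occupy two char values each, so length() and charAt() count those as two; codePointCount() handles them if that matters for your pages.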
 

Author Closing Comment

by:wsyy
ID: 34965861
Excellent!