Solved

Java to convert bytes into String

Posted on 2011-02-22
Last Modified: 2012-05-11
Hi,

My Java application has read a Chinese web page into a byte[] and has detected the encoding of the web page.

byte[] input = readfileintobyte(file);
String enc = detectEncoding(input);

How can I convert the input into a "utf-8" encoded string?

I have tried the following, but it doesn't work:

String newinput = new String(input, "utf-8");

It seems that the enc variable should be used to convert to a utf-8 string. But how?
Question by:wsyy
12 Comments
 
LVL 92

Expert Comment

by:objects
ID: 34958606
String newinput = new String(input, enc);
 
LVL 92

Expert Comment

by:objects
ID: 34958617
> How can i convert the input into a "utf-8" encoded string?

There's no such thing in Java. A String has no encoding of its own; only a byte array can be utf8 encoded.

String newinput = new String(input, enc);
byte[] utf8 = newinput.getBytes("UTF8");
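
A minimal, self-contained sketch of those two lines, for readers who want to run them. The class name `TranscodeSketch`, the helper `toUtf8`, and the use of UTF-16BE as a stand-in for the detected encoding are assumptions made for illustration; in the thread, `enc` would be whatever `detectEncoding` returned. It uses `Charset` objects rather than charset-name strings, which avoids the checked `UnsupportedEncodingException`.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class TranscodeSketch {
    // Decode the raw bytes with the detected charset, then re-encode the text as UTF-8.
    static byte[] toUtf8(byte[] input, String enc) {
        String text = new String(input, Charset.forName(enc));
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Stand-in for the page bytes: two Chinese characters encoded as UTF-16BE.
        byte[] input = "\u4e2d\u6587".getBytes(StandardCharsets.UTF_16BE);
        byte[] utf8 = toUtf8(input, "UTF-16BE");
        // Same text, different byte representation.
        System.out.println(input.length + " bytes in, " + utf8.length + " bytes out");
        // prints "4 bytes in, 6 bytes out"
    }
}
```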
 

Author Comment

by:wsyy
ID: 34958642
objects: the first solution works.

i just wonder why "utf-8" is not used in the solution?

in addition, what if i want to convert newinput into the original encoding enc?
 
LVL 92

Expert Comment

by:objects
ID: 34958672
I explained that in the 2nd comment.
Java strings do not really have an encoding; it's the byte array that has a specific encoding,
i.e. the byte array contains the string encoded with a specific charset.
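
One way to see this point is to encode the same String with two charsets and compare the results. The sketch below is illustrative only: the class name `StringHasNoEncoding` and the helper `encodedLength` are made up for this example, and UTF-16BE is just a convenient second charset that every Java runtime supports.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class StringHasNoEncoding {
    // Number of bytes the given text occupies once encoded with the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587"; // two Chinese characters

        // One String, two different byte arrays depending on the charset chosen.
        System.out.println(encodedLength(text, StandardCharsets.UTF_8));    // 6
        System.out.println(encodedLength(text, StandardCharsets.UTF_16BE)); // 4

        // Decoding each byte array with its own charset yields equal Strings again.
        String fromUtf8 = new String(text.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
        String fromUtf16 = new String(text.getBytes(StandardCharsets.UTF_16BE), StandardCharsets.UTF_16BE);
        System.out.println(fromUtf8.equals(fromUtf16)); // true
    }
}
```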
 
LVL 92

Expert Comment

by:objects
ID: 34958679
> in addition, what if i want to convert newinput into the original encoding enc?


byte[] original = newinput.getBytes(enc);
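
A hedged note on that round trip: the re-encoded bytes match the original input only if the input was valid in `enc` to begin with, because `new String(...)` silently replaces malformed sequences with U+FFFD. The sketch below checks this; the class name `RoundTripSketch` and the helper `roundTrips` are hypothetical, and UTF-8 is used as the example charset only because it is always available.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class RoundTripSketch {
    // True if decoding with enc and encoding again with enc reproduces the same bytes.
    static boolean roundTrips(byte[] input, Charset enc) {
        String text = new String(input, enc);
        byte[] back = text.getBytes(enc);
        return Arrays.equals(input, back);
    }

    public static void main(String[] args) {
        byte[] valid = "\u4e2d\u6587".getBytes(StandardCharsets.UTF_8);
        byte[] malformed = { (byte) 0xFF }; // 0xFF never appears in valid UTF-8

        System.out.println(roundTrips(valid, StandardCharsets.UTF_8));     // true
        System.out.println(roundTrips(malformed, StandardCharsets.UTF_8)); // false
    }
}
```

A `false` result is a useful hint that the detected encoding was wrong or that the input was damaged.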
 
LVL 86

Expert Comment

by:CEHJ
ID: 34960418
>>
objects: the first solution works.

i just wonder why "utf-8" is not used in the solution?
>>

Because utf-8 is not being used as the encoding. The encoding being used is the original encoding.
It's not clear what your goal is, but if it's to get a byte array with utf-8 encoding, then you need to transcode, i.e. decode the bytes with the original encoding and re-encode the result as utf-8.
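
Transcoding can also be done on streams instead of whole byte arrays, which may suit large pages better. This is an alternative technique, not something proposed in the thread: a sketch using the standard `InputStreamReader` and `OutputStreamWriter`, where the class name `StreamTranscodeSketch` and the method `transcode` are hypothetical and UTF-16BE again stands in for the detected encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class StreamTranscodeSketch {
    // Read characters using the source charset and write them back out as UTF-8.
    static void transcode(InputStream in, Charset from, OutputStream out) throws IOException {
        Reader reader = new InputStreamReader(in, from);
        Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
        char[] buffer = new char[4096];
        int n;
        while ((n = reader.read(buffer)) != -1) {
            writer.write(buffer, 0, n);
        }
        writer.flush(); // flush only; closing the streams is left to the caller
    }

    public static void main(String[] args) throws IOException {
        byte[] source = "\u4e2d\u6587".getBytes(StandardCharsets.UTF_16BE);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        transcode(new ByteArrayInputStream(source), StandardCharsets.UTF_16BE, out);
        System.out.println(out.size() + " UTF-8 bytes written"); // 6 UTF-8 bytes written
    }
}
```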
 

Author Comment

by:wsyy
ID: 34961997
CEHJ, i do want to save the contents in utf-8. will the following code be ok?

String newinput = new String(input, enc);
byte[] utf8 = newinput.getBytes("UTF8");
String utf8_newinput = new String(utf8, "UTF8");


If the code is good, can i make it simpler?

 

Author Comment

by:wsyy
ID: 34962129
I just checked, and my code above doesn't work.

CEHJ, how can I do the transcode?
 
LVL 92

Expert Comment

by:objects
ID: 34964497
> i do want to save the contents in utf-8. will the following code be ok?

No, as I already explained above.
If you want to save the contents in utf8 then you need to save the byte array, *not* a string.

String newinput = new String(input, enc);
byte[] utf8 = newinput.getBytes("UTF8");

// save utf8 encoded byte array to a file
 

Author Comment

by:wsyy
ID: 34965697
objects, when i save the utf8 encoded byte array to a file, i don't need to specify any encoding, right?

do you have a quick example about saving utf8 byte array to a file?

In addition, if I want to work with the string that is based on the original byte array (encoded in enc), what do I need to do so that the Chinese characters inside it are handled properly?

thanks
 
LVL 92

Accepted Solution

by:objects (earned 500 total points)
ID: 34965758
> objects, when i save the utf8 encoded byte array to a file, i don't need to specify any encoding, right?

Right, no encoding is needed when writing: it's just a byte array (the contents of the byte array are already encoded).

> do you have a quick example about saving utf8 byte array to a file?

FileOutputStream out = new FileOutputStream("page-utf8.txt"); // hypothetical output file name
out.write(utf8);
out.close();
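
A slightly more defensive variant, offered as a sketch rather than as the answer's exact code: try-with-resources closes the stream even if the write fails. The class name `SaveUtf8Sketch`, the helper `save`, and the file name `page-utf8.txt` are hypothetical.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class SaveUtf8Sketch {
    // Write the already-encoded bytes as-is; no charset is involved at this point.
    static void save(byte[] utf8, String path) throws IOException {
        try (OutputStream out = new FileOutputStream(path)) {
            out.write(utf8);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the transcoded page content.
        byte[] utf8 = "\u4e2d\u6587".getBytes(StandardCharsets.UTF_8);
        save(utf8, "page-utf8.txt"); // hypothetical output file name
    }
}
```

On Java 7 and later, `java.nio.file.Files.write(path, utf8)` does the same job in one call.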

>  if i want to do something on the string which is based on the original byte array (encoded in enc), how can I do so that the chinese characters inside the byte array can be properly handled.

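There's nothing special you need to do. Once the bytes have been decoded with the correct charset (new String(input, enc)), the String holds the Chinese characters correctly.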
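
To illustrate, here is a small sketch showing ordinary String operations on decoded Chinese text. The class name `DecodedTextSketch` and the helper `decode` are invented for this example, and UTF-16BE stands in for whatever encoding was detected.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DecodedTextSketch {
    // Decode once with the detected charset; after that it is ordinary String work.
    static String decode(byte[] input, Charset enc) {
        return new String(input, enc);
    }

    public static void main(String[] args) {
        // Stand-in for the page bytes: two Chinese characters followed by " page".
        byte[] input = "\u4e2d\u6587 page".getBytes(StandardCharsets.UTF_16BE);
        String text = decode(input, StandardCharsets.UTF_16BE);

        System.out.println(text.length());           // 7
        System.out.println(text.contains("\u6587")); // true
        System.out.println(text.indexOf(' '));       // 2
    }
}
```

One caveat: characters outside the Basic Multilingual Plane occupy two `char` values each, so `codePointCount` is the safer choice when counting characters in text that may contain them.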
 

Author Closing Comment

by:wsyy
ID: 34965861
Excellent!