Solved

Converting UTF-8 to ISO-8859-1

Posted on 2006-05-30
Last Modified: 2008-01-09
Hi,
  I need a good (ideally the best) approach to convert a string from UTF-8 to ISO-8859-1, and from ISO-8859-1 to UTF-8. I am reading the UTF-8 string from XML.
karan.
Question by:Manish
22 Comments
 
LVL 86

Expert Comment

by:CEHJ
ID: 16788160
You don't need to convert String - only files. Do you mean a file?
0
 
LVL 92

Expert Comment

by:objects
ID: 16788170
open the file using utf8 and save it using ISO88591, which part exactly are you having problems with?
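For the file case, a minimal sketch of "open as UTF-8, save as ISO-8859-1" might look like this. It assumes Java 7 or later for `java.nio.file.Files`, which is newer than this thread; the class name `Transcode` and the method name `utf8ToLatin1` are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not from the thread.
class Transcode {
    // Decodes the input file as UTF-8 and re-encodes its text as ISO-8859-1.
    static void utf8ToLatin1(Path in, Path out) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.ISO_8859_1)) {
            char[] buffer = new char[8192];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: Transcode <utf8-input> <latin1-output>");
            return;
        }
        utf8ToLatin1(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

As far as I know, the writer returned by `Files.newBufferedWriter` raises an `IOException` when the text contains a character the target charset cannot represent, so input with curly quotes should fail loudly rather than be silently altered.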
0
 
LVL 11

Author Comment

by:Manish
ID: 16788248
I am reading an XML file that contains a UTF-8 string, and I want to store it as ISO. Let's say
‘Citizens for NYC’ Board
This is the string in the XML.
I want to store it in a database whose charset is US7ASCII.
I can't change the database charset.
0

 
LVL 86

Expert Comment

by:CEHJ
ID: 16788262
As I mentioned, you don't need to convert a String. Strings are already portable between encodings.
0
 
LVL 92

Expert Comment

by:objects
ID: 16788267
try specifying the charset in the jdbc connection string
0
 
LVL 11

Author Comment

by:Manish
ID: 16788286
If I dont change ,
   in jsp it look like ‘Citizens for NYC’

How do I specify the charset in the JDBC connection string?
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 16788304
>>If I dont change ,
   in jsp it look like ‘Citizens for NYC’

Those left and right quotes are not supported in ISO8859-1
0
 
LVL 92

Expert Comment

by:objects
ID: 16788312
> If I dont change ,
>   in jsp it look like ‘Citizens for NYC’

sounds like you'd be better off handling it when you *read* the data from the database
0
 
LVL 11

Author Comment

by:Manish
ID: 16788331
I am using the following String call to read it and convert it to UTF-8:
new String(string.getBytes(ENCODING_ISO_8859_1), ENCODING_UTF8)
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 16788340
As i just mentioned, shifting encodings won't help - those quotes are not supported in ISO8859-1. You need to do

s = s.replaceAll("[\u2018\u2019]", "'");

See

http://www.cl.cam.ac.uk/~mgk25/ucs/quotes.html
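To illustrate why the replacement matters, here is a small sketch comparing what happens to the sample string with and without it. The class and method names are hypothetical; `String.getBytes(Charset)` is the standard call, and for ISO-8859-1 it substitutes `?` for characters the charset cannot represent.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not from the thread.
class QuoteFix {
    // The replacement suggested above: curly single quotes become a plain apostrophe.
    static String straightenQuotes(String s) {
        return s.replaceAll("[\u2018\u2019]", "'");
    }

    // Encodes to ISO-8859-1 and decodes again, showing what survives the trip.
    static String roundTripLatin1(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String raw = "\u2018Citizens for NYC\u2019 Board";
        System.out.println(roundTripLatin1(raw));                   // ?Citizens for NYC? Board
        System.out.println(roundTripLatin1(straightenQuotes(raw))); // 'Citizens for NYC' Board
    }
}
```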
0
 
LVL 11

Author Comment

by:Manish
ID: 16788349
new String(string.getBytes("ISO-8859-1"), "UTF-8")
0
 
LVL 11

Author Comment

by:Manish
ID: 16788362
>>s = s.replaceAll("[\u2018\u2019]", "'");

when should I use this method , while inserting or reading?
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 16788369
>>when should I use this method , while inserting or reading?

Before inserting - it will get rid of the unsupportable characters
0
 
LVL 11

Author Comment

by:Manish
ID: 16788390
Then how do I find the characters that are not supported, and what should the equivalent of each one be when I store it in the db?
  and is there any difference in output?
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 16788402
>>Then how do I find the characters that are not supported, and what should the equivalent of each one be when I store it in the db?

That's quite subjective - it would be down to you to find the ones that aren't supported and choose another you like better

>>and is there any difference in output?

Yes - they're quite different characters
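One possible way to find the unsupported characters is to ask a `CharsetEncoder` for the target charset. The sketch below assumes that Java's `US-ASCII` charset is an acceptable stand-in for the database's US7ASCII; the class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not from the thread.
class UnsupportedChars {
    // Returns the distinct characters of s that the target charset cannot encode,
    // in order of first appearance.
    static Set<Character> find(String s, Charset target) {
        CharsetEncoder encoder = target.newEncoder();
        Set<Character> unsupported = new LinkedHashSet<>();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (!encoder.canEncode(c)) {
                unsupported.add(c);
            }
        }
        return unsupported;
    }

    public static void main(String[] args) {
        String raw = "\u2018Citizens for NYC\u2019 Board";
        // Print each unsupported character as a Unicode code point.
        for (char c : find(raw, StandardCharsets.US_ASCII)) {
            System.out.printf("U+%04X%n", (int) c);
        }
    }
}
```

Running something like this over the real data would give the list of characters that need a replacement; which replacement to choose remains a judgement call.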
0
 
LVL 92

Expert Comment

by:objects
ID: 16788515
0
 
LVL 11

Author Comment

by:Manish
ID: 16789238
So my steps should be:
read the XML,
  read it char by char, replace unsupported characters with supported ones, and store it in the db.
  While reading , do I need to convert it in UTF to show on JSP?
0
 
LVL 86

Accepted Solution

by:
CEHJ earned 150 total points
ID: 16789244
Exactly

>>While reading , do I need to convert it in UTF to show on JSP?

No. It will already be converted by Java
0
 
LVL 11

Author Comment

by:Manish
ID: 16791519
So can you give one example, so that I can do the same for all the other characters?
If possible, also a UTF-8 character list and an ISO character list.
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 16791560
Well i gave you an example at http:Q_21867758.html#16788340

As i mentioned, the replacements are a matter of judgement. If you read that link i posted, for instance, you'll see that in the Unix world, people have the habit of using ` for a left quote - quite different to the replacement i suggested
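Since the replacements are a matter of judgement, one way to keep them in a single place is a lookup table. The sketch below is only an illustration: every mapping in it is an arbitrary choice, the `?` fallback is an assumption, and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not from the thread.
class AsciiFallback {
    // Illustrative substitutions only; pick the ones that suit your data.
    private static final Map<Character, String> REPLACEMENTS = new HashMap<>();
    static {
        REPLACEMENTS.put('\u2018', "'");   // left single quotation mark
        REPLACEMENTS.put('\u2019', "'");   // right single quotation mark
        REPLACEMENTS.put('\u201C', "\"");  // left double quotation mark
        REPLACEMENTS.put('\u201D', "\"");  // right double quotation mark
        REPLACEMENTS.put('\u2013', "-");   // en dash
        REPLACEMENTS.put('\u2026', "..."); // horizontal ellipsis
    }

    // Keeps 7-bit ASCII as-is, applies the table, and falls back to "?" otherwise.
    static String toAscii(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                out.append(c);
            } else {
                out.append(REPLACEMENTS.getOrDefault(c, "?"));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toAscii("\u2018Citizens for NYC\u2019 Board")); // 'Citizens for NYC' Board
    }
}
```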
0
 
LVL 92

Assisted Solution

by:objects
objects earned 150 total points
ID: 16795072
i posted the iso table earlier, here are details of utf8

http://www.tony-franks.co.uk/UTF-8.htm
http://www1.tip.nl/~t876506/utf8tbl.html
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 16888003
:-)
0
