Solved

UTF8 to EBCDIC conversion help - Cp420

Posted on 2006-05-21
Last Modified: 2012-06-21
I am trying to read an Arabic string from a UTF-8 file and convert it to the EBCDIC Cp420 charset. I've been struggling with this, and any input on the best way to do it, with references or links to sample source code, would really help.

I've tried using a BufferedReader and then encoding the values, as well as reading a UTF-8 string and calling getBytes("Cp420") on it, but to no avail. I believe I am missing something but can't put my finger on what exactly it is.

Incidentally, when I call the Charset.availableCharsets() method, I do not see the Cp420 charset in the output. I'd appreciate any assistance with this.

Cheers,
Sandil
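
Editor's note: a minimal sketch of how one might check whether the running JVM exposes a given charset through java.nio. Charset.availableCharsets() lists canonical names, so a charset can be present under a different canonical name (on JDKs that ship it, Cp420 may appear as IBM420) while still answering to the alias. The class name CharsetCheck and the helper isAvailable are illustrative only. On older JDKs such as 1.4.2, java.io classes may also accept encoding names that java.nio does not know, so a negative result here does not necessarily mean OutputStreamWriter will reject the name.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

public class CharsetCheck {

    // Returns true if this JVM can supply a java.nio Charset for the given name or alias.
    static boolean isAvailable(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            // The name contains characters that are not legal in a charset name.
            return false;
        }
    }

    public static void main(String[] args) {
        // "Cp420" is the name from the question; pass another name as the first argument to test it.
        String name = args.length > 0 ? args[0] : "Cp420";
        boolean available = isAvailable(name);
        System.out.println(name + " supported: " + available);
        if (available) {
            Charset cs = Charset.forName(name);
            System.out.println("canonical name: " + cs.name() + ", aliases: " + cs.aliases());
        }
    }
}
```

Looking a charset up by alias with Charset.isSupported is usually more reliable than scanning the printed map for one particular spelling.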
Question by:itsandil
LVL 86

Expert Comment

by:CEHJ
ID: 16727976
BufferedReader in = new BufferedReader(new InputStreamReader(new FileInputStream("arabic.txt"), "UTF8"));
Writer out = new OutputStreamWriter(new FileOutputStream("arabic-ebcdic.txt"), "Cp420");
int buf = -1;
while((buf = in.read()) > -1) {
      out.write(buf);
}
// Close all (closing the writer also flushes any buffered bytes)
in.close();
out.close();
 
LVL 3

Author Comment

by:itsandil
ID: 16728121
The data in the output file seems to be garbled.

I tried replacing the OutputStreamWriter with a ByteArrayOutputStream and then writing the contents of the resulting byte array to console/file/dataqueue on iSeries but they resulted in a blank string.

I suspect the Cp420 charset might not be included as part of my JDK installation - is there any way to confirm this? I do not find it in rt.jar either.

Cheers,
Sandil
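
Editor's note: an EBCDIC file opened in a PC text editor will look garbled even when the conversion is correct, because the editor interprets the bytes with an ASCII-based encoding. One way to judge the output is to look at the raw byte values instead. The sketch below is only an illustration: the class name HexDump and its helpers are hypothetical, and the default file name is the one used in this thread.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class HexDump {

    // Formats bytes as space-separated, two-digit uppercase hex values.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", bytes[i] & 0xFF));
        }
        return sb.toString();
    }

    // Reads a whole file into memory; adequate for small test files.
    static byte[] readAll(String path) throws IOException {
        try (InputStream in = new FileInputStream(path)) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return buffer.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        String path = args.length > 0 ? args[0] : "arabic-ebcdic.txt";
        if (!new File(path).isFile()) {
            System.out.println("No such file: " + path);
            return;
        }
        System.out.println(toHex(readAll(path)));
    }
}
```

The byte values can then be compared against a published Cp420 code page table.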
 
LVL 86

Expert Comment

by:CEHJ
ID: 16728129
>>my JDK installation

... which is ..?
 
LVL 86

Expert Comment

by:CEHJ
ID: 16728138
sun/io/ByteToCharCp420.class IN C:\j2sdk1.4.2_09\.\jre\lib\charsets.jar
sun/io/CharToByteCp420.class IN C:\j2sdk1.4.2_09\.\jre\lib\charsets.jar


sun/io/ByteToCharDBCS_EBCDIC.class IN C:\j2sdk1.4.2_09\.\jre\lib\charsets.jar
sun/io/CharToByteDBCS_EBCDIC.class IN C:\j2sdk1.4.2_09\.\jre\lib\charsets.jar

Please post your current code
 
LVL 3

Author Comment

by:itsandil
ID: 16728228
JDK installation is 1.4.2. I can see CharToByteCp420.class in charsets.jar, but when I try to create a Charset for the Cp420 encoding I get an unsupported charset exception, so I've reworked the code to use only the BufferedReader and OutputStreamWriter objects.

My current code is as follows:

import java.io.*;
import java.nio.*;
import java.nio.charset.Charset;
import com.ibm.as400.access.*;

public class ArabicStringToEBCDIC
{

        public static void main (String args[])
        {
                try
                {
                        // System.out.println(Charset.availableCharsets());
                        AS400 as400System = null;
                        as400System = new AS400();
                        as400System.setSystemName("172.16.5.11");
                        as400System.setUserId("XXX");
                        as400System.setPassword("XXX");
                        as400System.connectService(AS400.DATAQUEUE);
                        DataQueue dq = new DataQueue(as400System , "/XXX.LIB/XXX.LIB/XXX.DTAQ");
                        dq.clear();

                        BufferedReader in = new BufferedReader(new InputStreamReader(new FileInputStream("arabic.txt"), "UTF8"));
                        ByteArrayOutputStream baos = new ByteArrayOutputStream();
                        Writer out = new OutputStreamWriter(baos , "Cp420");
                        int buf = -1;
                        while((buf = in.read()) > -1)
                        {
                                out.write(buf);
                        }

                        System.out.println(baos.toByteArray()); // throws a value to console, [B@1f14ceb
                        System.out.println(baos.size()); // returns 0
                        dq.write(baos.toByteArray()); // throws an exception as the byte array size is zero, surprisingly

                }
                catch(Exception e)
                {
                        e.printStackTrace();
                }

        }

}
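
Editor's note: one likely contributor to the zero-length result in the code above is that the OutputStreamWriter is never flushed or closed before baos.size() and baos.toByteArray() are called. OutputStreamWriter accumulates encoded bytes in an internal buffer, so a short input may not reach the ByteArrayOutputStream until flush() or close() is called. Also, System.out.println on a byte[] prints the array's type and hash code (such as [B@1f14ceb), not its contents. The sketch below is a hedged illustration, not a confirmed fix for this thread: the class name EncodeToBytes and the method encode are hypothetical, and "UTF-8" stands in for "Cp420" so that it runs on any JVM.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.StringReader;
import java.io.Writer;

public class EncodeToBytes {

    // Copies every character from the reader through an encoding writer into memory.
    static byte[] encode(Reader in, String charsetName) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        Writer out = new OutputStreamWriter(baos, charsetName);
        int c;
        while ((c = in.read()) > -1) {
            out.write(c);
        }
        // Without this call the encoded bytes may still sit in the writer's buffer.
        out.flush();
        return baos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // "UTF-8" is an assumption made for portability; substitute "Cp420" where it is supported.
        byte[] bytes = encode(new StringReader("abc"), "UTF-8");
        System.out.println(bytes.length + " bytes");
    }
}
```

The returned array could then be passed to dq.write(...) in place of the unflushed buffer contents.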


NOTE: I also tried your method using the file output stream as follows:

Writer out = new OutputStreamWriter(new FileOutputStream("arabic-ebcdic.txt"), "Cp420");
...
... // do the conversion
...
out.close();

and then reading the resulting file and writing it to the dataQ which again resulted in a zero length string being registered in the queue.

I don't see any issues in the queue as I am able to do a pure EBCDIC to EBCDIC transmission. If only I can find out why Cp420 is not part of my available charsets and why I'm not able to create a Charset object for that encoding.
 
LVL 86

Expert Comment

by:CEHJ
ID: 16728242
>>BufferedReader in = new BufferedReader(new InputStreamReader(new FileInputStream("arabic.txt"), "UTF8"));

Have you arranged for the existence of that file? If yes, can you please upload it to this site?
 
LVL 3

Author Comment

by:itsandil
ID: 16728262
Yes, I have arranged for the existence of this file.

You can download the arabic.txt file from this link
http://labs.sandil.com/support/ee/Java_Q_21858234/arabic.txt

And the resulting EBCDIC converted file at
http://labs.sandil.com/support/ee/Java_Q_21858234/arabic-ebcdic.txt

Cheers,
Sandil
 
LVL 86

Expert Comment

by:CEHJ
ID: 16728273
OK. What makes you think that Arabic is encodable as EBCDIC? AFAIK, the latter is just an old species of ASCII ...
 
LVL 3

Author Comment

by:itsandil
ID: 16728293
http://publib.boulder.ibm.com/infocenter/txformp/v6r0m0/index.jsp?topic=/com.ibm.cics.te.doc/erziad0058.htm suggests that the 420 codepage/charset is what is used to encode Arabic characters on the i-Series platform in EBCDIC.

We have applications running in the AS/400 environment that capture, store and manage Arabic information in EBCDIC, so I believe it is encodable. I am looking for a means of conversion, even if it is native to the AS/400 or means running my Java code in the AS/400 environment.
 
LVL 86

Accepted Solution

by:CEHJ (earned 1000 total points)
ID: 16729532
>>If only I can find out why Cp420 is not part of my available charsets and why I'm not able to create a Charset object for that encoding.

You need to have installed Additional Character Set support with the JRE. If it's any consolation, using Cp420 to translate out of the source file you posted produces garbage (mainly question marks)
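
Editor's note: question marks in converted output usually mean the encoder met characters it could not map and substituted its replacement byte silently. One way to narrow this down is to ask a CharsetEncoder directly which characters it can encode. The sketch below is illustrative only: the class name UnmappableCheck and the method firstUnmappable are hypothetical, and US-ASCII stands in for Cp420 so that it runs on any JVM.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

public class UnmappableCheck {

    // Returns the index of the first char the charset cannot encode, or -1 if all are encodable.
    // Works char by char, so it is meant for text in the Basic Multilingual Plane (Arabic is).
    static int firstUnmappable(String text, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder();
        for (int i = 0; i < text.length(); i++) {
            if (!encoder.canEncode(text.charAt(i))) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // US-ASCII is an assumption made for portability; use Charset.forName("Cp420") where supported.
        Charset charset = Charset.forName("US-ASCII");
        String sample = "ab\u0633"; // two Latin letters followed by ARABIC LETTER SEEN
        System.out.println("first unmappable index: " + firstUnmappable(sample, charset));
    }
}
```

If the check reports unmappable characters for text decoded from the source file, it is worth confirming that the file really was read as UTF-8 before suspecting the target code page.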
 
LVL 3

Author Comment

by:itsandil
ID: 16731645
>> You need to have installed Additional Character Set support

Any tips on how to go about this? Or will the JDK 1.5.x release (multi-lingual / international edition) sort it out?

Cheers,
Sandil
 
LVL 86

Expert Comment

by:CEHJ
ID: 16732116
Well, you can try installing multi-lingual support with the installer you already have (if you've kept it), but if you can update, I would.

Question has a verified solution.
