Transliteration of non-ASCII7 Characters

Posted on 2005-04-06
Last Modified: 2008-03-10

Flat files are created on a unix server whose charset is "ISO-08859-5".
These files are then transferred by FTP to a unix server whose charset is "ISO-08859-1".

I need to transliterate “пред” to “pred”.

We have created two mapping files:
one holds the Cyrillic characters and the other the corresponding English characters.

The problem I am facing is recognising the characters under the ISO-08859-1 character set.
On the unix server, if I try to read the mapping file, the characters cannot be read correctly.
The same problem occurs with the file to be transliterated.

One solution we tried was to use the binary values of the characters and do the transliteration in Java. This did not work.

I also tried to do the transliteration with a sed script file. This approach failed as well.

The unix server where the flat files need to be transliterated does not support charset = "ISO-08859-5".

Thanks a lot in advance.
Please let me know if anyone has worked on character transliteration.

Question by: nimhan

    Expert Comment

    Did you try using a BufferedInputStream to read the mapping file?

    By the way, please have a look at the following link, though I am not sure how much it will help you solve your problem (the example code is in C).


    Accepted Solution

    >>The problem that I am facing is to recognise the characters in  ISO-08859-1 character set.

    You need to read the file in the correct encoding first:

    Reader in = new InputStreamReader(new FileInputStream("yourfile.txt"), "ISO-8859-5");

    The fact that the file has been FTPed from one server to the other should not be allowed to affect the encoding. Use binary mode for the transfer.
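To illustrate the advice above, here is a minimal sketch that decodes raw bytes as ISO-8859-5 and then transliterates them with a lookup table. The class name `TranslitSketch` and the four-entry mapping are hypothetical, chosen only to cover the "пред" example from the question; a real mapping would be loaded from the two mapping files. The Cyrillic letters are written as Unicode escapes so that the source file compiles the same way regardless of the platform charset.

```java
import java.nio.charset.Charset;
import java.util.HashMap;
import java.util.Map;

public class TranslitSketch {
    // Hypothetical, partial mapping; a real table would cover the whole alphabet.
    private static final Map<Character, String> MAP = new HashMap<>();
    static {
        MAP.put('\u043F', "p"); // Cyrillic pe
        MAP.put('\u0440', "r"); // Cyrillic er
        MAP.put('\u0435', "e"); // Cyrillic ie
        MAP.put('\u0434', "d"); // Cyrillic de
    }

    // Decode raw bytes as ISO-8859-5, independent of the platform default charset.
    public static String decode(byte[] raw) {
        return new String(raw, Charset.forName("ISO-8859-5"));
    }

    // Replace each mapped character; leave unmapped characters unchanged.
    public static String transliterate(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            String latin = MAP.get(c);
            out.append(latin != null ? latin : String.valueOf(c));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The bytes of the example word as they would be stored in an ISO-8859-5 file.
        byte[] raw = {(byte) 0xDF, (byte) 0xE0, (byte) 0xD5, (byte) 0xD4};
        System.out.println(transliterate(decode(raw)));
    }
}
```

The same four bytes decoded as ISO-8859-1 would come out as Latin characters (0xDF is "ß" there), which is consistent with the mapping file looking unreadable on the second server even though the bytes themselves are intact.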

    Expert Comment

    Try -Dsun.jnu.encoding=iso8859-5 to set the default file encoding for the JVM to your choice.

    Expert Comment

    Another option: try iso8859-5 instead of iso08859-5 (Java doesn't recognize the encoding name with the extra zero!).
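A quick way to check which spellings a given JVM accepts is to ask `Charset.isSupported`. The sketch below is only illustrative; the class name `CharsetNameCheck` and the helper `isUsable` are hypothetical, and the result for alias spellings may depend on the JVM in use.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

public class CharsetNameCheck {
    // Returns true if this JVM can resolve the given charset name.
    public static boolean isUsable(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            // The name contains characters that are not legal in a charset name.
            return false;
        }
    }

    public static void main(String[] args) {
        String[] candidates = {"ISO-8859-5", "iso8859-5", "ISO-08859-5"};
        for (String name : candidates) {
            System.out.println(name + " -> " + isUsable(name));
        }
    }
}
```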

    Expert Comment

    One more option: try setting -Dfile.encoding=iso8859-5 when starting the JVM.
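To see whether such a flag actually changed anything, the default charset can be printed at startup. This is a small sketch with a hypothetical class name, `DefaultCharsetCheck`. Passing the charset explicitly to `InputStreamReader`, as in the accepted solution, avoids relying on the default at all.

```java
import java.nio.charset.Charset;

public class DefaultCharsetCheck {
    // Name of the charset the JVM uses when none is given explicitly.
    public static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println("file.encoding property: " + System.getProperty("file.encoding"));
        System.out.println("default charset:        " + defaultCharsetName());
    }
}
```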

