  • Status: Solved

Java: Unicode Bidirectional Text

Hi Experts,

I need your help. I have a text file (code page Windows-1255) that I need to convert to Unicode (UTF-8) so that it can be processed by our system.

I used the method in the attached code to do the conversion (.NET, C# syntax, using the Microsoft.VisualBasic file helpers):


The conversion causes a problem because a single line can contain one part that is right-to-left and another that is left-to-right (and the Hebrew words end up written in reverse order!).

Anyhow, I know there is a Java class that is supposed to solve this problem, and all I need is an example of how to use this class to convert a text file.

This is the class: java.text.Bidi
http://java.sun.com/j2se/1.4.2/docs/api/java/text/Bidi.html#reorderVisually%28byte[],%20int,%20java.lang.Object[],%20int,%20int%29

I would appreciate any solution to this problem using either .NET or Java.

Thanks

public void Convert1255toUnicode()
{
    // Open the text file and decode it from the Windows-1255 code page.
    // (.NET strings are UTF-16 in memory; the last argument enables byte-order-mark detection.)
    using (System.IO.StreamReader s = new System.IO.StreamReader("c:\\temp\\AsciiInputFile.txt", System.Text.Encoding.GetEncoding("Windows-1255"), true))
    {
        string text = s.ReadToEnd();

        // Write the string out as UTF-8.
        // (Encoding.Unicode would produce UTF-16, not the UTF-8 that is required here.)
        Microsoft.VisualBasic.FileIO.FileSystem.WriteAllText("c:\\temp\\UnicodeOutputFile.txt", text, false, System.Text.Encoding.UTF8);
    }
}

Asked by: Misbah

josephtsang commented:
This is my Java version for your reference.

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintStream;

public void Convert1255toUnicode() throws IOException
{
	// try-with-resources (Java 7+) closes the reader and the writer on exit,
	// even if an exception is thrown part-way through.
	try (
		// Reader that decodes the input from the Windows-1255 code page
		BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream("c:\\temp\\AsciiInputFile.txt"), "windows-1255"));
		// Writer that encodes the output as UTF-8
		PrintStream writer = new PrintStream(new FileOutputStream("c:\\temp\\UnicodeOutputFile.txt"), true, "UTF-8"))
	{
		// Loop over each line decoded from the Windows-1255 file
		for (String line; (line = reader.readLine()) != null; )
		{
			// Write the line to the UTF-8 file (platform line separator)
			writer.println(line);
		}

		// PrintStream swallows I/O errors, so check for them explicitly
		if (writer.checkError())
		{
			throw new IOException("Failed while writing the UTF-8 output file");
		}
	}
}
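
Note that the code above only changes the encoding; it does not reorder any characters. If the Hebrew words still come out reversed, one likely cause (an assumption about your file, I have not seen it) is that the text was stored in visual order, as many legacy Hebrew files were. Below is a rough sketch of how java.text.Bidi could be applied to each line before writing it out. The class name BidiReorderSketch and the method reorderLine are my own names for illustration, not part of the JDK. It splits the line into directional runs, reverses the characters of each right-to-left run, and lets Bidi.reorderVisually put the runs in order. It does not mirror brackets, and reversing a run will misplace combining marks such as Hebrew vowel points, so treat it as a starting point rather than a complete solution.

```java
import java.text.Bidi;

final class BidiReorderSketch {

    // Hypothetical helper: reorders one line using the run levels computed by java.text.Bidi.
    static String reorderLine(String line) {
        if (line.isEmpty()) {
            return line;
        }

        // Base direction is taken from the first strong character, defaulting to left-to-right.
        Bidi bidi = new Bidi(line, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        if (bidi.isLeftToRight()) {
            return line; // purely left-to-right text needs no reordering
        }

        int runCount = bidi.getRunCount();
        byte[] levels = new byte[runCount];
        String[] runs = new String[runCount];

        for (int i = 0; i < runCount; i++) {
            levels[i] = (byte) bidi.getRunLevel(i);
            String run = line.substring(bidi.getRunStart(i), bidi.getRunLimit(i));
            // An odd embedding level marks a right-to-left run: reverse its characters.
            runs[i] = (levels[i] % 2 == 1)
                    ? new StringBuilder(run).reverse().toString()
                    : run;
        }

        // Reorder the runs themselves according to their embedding levels.
        Bidi.reorderVisually(levels, 0, runs, 0, runCount);

        StringBuilder out = new StringBuilder(line.length());
        for (String run : runs) {
            out.append(run);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Mixed line: Latin text, three Hebrew letters (alef, bet, gimel), Latin text.
        String mixed = "abc \u05D0\u05D1\u05D2 def";
        String reordered = reorderLine(mixed);
        // The Hebrew run is reversed in place; the Latin runs are left untouched.
        System.out.println(reordered.equals("abc \u05D2\u05D1\u05D0 def"));
    }
}
```

To use it in the conversion loop, you would call writer.println(BidiReorderSketch.reorderLine(line)) instead of writer.println(line). Check a few lines of real data first: if the file is already in logical order, this step would make things worse, not better.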
