Foreign characters

I can type Russian, Hebrew, etc. into a JTextField and JTextArea and they display OK, but sending the strings over a socket seems to destroy them and they become question marks. Which readers, writers, or streams are best to use in this case?

Mick Barry

Use a Writer, and check that the problem isn't with what's reading the data at the other end.

afterburner

ASKER

At the other end is always a buffered reader that reads the string. Or would the problem be in something else?
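
For illustration, here is one way question marks can appear. It is only a sketch of a likely cause, since the thread never shows the original socket code: when a string is turned into bytes with a charset that has no mapping for its characters, Java substitutes '?' for each one, and no reader at the other end can recover them. The class and method names are made up for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class QuestionMarkDemo {
    // Encode the text to bytes with the given charset, then decode it
    // again with the same charset, as a writer/reader pair would.
    static String roundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // Russian "Privet", written with escapes so the source file
        // encoding does not matter.
        String russian = "\u041f\u0440\u0438\u0432\u0435\u0442";

        // ISO-8859-1 cannot represent Cyrillic, so every character
        // is replaced by '?' at encoding time.
        System.out.println(roundTrip(russian, StandardCharsets.ISO_8859_1));

        // UTF-8 can represent every Unicode character, so the text survives.
        System.out.println(roundTrip(russian, StandardCharsets.UTF_8).equals(russian));
    }
}
```

If the writer side was created without an explicit encoding, it uses the platform default charset, which may behave like the first case.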

SOLUTION

Mick Barry

afterburner

ASKER

I tried sending the string exactly as captured from the JTextField (JTextField.getText()), then tried it as a new String using UTF, UTF-8, UTF-16, UTF-16BE, UTF-16LE, etc., and none of these worked. With UTF-16BE (I think it was), the displayed return string looked a bit more promising, as it was not simply question marks, but I guess this is neither here nor there.

As I am using loopback, yes, the 'other end' is also using a font that supports the charset.

ASKER CERTIFIED SOLUTION

Al-Khwarizmi

CEHJ

>>and they become question marks

Where?

afterburner

ASKER

>> be sure to pass it a Reader object configured ...

that sounds like a good idea - I will try that.

>> Where? ...

in the displaying JTextArea which receives the returned string.

CEHJ

>>in the displaying JTextArea which receives the returned string.

That's OK then

>>that sounds like a good idea - I will try that.

Let us know if it doesn't work. Use UTF-8 unless you have a specific reason not to

afterburner

ASKER

>> Use UTF-8 unless you have a specific reason not to ...

Do the PrintWriter *and* the reader both have to be configured with the encoding?

Al-Khwarizmi

If you use a PrintWriter too, the situation is similar:

PrintWriter myPrintWriter = new PrintWriter ( new OutputStreamWriter ( myOutputStream , "UTF-16" ) );

would be a good match for the reader above, if the input and output streams are connected as is the case with a socket.
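
A minimal sketch of the matched writer/reader idea, using UTF-8 on both sides. To keep it runnable without a network, a byte array stands in for the socket's streams; with a real socket you would pass socket.getOutputStream() and socket.getInputStream() instead. The class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class MatchedEncodingDemo {
    // Writes one line with a UTF-8 PrintWriter and reads it back with a
    // UTF-8 BufferedReader. The byte array plays the role of the wire.
    static String sendAndReceive(String line) {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();

        // Sending side: the encoding is stated explicitly.
        PrintWriter out = new PrintWriter(
                new OutputStreamWriter(wire, StandardCharsets.UTF_8), true);
        out.println(line);
        out.flush();

        // Receiving side: the same encoding is stated explicitly.
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(wire.toByteArray()),
                StandardCharsets.UTF_8))) {
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hebrew "shalom" followed by Japanese "nihongo", as escapes.
        String text = "\u05e9\u05dc\u05d5\u05dd \u65e5\u672c\u8a9e";
        System.out.println(text.equals(sendAndReceive(text)));
    }
}
```

The important part is that both ends name the same encoding rather than relying on the platform default.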

SOLUTION

CEHJ

afterburner

ASKER

OK, it will take me a little while to change things. Two small points meanwhile: why do you mention UTF-8 and Al-Khwarizmi UTF-16? And secondly, will the encoding still allow the transport of ASCII chars, as is the case with the existing PrintWriter and Reader that I'm using?

CEHJ

>>why do you mention UTF-8

It's the most economical way of transmitting Unicode

>>will the encoding still allow the transport of ASCII chars as is the case with the existing Printwriter and Reader that I'm using?

Yes - 'ascii' is a subset of Unicode
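
As a quick check of that point, the sketch below compares the bytes produced for a string under US-ASCII and under UTF-8. For pure ASCII text the two byte sequences are identical, which is why existing ASCII traffic is unaffected by switching to UTF-8. The class and method names are invented for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class AsciiSubsetDemo {
    // True when the string encodes to exactly the same bytes
    // in US-ASCII and in UTF-8.
    static boolean sameBytesInAsciiAndUtf8(String s) {
        byte[] ascii = s.getBytes(StandardCharsets.US_ASCII);
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(ascii, utf8);
    }

    public static void main(String[] args) {
        // Pure ASCII: identical bytes in both encodings.
        System.out.println(sameBytesInAsciiAndUtf8("Hello, world!"));

        // U+00E9 (e with acute accent) is outside ASCII, so the bytes differ.
        System.out.println(sameBytesInAsciiAndUtf8("caf\u00e9"));
    }
}
```

Note that this holds for UTF-8 only; UTF-16 uses two bytes even for ASCII characters.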

afterburner

ASKER

That's it. Thanks for your help.

CEHJ

8-)

Al-Khwarizmi

I only mentioned UTF-16 as an example, sorry if that confused you. UTF-8, as CEHJ says, is probably a better option in your case. UTF-16 is more suited for heavy use of Asian languages, where it can save space.

Although I don't think you need it, if you want to learn more about Unicode formats you can check http://www-106.ibm.com/developerworks/library/utfencodingforms/index.html?dwzone=unicode

afterburner

ASKER

>> UTF-16 is more suited for heavy use of Asian languages ...

in fact, that is exactly where it comes in - but at least I know now, and appreciate your help v. much.

CEHJ

>> UTF-8, as CEHJ says, is probably a better option in your case. UTF-16 is more suited for heavy use of Asian languages, where it can save space.

Yes, you're right. That comment of mine could have been quite misleading. I think it's better to say that UTF-8 is more economical when 'ascii' and higher characters are mixed.
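
To put rough numbers on the size trade-off, here is a small sketch that counts encoded bytes for two sample strings. The sample strings and the class name are made up for illustration. UTF-16BE is used rather than plain UTF-16 so that no byte order mark is added to the count.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingSizeDemo {
    // Number of bytes the text occupies in the given encoding.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // "Hello " followed by Russian "Privet": ASCII mixed with Cyrillic.
        String mixed = "Hello \u041f\u0440\u0438\u0432\u0435\u0442";
        // Eight Japanese characters (kanji, hiragana, katakana).
        String japanese = "\u65e5\u672c\u8a9e\u306e\u30c6\u30ad\u30b9\u30c8";

        for (String s : new String[] {mixed, japanese}) {
            System.out.println(s.length() + " chars: UTF-8 = "
                    + encodedLength(s, StandardCharsets.UTF_8)
                    + " bytes, UTF-16BE = "
                    + encodedLength(s, StandardCharsets.UTF_16BE) + " bytes");
        }
    }
}
```

ASCII characters take one byte in UTF-8 and two in UTF-16, while these Japanese characters take three bytes in UTF-8 and two in UTF-16, so the cheaper encoding depends on the mix of text being sent.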

afterburner

ASKER

Tell me if I need to open another question for this, but I have been trying it now with Japanese and Korean, and all I get are little squares in the JTextArea and JTextFields. Would any of you have anything to say on that? It's these languages - plus Chinese - that I need in particular.

Mick Barry

> and all I get are little squares in the JTextArea and JTextFields.

Are you using a font that supports the characters being used?

afterburner

ASKER

>> Are you using a font that supports the characters being used?

I am, because I can type Japanese into Word - although I admit I am not sure of the role of IME in this, and fear it may be some kind of a 'closed interface', just for M$ purposes, which Java can't share.

Mick Barry

Is the same process involved in transferring the string, and if so, what encoding are you using?

afterburner

ASKER

Yes, it is the same process exactly. I tried with UTF-8 first and it didn't work, so I tried UTF-16LE for no good reason :) which of course didn't work either.

afterburner

ASKER

I think (but I am not sure) that I might have seen another choice apart from IME when installing the Asian support; if that's the case, should I uninstall IME and try another one?

afterburner

ASKER

Ah, sorry, my comment about using a font that supports Japanese may not be exactly right, since I might of course not be using the same one in Word as I am in the Java app's JTextField. Should I check that or is it a red herring?

CEHJ

You need to do something like:

textField.setFont(new Font("Batang", Font.PLAIN, 12));
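
Before setting a font, it may help to check whether it can actually draw the text, since little squares usually mean a missing glyph rather than a transport problem. The sketch below uses Font.canDisplayUpTo for that; the class and method names are hypothetical, and "Batang" is simply the font named above, which may or may not be installed on a given machine.

```java
import java.awt.Font;

public class FontCoverageCheck {
    // Returns -1 if the font has a glyph for every character in the text,
    // otherwise the index of the first character it cannot display.
    static int firstMissingGlyph(Font font, String text) {
        return font.canDisplayUpTo(text);
    }

    public static void main(String[] args) {
        // Japanese sample text, written with escapes.
        String japanese = "\u65e5\u672c\u8a9e";

        // If the named font is not installed, Java silently falls back
        // to a default font, so print what was actually resolved.
        Font font = new Font("Batang", Font.PLAIN, 12);
        System.out.println("Resolved family: " + font.getFamily());
        System.out.println("First missing glyph index: "
                + firstMissingGlyph(font, japanese));
    }
}
```

A result of -1 means the font covers the whole string; any other value points at the first character that would show up as a square.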