afterburner

asked on

Foreign characters

I can type Russian, Hebrew etc into a JTextField and JTextArea and they display ok, but sending the strings over a socket seems to destroy them and they become question marks. What readers / writers / streams are best to use in this case?
Mick Barry

Use a Writer, and check that the problem isn't with what's reading the data at the other end.
afterburner

ASKER

At the other end is always a buffered reader that reads the string. Or would the problem be in something else?
SOLUTION
Mick Barry

This solution is only available to members.
I tried sending the string as captured from the JTextField (JTextField.getText()), then tried it as a new String built with UTF, UTF-8, UTF-16, UTF-16BE, UTF-16LE, etc., and none of these worked. With 16BE, I think it was, the displayed return string looked a bit more promising, as it was not simply question marks, but I guess this is neither here nor there.

As I am using loopback, yes, the 'other end' is also using a font that supports the charset.
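For anyone reading later, here is a minimal sketch of how the question marks can arise. A Java String is already Unicode, so rebuilding it with new String(...) on the sending side changes nothing useful; the loss happens when characters are turned into bytes with a charset that cannot represent them. The sketch assumes the unconfigured writer ends up using a single-byte charset, with ISO-8859-1 standing in for whatever the platform default is. The class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class QuestionMarkDemo {
    // Encodes the text to bytes and decodes it again with the same charset,
    // which is what a writer/reader pair sharing that charset would do.
    static String roundTrip(String text, Charset charset) {
        byte[] wire = text.getBytes(charset); // unmappable characters become '?'
        return new String(wire, charset);
    }

    public static void main(String[] args) {
        String russian = "\u041F\u0440\u0438\u0432\u0435\u0442"; // "Привет"
        // ISO-8859-1 has no Cyrillic, so every letter is replaced on encoding.
        System.out.println(roundTrip(russian, StandardCharsets.ISO_8859_1));
        // UTF-8 can represent every Unicode character, so the text survives.
        System.out.println(roundTrip(russian, StandardCharsets.UTF_8).equals(russian));
    }
}
```

Once a character has been replaced by '?' on the way out, no setting on the reading side can bring it back.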
ASKER CERTIFIED SOLUTION
This solution is only available to members.
>>and they become question marks

Where?
>> be sure to pass it a Reader object configured  ...

that sounds like a good idea - I will try that.

>> Where?  ...

in the displaying JTextArea which receives the returned string.
>>in the displaying JTextArea which receives the returned string.

That's OK then

>>that sounds like a good idea - I will try that.

Let us know if it doesn't work. Use UTF-8 unless you have a specific reason not to
>> Use UTF-8 unless you have a specific reason not to ...

Do the PrintWriter *and* the reader both have to be configured with the encoding?
If you use a PrintWriter too, the situation is similar:

PrintWriter myPrintWriter = new PrintWriter(new OutputStreamWriter(myOutputStream, "UTF-16"));

would be a good match for the reader above, if the input and output streams are connected as is the case with a socket.
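As a rough end-to-end sketch of the matched pair: the example below writes one line through a PrintWriter and reads it back through a BufferedReader, with both sides given the same charset. It uses UTF-8 rather than the UTF-16 in the line above, and in-memory byte streams stand in for the socket's streams so it can run on its own. The class and method names are illustrative assumptions, not from the thread.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class MatchedEncodingDemo {
    // Sends one line and receives it again, using UTF-8 on both ends.
    static String sendAndReceive(String line) {
        // Stand-in for socket.getOutputStream()
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        PrintWriter writer =
                new PrintWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8), true);
        writer.println(line);
        writer.flush();

        // Stand-in for socket.getInputStream() on the other end
        ByteArrayInputStream in = new ByteArrayInputStream(out.toByteArray());
        try (BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String hebrew = "\u05E9\u05DC\u05D5\u05DD"; // "שלום"
        System.out.println(sendAndReceive(hebrew).equals(hebrew));
    }
}
```

With a real socket, the same two constructors would wrap socket.getOutputStream() and socket.getInputStream(). The point is that the charset argument has to be the same on both ends.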
SOLUTION
This solution is only available to members.
OK, it will take me a little while to change things. Two small points meanwhile: why do you mention UTF-8, and AL-Khwarizmi UTF-16? And secondly, will the encoding still allow the transport of ASCII chars, as is the case with the existing PrintWriter and Reader that I'm using?
>>why do you mention UTF-8

It's the most economical way of transmitting Unicode

>>will the encoding still allow the transport of ASCII chars as is the case with the existing Printwriter and Reader that I'm using?

Yes - 'ascii' is a subset of Unicode, and UTF-8 encodes each 'ascii' character as the same single byte it already has
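A small check of that claim, as a sketch: for text made only of ASCII characters, encoding as US-ASCII and as UTF-8 should give the same bytes, so existing ASCII traffic would be unaffected by switching to UTF-8. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class AsciiSubsetDemo {
    // True when the text encodes to identical bytes in US-ASCII and UTF-8.
    static boolean sameBytesInAsciiAndUtf8(String text) {
        byte[] ascii = text.getBytes(StandardCharsets.US_ASCII);
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(ascii, utf8);
    }

    public static void main(String[] args) {
        System.out.println(sameBytesInAsciiAndUtf8("Hello, world 123"));
        // Cyrillic is outside ASCII, so the two encodings differ here.
        System.out.println(sameBytesInAsciiAndUtf8("\u041F\u0440\u0438\u0432\u0435\u0442"));
    }
}
```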
That's it. Thanks 4 ur help(s).
8-)
I only mentioned UTF-16 as an example, sorry if that confused you. UTF-8, as CEHJ says, is probably a better option in your case. UTF-16 is more suited for heavy use of Asian languages, where it can save space.

Although I don't think you need it, if you want to learn more about Unicode encoding forms you can check http://www-106.ibm.com/developerworks/library/utfencodingforms/index.html?dwzone=unicode
>> UTF-16 is more suited for heavy use of Asian languages ...

in fact, that is exactly where it comes in - but at least I know now, and appreciate your help v. much.
>> UTF-8, as CEHJ says, is probably a better option in your case. UTF-16 is more suited for heavy use of Asian languages, where it can save space.

Yes you're right. That comment of mine could have been quite misleading. I think it's better to say UTF-8 is more economical when there are mixed 'ascii' and higher characters
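To put rough numbers on the size trade-off, here is a sketch that counts the encoded bytes of a few sample strings. UTF-16BE is used for the 16-bit side so that no byte-order mark is included in the count. The class name, method names, and sample strings are illustrative assumptions.

```java
import java.nio.charset.StandardCharsets;

class EncodedSizeDemo {
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    static int utf16Length(String s) {
        // UTF-16BE writes no byte-order mark, so this is the payload only.
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String ascii = "hello";
        String japanese = "\u65E5\u672C\u8A9E"; // "日本語"
        String mixed = ascii + " " + japanese;
        for (String s : new String[] {ascii, japanese, mixed}) {
            System.out.println(utf8Length(s) + " bytes in UTF-8, "
                    + utf16Length(s) + " bytes in UTF-16");
        }
    }
}
```

ASCII characters take one byte in UTF-8 against two in UTF-16, while these Japanese characters take three against two. Pure CJK text is therefore smaller in UTF-16, and text with a good share of ASCII is smaller in UTF-8.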
Tell me if I need to open another question for this, but I have been trying it now with Japanese and Korean, and all I get are little squares in the JTextArea and JTextFields. Would any of you have anything to say on that? It's these languages - plus Chinese  - that I need in particular.
> and all I get are little squares in the JTextArea and JTextFields.

Are you using a font that supports the characters being used?
>> Are you using a font that supports the characters being used?


I am, because I can type Japanese into Word - although I admit I am not sure of the role of IME in this, and fear it may be some kind of a 'closed interface', just for M$ purposes, which Java can't share.

Is the same process involved in transferring the string, and if so, what encoding are you using?
Yes, it is the same process exactly. I tried with UTF-8 first and it didn't work, so I tried UTF-16LE for no good reason :) which of course didn't work either.
I think (but I am not sure) that I might have seen another choice apart from IME when installing the Asian support; if that's the case, should I uninstall IME and try another one?
Ah, sorry, my comment about using a font that supports Japanese may not be exactly right, since I might of course not be using the same one in Word as I am in the Java app's JTextField. Should I check that, or is it a red herring?
You need to do something like:

textField.setFont(new Font("Batang", Font.PLAIN, 12));
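Little squares usually mean the component's font has no glyph for the character, so it may help to test a font before setting it. The sketch below returns the first candidate font that can display the whole string, using Font.canDisplayUpTo, which returns -1 when every character is covered. The class and method names are made up for illustration, and "Batang" is only a candidate if it is installed on the machine.

```java
import java.awt.Font;

class FontPicker {
    // Returns the first candidate that can render every character of the
    // text, or null if none of them can.
    static Font firstFontSupporting(String text, Font[] candidates) {
        for (Font candidate : candidates) {
            if (candidate.canDisplayUpTo(text) == -1) {
                return candidate;
            }
        }
        return null;
    }
}
```

The candidates could be a hand-picked list such as new Font("Batang", Font.PLAIN, 12), or everything from GraphicsEnvironment.getLocalGraphicsEnvironment().getAllFonts(). That method returns fonts at size 1, so a match would need deriveFont(12f) before being passed to textField.setFont.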