
Write Russian chars in an XML file

PuneetKSaxena
Last Modified: 2012-05-06
Hi,

I want to read and write Russian chars in an XML file from Java code.
When I try encoding="UTF-8", I get ?????????? in the XML file.

How can I read and write Russian chars in Java?
Any help in this regard will be highly appreciated.

Thanks and regards,
Puneet

Mick BarryJava Developer
CERTIFIED EXPERT
Top Expert 2010

Commented:
How are you viewing the XML?
Make sure that the program you are viewing it with, and the font it uses, support Russian.
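
One way to tell a display problem from a data problem is to look at the raw bytes instead of at how a viewer renders them. The following is only a minimal sketch and not part of the poster's code: the class name `ByteCheck` and the helper `hex` are made up for illustration. It shows that Cyrillic text encoded as UTF-8 occupies two bytes per letter, whereas text pushed through an encoding that has no Cyrillic (US-ASCII here) has already been replaced by literal `?` bytes (0x3F), which no font can bring back.

```java
import java.nio.charset.StandardCharsets;

class ByteCheck {
    // Hypothetical helper: render bytes as space-separated upper-case hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two Cyrillic letters (U+0420, U+0443), written as escapes so the
        // encoding of this source file does not matter.
        String ru = "\u0420\u0443";
        System.out.println(hex(ru.getBytes(StandardCharsets.UTF_8)));    // D0 A0 D1 83
        System.out.println(hex(ru.getBytes(StandardCharsets.US_ASCII))); // 3F 3F
    }
}
```

If a hex dump of the generated XML file shows runs of 3F where the Russian text should be, the characters were lost while writing and the viewer is not to blame.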

Mick BarryJava Developer
CERTIFIED EXPERT
Top Expert 2010

Commented:
Can you post your current code?

Save your XML file as UTF-8 and change the first line to:

<?xml version="1.0" encoding="UTF-8" ?>
Is it possible that you were previously using a font that didn't have those characters in it? Some fonts only have certain characters available.

Author

Commented:
Hi,

I'm reading an HTML file that contains Russian chars through a URL connection (code snippet below).
After parsing this HTML, I construct an XML file that also contains Russian chars.

When I view this XML, the Russian chars come out as ?????????, so I can't make any sense of it.

I then have to use this XML file in Adobe Flex, which again gives me ????????

Hope that clarifies it.
try {
    // Build the request URL with the login parameters appended
    urlIviewPath = new URL(strIviewURLPath + "&j_user=" + strUserId + "&j_password=" + strPassword);
    urlIviewPath = Util.encodeUrl(urlIviewPath); // encode the Russian chars in the URL
    urlConnection = urlIviewPath.openConnection();
    urlConnection.connect();
    // Decode the response bytes as UTF-8
    br = new BufferedReader(new InputStreamReader(urlConnection.getInputStream(), "UTF-8"));
} catch (RuntimeException e2) {
    // the exception is silently ignored here
}


Mick BarryJava Developer
CERTIFIED EXPERT
Top Expert 2010

Commented:
Are you sure it's UTF-8? What's the encoding when you load it in a browser?
How do you write them? Make sure you use UTF-8 there as well.

Author

Commented:
While writing the XML I'm using
<?xml version="1.0"  encoding="UTF-8"  ?>.
margajet24IT Business Analyst

Commented:
Try using
<?xml version="1.0"  encoding="UTF-16"  ?>.
CERTIFIED EXPERT
Top Expert 2016

Commented:
You must read the page in the correct encoding and preserve that encoding when you write it to XML, which you should do with an OutputStreamWriter:

http://java.sun.com/javase/6/docs/api/java/io/OutputStreamWriter.html#OutputStreamWriter(java.io.OutputStream,%20java.lang.String)

You need to view the results with a program that supports that encoding and, as far as I know, that doesn't include a Windows console for UTF-8.
final String ENCODING = urlConnection.getContentEncoding();
br = new BufferedReader(new InputStreamReader(urlConnection.getInputStream(), ENCODING));
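
A caveat on the snippet above: `URLConnection.getContentEncoding()` returns the value of the `Content-Encoding` header (for example `gzip`), which is not the character set. The charset normally arrives as a parameter of the `Content-Type` header, available through `getContentType()`. Below is a rough sketch of reading with the declared charset and writing with an `OutputStreamWriter`; the class `CharsetSketch` and both helpers are hypothetical, and falling back to UTF-8 when nothing is declared is an assumption, since a page may also declare its charset in an HTML meta tag.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetSketch {
    // Hypothetical helper: extract the charset parameter from a Content-Type
    // value such as "text/html; charset=windows-1251". Falls back to UTF-8
    // (an assumption) when no usable charset is declared.
    static Charset charsetOf(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                    String name = p.substring(8).trim().replace("\"", "");
                    try {
                        return Charset.forName(name);
                    } catch (IllegalArgumentException e) {
                        break; // unknown or malformed name: use the fallback
                    }
                }
            }
        }
        return StandardCharsets.UTF_8;
    }

    // Decode the input with its own charset and re-encode it as UTF-8.
    static void transcode(InputStream in, Charset from, OutputStream out) {
        try {
            Reader reader = new InputStreamReader(in, from);
            Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                writer.write(buf, 0, n);
            }
            writer.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String ru = "\u0420\u0443\u0441\u0441\u043a\u0438\u0439"; // a Russian word, as escapes
        byte[] utf16 = ru.getBytes(StandardCharsets.UTF_16);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        transcode(new ByteArrayInputStream(utf16), charsetOf("text/html; charset=UTF-16"), out);
        System.out.println(ru.equals(new String(out.toByteArray(), StandardCharsets.UTF_8)));
    }
}
```

With a live connection the call would look like `charsetOf(urlConnection.getContentType())`, with `urlConnection.getInputStream()` passed to `transcode`.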


Author

Commented:
That works for me. But now in the XML, instead of ????????, I'm getting chars like РуÑ?Ñ?кий/По типу

Any suggestions? :)
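
Output made of pairs starting with Ð or Ñ is typically what UTF-8 encoded Cyrillic looks like when the bytes are decoded with a single-byte charset such as ISO-8859-1 or windows-1252 somewhere along the way. The sketch below only illustrates that mechanism and does not claim to pinpoint where it happens in this case; the class and method names are invented for the example.

```java
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Decode UTF-8 bytes with the wrong charset, as a misconfigured reader would.
    static String garble(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Undo exactly one such mistake: recover the original bytes, then decode as UTF-8.
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String ru = "\u0420\u0443\u0441\u0441\u043a\u0438\u0439"; // 7 Cyrillic letters, as escapes
        String bad = garble(ru);
        System.out.println(bad.length());          // 14: each letter became two chars
        System.out.println(repair(bad).equals(ru)); // true
    }
}
```

The repair step is shown only to make the mechanism visible. The real fix is to use the same, correct charset at every point where bytes are turned into characters and back.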
Mick BarryJava Developer
CERTIFIED EXPERT
Top Expert 2010

Commented:
So did you change the write encoding as I suggested earlier? If so, what did you change it to?
And have you checked how the page is encoded?

You haven't answered my earlier question about what you are reading it with. You need to use something that supports displaying Russian and has an appropriate font.

Does it look the same in Flex?


CERTIFIED EXPERT
Top Expert 2016

Commented:
>>But now in xml instead of ???????? i'm getting chars like ÐÂ

Well, unfortunately, I can't read that here ;-). Can you post a screenshot showing your application window?
Commented:
If you want to be sure that your XML is well formed, and that it is in the encoding that you desire, don't hand-roll the XML text, and don't write out the XML declaration (header) by hand.  Construct a DOM tree and write that out using a solid, compliant DOM printer such as is already included in Java.  (Note that the output should be a stream, not a writer -- this lets the Transformer (DOM printer) do the encoding and ensure that it is consistent with the declaration that it writes out.)

In the long run, you'll be happy you used this approach.  There is a whole lot of malformed XML out there -- make sure that you are part of the solution, not part of the problem.

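// dom is the DOM node to write (e.g. an org.w3c.dom.Document); stream is the target java.io.OutputStream
final TransformerFactory fact = TransformerFactory.newInstance();
final Transformer trans = fact.newTransformer();
trans.setOutputProperty("indent", "yes");
// the Transformer writes this encoding into the XML declaration and encodes the bytes to match
trans.setOutputProperty("encoding", "utf-8");
final DOMSource from = new DOMSource(dom);
final StreamResult to = new StreamResult(stream);
trans.transform(from, to);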
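
For completeness, here is a self-contained sketch of the same approach, covering the parts the snippet leaves out: building the DOM, and a read-back step to check that nothing was lost. The class name, method names and the `title` element are invented for the example, and wrapping checked exceptions in `IllegalStateException` is just a shortcut to keep it short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class DomWriteSketch {
    // Build a one-element document and serialize it to UTF-8 bytes.
    static byte[] toUtf8Xml(String text) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document dom = builder.newDocument();
            Element root = dom.createElement("title"); // element name is arbitrary
            root.setTextContent(text);
            dom.appendChild(root);

            Transformer trans = TransformerFactory.newInstance().newTransformer();
            trans.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            // Writing to a stream (not a Writer) lets the Transformer do the encoding.
            ByteArrayOutputStream stream = new ByteArrayOutputStream();
            trans.transform(new DOMSource(dom), new StreamResult(stream));
            return stream.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Parse the bytes back; the parser takes the charset from the XML declaration.
    static String readBack(byte[] xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document dom = builder.parse(new ByteArrayInputStream(xml));
            return dom.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String ru = "\u0420\u0443\u0441\u0441\u043a\u0438\u0439"; // a Russian word, as escapes
        // Prints whether the text survived the write/read round trip.
        System.out.println(ru.equals(readBack(toUtf8Xml(ru))));
    }
}
```

To write to disk instead, a `FileOutputStream` can take the place of the `ByteArrayOutputStream`; the text is still passed around as a `String` and only becomes bytes inside the Transformer.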
