Solved

UTF-8 encoding problem in Java in Transformer class

Posted on 2015-02-18
Last Modified: 2015-03-13
I'm receiving some UTF-8 XML, processing it (removing some nodes), and then writing it out to another XML file.
However, some of the non-ASCII characters (French accented letters) come out garbled.

Is there a way around this?

I don't have to use the Transformer class.

    File origFile = new File(dataFile.getCanonicalFile() + ".orig");
    File origFilex = new File(dataFile.getCanonicalFile() + ".orig1x");

    dataFile.renameTo(origFile);

    DocumentBuilder dBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    Document doc = dBuilder.parse(origFile);

    modifyxml(doc, "Contributor");
    modifyxml(doc, "Author");

    TransformerFactory transformerFactory = TransformerFactory.newInstance();
    Transformer transformer = transformerFactory.newTransformer();
    transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
    DOMSource source = new DOMSource(doc);
    StreamResult result = new StreamResult(origFilex.getPath());
    transformer.transform(source, result);

Question by:sniger
9 Comments
 
Accepted Solution

by: zzynx (earned 500 total points)
ID: 40618394
Does this help?

        FileInputStream in = new FileInputStream(origFile);
        Document doc = dBuilder.parse(in, "UTF-8");

So in fact, replacing
 Document doc = dBuilder.parse(origFile);

by
Document doc = dBuilder.parse(new FileInputStream(origFile), "UTF-8");
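
One caveat worth checking against the Javadoc: the two-argument overload is DocumentBuilder.parse(InputStream is, String systemId), so the second string is treated as a system ID (a base URI for resolving relative references), not as a character encoding. With a byte stream, the parser takes the encoding from the byte order mark or the XML declaration. A minimal sketch of that behaviour, where the class name SystemIdDemo, the helper parseText, and the sample XML are made up for illustration:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class SystemIdDemo {

    // Hypothetical helper: parses the bytes and returns the root element's text.
    // The second argument of parse() is a system ID, not a charset name.
    static String parseText(byte[] xmlBytes, String systemId) throws Exception {
        DocumentBuilder dBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = dBuilder.parse(new ByteArrayInputStream(xmlBytes), systemId);
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Sample document that declares ISO-8859-1 and is really encoded that way.
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                + "<FullName>Faur\u00e9</FullName>";
        byte[] latin1 = xml.getBytes(StandardCharsets.ISO_8859_1);

        // Passing "UTF-8" here does not change how the bytes are decoded:
        // the declared encoding is what the parser uses.
        String text = parseText(latin1, "UTF-8");
        System.out.println(text.equals("Faur\u00e9"));
    }
}
```

If the declared encoding and the real encoding of the file disagree, this overload cannot correct that; the decoding has to be forced some other way.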

Expert Comment

by:CEHJ
ID: 40618579
Use an InputSource instead

Document doc = dBuilder.parse(new InputSource(new InputStreamReader(in, "UTF-8")));

To serialize the result:

http://technojeeves.com/index.php/aliasjava1/96-serialize-xml-to-file-in-java
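
Putting the InputSource read and a UTF-8 write together might look like the sketch below. It assumes the input really is UTF-8 and carries no byte order mark (a BOM in front of a Reader typically makes the parser complain about content in the prolog). The names Utf8RoundTrip, readUtf8 and writeUtf8 are hypothetical, and the node removal step from the question is left out.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class Utf8RoundTrip {

    // Decode the bytes as UTF-8 ourselves, so the parser only ever sees characters.
    static Document readUtf8(File in) throws Exception {
        try (Reader reader = new InputStreamReader(new FileInputStream(in), StandardCharsets.UTF_8)) {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(reader));
        }
    }

    // Encode the output as UTF-8 ourselves and keep the XML declaration in agreement.
    static void writeUtf8(Document doc, File out) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        try (Writer writer = new OutputStreamWriter(new FileOutputStream(out), StandardCharsets.UTF_8)) {
            transformer.transform(new DOMSource(doc), new StreamResult(writer));
        }
    }

    public static void main(String[] args) throws Exception {
        // Usage: java Utf8RoundTrip input.xml output.xml
        Document doc = readUtf8(new File(args[0]));
        writeUtf8(doc, new File(args[1]));
    }
}
```

Handing the transformer a Writer rather than a path string keeps the byte encoding under your control instead of leaving it to how the result's system ID is interpreted.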

Author Comment

by:sniger
ID: 40618858
Unfortunately, it did not.

Expert Comment

by:CEHJ
ID: 40618880
If my code doesn't help (obviously, you'll also need to make sure you write the output as UTF-8, if that's appropriate), then please attach a problematic input file.

Author Comment

by:sniger
ID: 40618888
  <FullName LanguageAndScriptCode="en">Carlos  Fauré</FullName> 

It gets converted to:

</ResourceContributor>
                <ResourceContributor SequenceNumber="2">
                    <PartyName LanguageAndScriptCode="en">
                        <FullName LanguageAndScriptCode="en"> Carlos  Fauré</FullName>
                    </PartyName>
                    <PartyId>8293</PartyId>
                    <ResourceContributorRole Namespace="PA-DP-2007032-I" UserDefinedValue="Composer">UserDefined</ResourceContributorRole>
                </ResourceContributor>

Expert Comment

by:zzynx
ID: 40618929
Maybe you should post your complete code (or a simplified version) so that we can run it.

Expert Comment

by:CEHJ
ID: 40619007
"then please attach an input file"
Quoting from one won't help. Of course, my code will only fix the problem if your input really is encoded as UTF-8. Otherwise, the actual encoding should be specified instead.
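
For what it's worth, the "FaurÃ©" pattern is what you get when the UTF-8 bytes of "é" (0xC3 0xA9) are decoded as ISO-8859-1 somewhere along the way, whether by the program or by whatever is used to view the output. A small sketch that reproduces it; the class name MojibakeDemo and the method misdecode are invented for illustration, and which step in the real pipeline does the wrong decoding is not known from the thread:

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    // Encode as UTF-8, then (wrongly) decode those bytes as ISO-8859-1.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String garbled = misdecode("Faur\u00e9");
        // \u00c3 is "Ã" and \u00a9 is "©": one accented letter became two characters.
        System.out.println(garbled.equals("Faur\u00c3\u00a9"));
    }
}
```

Checking the raw bytes of both the input and the output file (for example with a hex viewer) shows which side holds the wrong encoding.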

Expert Comment

by:zzynx
ID: 40662789
Thanks for accepting.
However, some explanation of why you closed the question the way you did is always welcome.

Expert Comment

by:CEHJ
ID: 40662861
I too would welcome an explanation, especially since I'm almost certain the accepted comment would not have helped ;)