Invalid byte 2 of 2-byte UTF-8 sequence

Hi,

I am sorting an XML file using XSLT.

It worked fine when my input XML file was in UTF-8 encoding, as it has Russian characters.

Now I am trying to transform it without UTF-8, as in my original scenario.
I got the sample file from the source system.
There is no encoding specified in the XML prolog.

I'm getting this error:

ERROR:  'Invalid byte 2 of 2-byte UTF-8 sequence.'
ERROR:  'com.sun.org.apache.xml.internal.utils.WrappedRuntimeException: Invalid byte 2 of 2-byte UTF-8 sequence.'

I'm using a simple XSLT transformation to transform the XML.


javax.xml.transform.Source xmlSource =
        new javax.xml.transform.stream.StreamSource(infile);
javax.xml.transform.Source xsltSource =
        new javax.xml.transform.stream.StreamSource(xsltFile);
javax.xml.transform.Result result =
        new javax.xml.transform.stream.StreamResult(outfile);

// create an instance of TransformerFactory
javax.xml.transform.TransformerFactory transFact =
        javax.xml.transform.TransformerFactory.newInstance();

try {
    // compile the stylesheet into a Transformer
    javax.xml.transform.Transformer trans =
            transFact.newTransformer(xsltSource);
    // javax.xml.transform.Transformer trans = transFact.newTransformer();

    // parse the input XML and write the transformed result
    trans.transform(xmlSource, result);
} catch (javax.xml.transform.TransformerException e) {
    e.printStackTrace();
}


PuneetKSaxena Asked:
 
objects Commented:
For the result, use:

new javax.xml.transform.stream.StreamResult(new OutputStreamWriter(new FileOutputStream(outfile), encoding));
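A slightly fuller sketch of the same idea, in case it helps. The class name EncodedOutput and the method copyTo are made up for illustration, and an identity transformer stands in for the real stylesheet. One thing to watch when handing the serializer a Writer: the encoding named in the XML declaration comes from the output properties (or xsl:output), not from the Writer, so this sketch sets OutputKeys.ENCODING to the same value to keep the two in agreement.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.StringReader;
import java.io.Writer;
import java.nio.charset.Charset;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class EncodedOutput {

    /** Copies the given XML text to outfile using an explicit output encoding. */
    static void copyTo(String xml, File outfile, String encoding)
            throws IOException, TransformerException {
        // Identity transformer; the real code would pass the XSLT source here.
        Transformer trans = TransformerFactory.newInstance().newTransformer();
        // Keep the declared encoding in step with the Writer's charset.
        trans.setOutputProperty(OutputKeys.ENCODING, encoding);

        try (Writer out = new OutputStreamWriter(
                new FileOutputStream(outfile), Charset.forName(encoding))) {
            trans.transform(new StreamSource(new StringReader(xml)),
                            new StreamResult(out));
        }
    }

    public static void main(String[] args) throws Exception {
        File outfile = File.createTempFile("sorted", ".xml");
        // "\u041F\u0440\u0438\u0432\u0435\u0442" is a Russian sample word, escaped
        // so the example does not depend on the source file's own encoding.
        copyTo("<a>\u041F\u0440\u0438\u0432\u0435\u0442</a>", outfile, "UTF-8");
        System.out.println("Wrote " + outfile);
    }
}
```

The try-with-resources block also closes the stream, which the one-liner above leaves to the caller.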

objects Commented:
If it's not UTF-8 (and that's the default encoding), then you need to explicitly specify what the encoding of the file is.
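To see why the parser complains, here is a small sketch that decodes bytes strictly as UTF-8. It assumes the source system writes windows-1251, which is only a guess since the thread never names the actual encoding; Utf8Check and isValidUtf8 are illustrative names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class Utf8Check {

    /** Returns true only if the bytes are well-formed UTF-8. */
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A Russian sample word, written with Unicode escapes.
        String word = "\u041F\u0440\u0438\u0432\u0435\u0442";

        // Assumption: the legacy file is windows-1251.
        byte[] legacy = word.getBytes(Charset.forName("windows-1251"));
        byte[] utf8 = word.getBytes(StandardCharsets.UTF_8);

        System.out.println(isValidUtf8(legacy)); // false
        System.out.println(isValidUtf8(utf8));   // true
    }
}
```

In windows-1251 the first letter is the single byte 0xCF, which UTF-8 reads as the start of a two-byte sequence; the next byte is not a valid continuation byte, so decoding fails in much the same way the parser reports.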

CEHJ Commented:
>>There is no encoding specified in XML prolog.

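With no encoding specified anywhere, an XML parser reading the raw bytes falls back to UTF-8, which is why the error talks about UTF-8 sequences. The System property 'file.encoding' only comes into play when the file is read through a Reader created without an explicit charset.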
 
PuneetKSaxena (Author) Commented:
How can I specify the encoding there?
 
PuneetKSaxena (Author) Commented:
Do I need to specify the output encoding in the XSL?
 
CEHJ Commented:
First of all, you say you want to do it *without* UTF-8. How are you going to do that without UTF-8 - what encoding will you use?
 
PuneetKSaxena (Author) Commented:
For Russian/Spanish/German/Polish/Italian/Turkish, what encoding should I use?

Is there any common encoding which I can use for all these languages?
 
CEHJ Commented:
The best encoding to use is ... UTF-8
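A quick way to convince yourself is to ask each charset whether it can encode a sample word from every language on the list. This is only a sketch: the sample words, the class name CommonEncoding and the method canEncodeAll are made up for illustration, and the non-ASCII letters are written as Unicode escapes.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CommonEncoding {

    // One sample word per language mentioned in the question.
    static final String[] SAMPLES = {
        "\u041F\u0440\u0438\u0432\u0435\u0442", // Russian
        "se\u00F1or",                           // Spanish
        "Stra\u00DFe",                          // German
        "\u0142\u00F3d\u017A",                  // Polish
        "citt\u00E0",                           // Italian
        "\u0131\u011Fd\u0131r"                  // Turkish
    };

    /** Returns true if the charset can represent every given text. */
    static boolean canEncodeAll(Charset charset, String... texts) {
        for (String text : texts) {
            // A fresh encoder per text avoids carrying state between checks.
            if (!charset.newEncoder().canEncode(text)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(canEncodeAll(StandardCharsets.UTF_8, SAMPLES));      // true
        System.out.println(canEncodeAll(StandardCharsets.ISO_8859_1, SAMPLES)); // false
    }
}
```

Single-byte code pages each cover only part of that list (ISO-8859-1 has no Cyrillic, for example), which is why one of them cannot serve all six languages.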
 
objects Commented:
UTF-8 is the common one. Do you know what encoding the other files are in?

objects Commented:
To specify the encoding explicitly, use:

javax.xml.transform.Source xmlSource =
new javax.xml.transform.stream.StreamSource(new InputStreamReader(new FileInputStream(infile), encoding));

Do the same for the XSL.
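Here is a self-contained sketch of reading the input through a Reader with an explicit charset. It assumes the file is windows-1251, which is a guess because the thread never names the real encoding, and it uses an identity transformer in place of the sorting stylesheet; EncodedInput and transformWithCharset are illustrative names.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringWriter;
import java.nio.charset.Charset;
import java.nio.file.Files;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class EncodedInput {

    /** Parses infile using the given charset and returns the transformed XML. */
    static String transformWithCharset(File infile, String encoding)
            throws IOException, TransformerException {
        // Identity transformer; the real code would pass the XSLT source here.
        Transformer trans = TransformerFactory.newInstance().newTransformer();
        StringWriter out = new StringWriter();

        // The Reader decodes the bytes, so the parser never has to guess.
        try (Reader in = new InputStreamReader(
                new FileInputStream(infile), Charset.forName(encoding))) {
            trans.transform(new StreamSource(in), new StreamResult(out));
        }
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Build a sample file with Russian text and no encoding in the prolog.
        File infile = File.createTempFile("sample", ".xml");
        String xml = "<a>\u041F\u0440\u0438\u0432\u0435\u0442</a>";
        Files.write(infile.toPath(), xml.getBytes(Charset.forName("windows-1251")));

        System.out.println(transformWithCharset(infile, "windows-1251"));
    }
}
```

Two caveats with this approach: a parser given a Reader ignores any encoding declared in the prolog, so the charset passed in has to be right, and a StreamSource built from a Reader has no system ID, so relative references (for example xsl:include) may need setSystemId to resolve.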


Question has a verified solution.
