In case links aren't working -- the suggestion is to try character encoding ISO-8859-1.
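To make the ISO-8859-1 suggestion concrete, here is a minimal sketch of how the same bytes decode differently under the two charsets. The class name EncodingCheck and its helper methods are hypothetical and only for illustration. It assumes the client may be sending the character À as the single byte 0xC0 (ISO-8859-1) rather than the two-byte UTF-8 sequence.

```java
import java.nio.charset.StandardCharsets;

public class EncodingCheck {
    // Hypothetical helper: interpret the bytes as ISO-8859-1 (one byte per character).
    static String decodeLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Hypothetical helper: interpret the bytes as UTF-8.
    // Malformed input is replaced with U+FFFD by this String constructor.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0xC0 is the letter A-grave in ISO-8859-1; on its own it is not valid UTF-8.
        byte[] latin1Bytes = { (byte) 0xC0 };

        // "\u00C0" is A-grave, written as an escape so the source file encoding does not matter.
        System.out.println("ISO-8859-1 decode matches: "
                + decodeLatin1(latin1Bytes).equals("\u00C0"));
        System.out.println("UTF-8 decode matches: "
                + decodeUtf8(latin1Bytes).equals("\u00C0"));

        // For comparison, the UTF-8 encoding of A-grave takes two bytes.
        System.out.println("UTF-8 byte count: "
                + "\u00C0".getBytes(StandardCharsets.UTF_8).length);
    }
}
```

If the ISO-8859-1 decode is the one that yields the expected characters, the payload is probably not UTF-8 despite what the XML declaration says.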
I have had this issue lingering for a while. I have an XML document that contains special characters, and I run into a serious problem when I try to parse it. Experts, please advise.
Here is the XML:
<?xml version="1.0" encoding="UTF-8"?>
<user_data>
<time_taken>ÀÀÀÀ</time_tak
</user_data>
Here is my servlet which parses:
protected ModelAndView handleRequestInternal(Http
{
request.setCharacterEncodi
int contentLength = request.getContentLength();
if ( contentLength == -1 ) {
// Content length must be known.
throw new ServletException( "Content-Length must be specified" );
}
String contentType = request.getContentType();
System.out.println("reques
System.out.println("reques
boolean contentTypeIsOkay = false;
// Content-Type must be specified.
if ( contentType != null ) {
// The type must be plain text.
if ( contentType.startsWith( "text/xml" ) ) {
// And it must be UTF-8 encoded (or unspecified, in which case
// we assume
// that it's either UTF-8 or ASCII).
if ( contentType.indexOf( "charset=" ) == -1 ) {
contentTypeIsOkay = true;
} else if ( contentType.indexOf( "charset=utf-8" ) != -1 ) {
contentTypeIsOkay = true;
}
}
}
if ( !contentTypeIsOkay ) {
throw new ServletException(
"Content-Type must be 'text/xml' with 'charset=utf-8' (or unspecified charset)" );
}
InputStream in = request.getInputStream();
// InputStreamReader in = new InputStreamReader(request.
String decoded = null;
String pay = null;
try {
byte[] payload = new byte[contentLength];
int offset = 0;
int len = contentLength;
int byteCount;
while ( offset < contentLength ) {
byteCount = in.read( payload, offset, len );
if ( byteCount == -1 ) {
throw new ServletException( "Client did not send " + contentLength + " bytes as expected" );
}
offset += byteCount;
len -= byteCount;
}
pay = new String( payload, "UTF-8" );
System.out.println("xml is : " +pay );
decoded = URLDecoder.decode(pay, "utf-8");
System.out.println("decode
} finally {
if ( in != null ) {
in.close();
}
}
sun.io.ByteToCharConverter
String convertedStr = decoded;
try {
fromUnicode = sun.io.ByteToCharConverter
fromUnicode.setSubstitutio
char[] convertedChars;
convertedChars = fromUnicode.convertAll(con
convertedStr = new String(convertedChars);
System.out.println("conver
} catch (UnsupportedEncodingExcept
e.printStackTrace();
}
InputStream inputStream = request.getInputStream();
System.out.println("reques
SAXBuilder builder = null;
// Create an instance of the tester and test
builder = new SAXBuilder();
Document doc= builder.build(new java.io.ByteArrayInputStre
//////ERROR : Illegal XML character: .
Element user_data =doc.getRootElement();
This question has been solved and verified by the asker.
by: mwvisa1 Posted on 2008-11-03 at 04:00:26 ID: 22866278
See if these help:
Deals with character encoding to handle accented characters like the ones you are using:
http://www.javazoom.net/services/newsletter/xmlgeneration.html
Dealing with Unicode characters:
http://saloon.javaranch.com/cgi-bin/ubb/ultimatebb.cgi?ubb=get_topic&f=34&t=003647
Escaping other characters reference:
http://www.javapractices.com/topic/TopicAction.do?Id=96
Hopefully that helps. I think the first link will be what you are looking for; the others are for light reading.
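One more thing that may be worth trying, offered as a sketch rather than a confirmed fix: hand the raw request bytes straight to the XML parser and let it apply the encoding named in the XML declaration, instead of converting to a String and URL-decoding first. To stay self-contained, the example uses the JDK's built-in DocumentBuilder rather than the JDOM SAXBuilder from the question. The class name XmlBytesDemo and the helper readTimeTaken are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class XmlBytesDemo {
    // Hypothetical helper: parses raw XML bytes and returns the text of <time_taken>.
    // The parser picks up the charset from the XML declaration in the bytes.
    static String readTimeTaken(byte[] xmlBytes) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new ByteArrayInputStream(xmlBytes));
            return doc.getDocumentElement()
                      .getElementsByTagName("time_taken")
                      .item(0)
                      .getTextContent();
        } catch (Exception e) {
            // Wrapped so callers of this sketch do not need checked-exception handling.
            throw new IllegalStateException("Could not parse XML payload", e);
        }
    }

    public static void main(String[] args) {
        // "\u00C0" is A-grave, escaped so the source file encoding does not matter.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<user_data><time_taken>\u00C0\u00C0\u00C0\u00C0</time_taken></user_data>";
        // Stand-in for the bytes read from the servlet request body.
        byte[] utf8Bytes = xml.getBytes(StandardCharsets.UTF_8);
        System.out.println("Round trip ok: "
                + readTimeTaken(utf8Bytes).equals("\u00C0\u00C0\u00C0\u00C0"));
    }
}
```

In the servlet, the equivalent would be to pass the payload byte array, untouched, to the parser. This only helps if the bytes really are in the encoding the declaration names.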