Solved

Problems displaying Chinese characters with UTF-8 encoding

Posted on 2006-07-11
Last Modified: 2008-01-09
Hi All,

For the last two days I have been having a nightmare displaying Chinese characters in a web page. I have done everything that seems to be required. The characters are displayed properly, but the generated HTML page is incomplete.

Please note: the HTML generated by the servlet engine is incomplete. It seems to be a problem with the JspServlet, but I have no idea.

The things I did:

Option 1) Use the following directives and meta tags

<%@ page pageEncoding="UTF-8"%>
<%@ page contentType="text/html;charset=UTF-8" pageEncoding="UTF-8"%>
<html>
<head>
      <meta http-equiv="Content-Type" content="text/html;charset=UTF-8" />
</head>
<body>
...etc (get chinese content from database)
</body>
</html>

Option 2) To simplify and avoid adding these lines to every JSP, I use a servlet filter to set the response character encoding:

response.setCharacterEncoding("UTF-8");
response.setContentType("text/html; charset=UTF-8");
response.setHeader("Content-Type","text/html; charset=UTF-8");

Apart from that, javaEncoding is set in Tomcat's web.xml like this:
<init-param>
            <param-name>javaEncoding</param-name>
            <param-value>UTF8</param-value>
        </init-param>

On the back end, an Oracle database is used, with NLS_CHARACTERSET set to AL32UTF8.

I am not talking about URI encoding or anything to do with request parameters. I just want the page to display properly. Any ideas?

regards,
aks


 
0
Comment
Question by:aks143
16 Comments
 
LVL 49

Expert Comment

by:Ryan Chong
Comment Utility
This worked for me to display Chinese characters in UTF-8 in a previous application:

<%@ page language="java" pageEncoding="utf8" contentType="text/html;charset=utf-8" %>
...

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">



It will not work if your data itself is not properly encoded in UTF-8.
0
 

Author Comment

by:aks143
Comment Utility
Hi ryancys,

Thanks for your reply. There is no difference between your solution and the one I posted. I don't see a data problem, because the data is displayed properly. The problem is that the complete page is not rendered. I tried your solution as well, with no luck.
0
 
LVL 49

Expert Comment

by:Ryan Chong
Comment Utility
>>but the html page generated is incomplete.
Try checking:

1. Make sure your scripts didn't generate any errors; check the server's logs if necessary.
2. Make sure your HTML is valid and there are no missing tags in your rendered HTML content.
0
 

Author Comment

by:aks143
Comment Utility
Try checking:

1. Make sure your scripts didn't generate any errors; check the server's logs if necessary.
--> Checked. No errors are reported in the logs or elsewhere.

2. Make sure your HTML is valid and there are no missing tags in your rendered HTML content.
--> The rendered HTML content is the problem: the servlet engine does not generate the complete HTML page, but only when the characters are Chinese. With English characters the page renders fine and complete. Strange!

I believe Chinese characters take more bytes per character, and so the servlet engine has trouble displaying them.
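
The byte-count intuition can be checked directly. The following is a minimal sketch, not code from this thread: the class name Utf8Lengths and its helper methods are hypothetical, and the sample strings are written as Unicode escapes so the source file's own encoding does not matter.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not code from the thread.
final class Utf8Lengths {
    // Number of UTF-16 code units, which is what String.length() reports.
    static int charCount(String s) {
        return s.length();
    }

    // Number of bytes the same text occupies once encoded as UTF-8.
    static int utf8ByteCount(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String english = "hello";
        String chinese = "\u4e2d\u6587"; // two Chinese characters
        // ASCII text: one byte per character.
        System.out.println(charCount(english) + " chars, " + utf8ByteCount(english) + " bytes");
        // These two characters take three bytes each in UTF-8.
        System.out.println(charCount(chinese) + " chars, " + utf8ByteCount(chinese) + " bytes");
    }
}
```

For English text the two counts agree, which would fit a problem that only shows up with Chinese content. Anything that sizes a response by characters instead of bytes would come up short on such a page.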
0
 

Author Comment

by:aks143
Comment Utility
Even for JSP runtime compilation, I have added -Dfile.encoding=UTF-8 to Tomcat's catalina.bat.
0
 
LVL 19

Expert Comment

by:actonwang
Comment Utility
>>The characters are displayed properly but the html page generated is incomplete.

     How do you write those Chinese characters out in the JSP? Show me the code that renders them.
0
 

Author Comment

by:aks143
Comment Utility
For testing purposes I use the following code. Otherwise, I use Struts taglibs in the view layer.

Get the connection, run a query, then:

ResultSet set = statement.getResultSet ();
while (set.next ())
{
      out.println(set.getString(1));
      out.println("-->");
      out.println(set.getString(2)); // here chinese chars.
}
0
 

Author Comment

by:aks143
Comment Utility
actonwang, do you see any problem with the code?

It is driving me crazy. Can someone answer a short question for me:
I see that there are Chinese Simplified and Chinese Traditional character sets available. Are these all covered by UTF-8? How can one know which characters are covered by which encodings?
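
One way to explore the coverage question is to ask a Java charset encoder whether it can represent a given string. This is only a sketch: the class name CharsetCoverage and the sample strings are illustrative choices, and the GBK and Big5 lookups are guarded because those charsets are assumed, not guaranteed, to be present in a given Java runtime.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not code from the thread.
final class CharsetCoverage {
    // True if every character of text can be represented in the given charset.
    static boolean canEncode(Charset charset, String text) {
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String simplified = "\u7b80\u4f53";  // sample Simplified Chinese text
        String traditional = "\u7e41\u9ad4"; // sample Traditional Chinese text

        // UTF-8 encodes all of Unicode, so both samples are representable.
        System.out.println("UTF-8 simplified:  " + canEncode(StandardCharsets.UTF_8, simplified));
        System.out.println("UTF-8 traditional: " + canEncode(StandardCharsets.UTF_8, traditional));

        // ISO-8859-1 covers Western European characters only.
        System.out.println("ISO-8859-1 simplified: " + canEncode(StandardCharsets.ISO_8859_1, simplified));

        // Legacy Chinese charsets are optional in some runtimes, so check first.
        for (String name : new String[] {"GBK", "Big5"}) {
            if (Charset.isSupported(name)) {
                Charset legacy = Charset.forName(name);
                System.out.println(name + " simplified:  " + canEncode(legacy, simplified));
                System.out.println(name + " traditional: " + canEncode(legacy, traditional));
            }
        }
    }
}
```

UTF-8 is an encoding of the whole Unicode repertoire, so both Simplified and Traditional characters fall within it. The legacy charsets each cover a narrower set, which a per-string check like this can reveal.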

Thanks for any help.
aks
0
 
LVL 19

Expert Comment

by:actonwang
Comment Utility
>><%@ page contentType="text/html;charset=UTF-8" pageEncoding="UTF-8"%>
     These are good. I suspect that you didn't write the characters out correctly. You HAVE TO write your Chinese characters out using UTF-8 encoding for them to be displayed properly as UTF-8 on the client side (or in the HTML file).
     That should be your problem.
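
Outside a servlet container, writing with an explicit encoding can be sketched with a plain Writer. This is an illustration only: ExplicitEncodingWriter is a hypothetical name, and the in-memory stream stands in for the response output, which in a real JSP is configured through the page directive or the response object.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not code from the thread.
final class ExplicitEncodingWriter {
    // Encodes text through a Writer bound to an explicit charset
    // instead of relying on the platform default.
    static byte[] write(String text, Charset charset) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(sink, charset)) {
            out.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587"; // two Chinese characters
        byte[] utf8 = write(text, StandardCharsets.UTF_8);
        // Decoding with the charset used for writing restores the text: prints true.
        System.out.println(text.equals(new String(utf8, StandardCharsets.UTF_8)));
        // Decoding the same bytes as ISO-8859-1 yields different characters: prints false.
        System.out.println(text.equals(new String(utf8, StandardCharsets.ISO_8859_1)));
    }
}
```

The point is that the writer and the reader have to agree on the charset. When they do not, the result is garbled characters, which is a different symptom from a truncated page.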

Acton
0
 
LVL 19

Expert Comment

by:actonwang
Comment Utility
>>out.println(set.getString(2));

     Are you using the thin JDBC driver?


>>On the backend, oracle db is used and has NLS_CHARACTERSET set to AL32UTF8
     It looks like a problem if you use the JDBC thin driver:

The JDBC Thin driver can access databases that use any of the following character sets:

    * US7ASCII (ASCII)
    * WE8ISO8859P1 (ISO-latin-1)
    * AL24UTFFSS (Unicode 1.2)
    * UTF8 (Unicode 2.0)

This happens automatically with no special action on your part.

Databases that use other character sets are not supported yet. The JDBC Thin driver can only use the US7ASCII character set for them.

     That means, in your case, you get US7ASCII data through the thin JDBC driver, which is why you have the problem.
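
Whether text has been squeezed through an ASCII-only conversion can be recognised by its symptom. The sketch below does not touch any JDBC driver; AsciiLossCheck is a hypothetical class that only simulates a lossy US-ASCII round trip using the JDK's own charset support.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not code from the thread.
final class AsciiLossCheck {
    // Round-trips text through US-ASCII, the way a lossy conversion layer would.
    // String.getBytes replaces every unmappable character with '?'.
    static String throughAscii(String text) {
        byte[] ascii = text.getBytes(StandardCharsets.US_ASCII);
        return new String(ascii, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(throughAscii("abc"));          // prints abc
        System.out.println(throughAscii("\u4e2d\u6587")); // prints ??
    }
}
```

If a conversion like this were happening, the Chinese text would arrive as question marks. Characters that display correctly, as reported earlier in this thread, suggest the data is not passing through such a conversion.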
0
 
LVL 19

Expert Comment

by:actonwang
Comment Utility
Consider using the OCI driver if possible.

Refer to this:

http://triton.towson.edu/~schmitt/java/jdbc/doc/jdbcoci3.htm
0
 

Author Comment

by:aks143
Comment Utility
Hi acton, thanks for your response. At first I thought you had got it. But then I had a look at some articles, and it seems that AL32UTF8 is just UTF-8 encoding on the database end. See the explanation here:
http://www.cs.utah.edu/classes/cs5530-gary/oracle/doc/B10501_01/server.920/a96529/ch9.htm#16426

I am definitely using the Oracle thin driver. But looking into the details, it seems the OCI drivers are not better performance-wise.
0
 

Author Comment

by:aks143
Comment Utility
Hi all,

Finally I found out why the page was getting truncated. I am also using SiteMesh, and if the page is not explicitly excluded from decoration, SiteMesh sets the Content-Length wrong.
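
How a wrong Content-Length truncates a page can be simulated without a server. This sketch rests on an assumption the thread does not confirm: that the declared length was counted in characters rather than in UTF-8 bytes. The class ContentLengthTruncation and its method are hypothetical and do not use any SiteMesh or servlet API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical simulation for illustration; not SiteMesh code.
final class ContentLengthTruncation {
    // What a client keeps if it stops reading after declaredLength bytes.
    static String receivedBody(String body, int declaredLength) {
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        int kept = Math.min(declaredLength, bytes.length);
        return new String(Arrays.copyOf(bytes, kept), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String page = "<html><body>\u4e2d\u6587</body></html>";
        int charLength = page.length();                                // 28, assumed wrong value
        int byteLength = page.getBytes(StandardCharsets.UTF_8).length; // 32, correct value
        // Declared length too small: the closing tags are cut off after "</h".
        System.out.println(receivedBody(page, charLength));
        // Declared length in bytes: the whole page arrives.
        System.out.println(receivedBody(page, byteLength));
    }
}
```

Under that assumption the characters that do arrive are intact and only the tail of the page is missing. A page containing only English text would be unaffected, because its character and byte counts match.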

The approaches that I and others mentioned are all correct, and I would prefer to get the points refunded.

Thanks,
aks
0
 
LVL 1

Accepted Solution

by:
GhostMod earned 0 total points
Comment Utility
Closed, 350 points refunded.

GhostMod
Community Support Moderator
0
 
LVL 19

Expert Comment

by:actonwang
Comment Utility
Good to know.
0
