• Status: Solved

Unicode String with format &#code; and format \ucode

Hello,
I would like to know how Java processes a Unicode string written
- in the format &#code;
- in the format \ucode
Is there any difference?

In test1.jsp I have two Unicode strings:
String str1 = "\u041D\u0435\u043F\u0440\u0430\u0432\u0438\u043B\u044C\u043D\u044B\u0439 \u043A\u043E\u0434 \u0434\u043E\u0441\u0442\u0443\u043F\u0430!";

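String str2 = "&#1053;&#1077;&#1087;&#1088;&#1072;&#1074;&#1080;&#1083;&#1100;&#1085;&#1099;&#1081; &#1082;&#1086;&#1076; &#1076;&#1086;&#1089;&#1090;&#1091;&#1087;&#1072;!";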

(Both should represent the same Russian text, "Неправильный код доступа!".)

In test1.jsp, I forward to test2.jsp:

<jsp:forward page="test2.jsp" >
      <jsp:param name="str1" value="<%= str1%>" />
      <jsp:param name="str2" value="<%= str2%>" />
</jsp:forward>


In test2.jsp, I read the two parameters str1 and str2 and display them.
Only str2 is displayed correctly. str1 is displayed as "???????????? ??? ???????!"

I have to use the \ucode form of the string, but I don't know how to fix this problem.
Thanks a lot
Best regards
ndhai
 
CEHJ Commented:
>>&#1072;

These are decimal entities.
 
ndhai (Author) Commented:
I don't know why the Russian text doesn't display well on this site.
=>>> The value appears as "&#1053;&#1077;&#1087;&#1088;&#1072;&#1074;&#1080;&#1083;&#1100;&#1085;&#1099;&#1081; &#1082;&#1086;&#1076; &#1076;&#1086;&#1089;&#1090;&#1091;&#1087;&#1072;!"
 
CEHJ Commented:
A font that can display the characters must be installed.
 
ndhai (Author) Commented:
@CEHJ: I don't understand what you mean.
Can someone tell me how I can display Russian text correctly on this site?
Thanks
 
CEHJ Commented:
You need to set the response encoding to UTF-8, and the viewing browser must have a font that can display Russian characters.
 
objects Commented:
Try passing the string as a request attribute instead of a request parameter.
 
ndhai (Author) Commented:
@CEHJ: I have already set the response encoding to UTF-8 (if not, how would str2 in &#code; form be displayed correctly?):
  - in test1.jsp : response.setContentType("text/html,utf-8");
  - in test2.jsp : <%@ page contentType = "text/html;charset=utf-8" %>

@objects: Do you mean that
- I set request.setAttribute("sMsg", sMsg); in test1.jsp
- I get (String)request.getAttribute("str1"); in test2.jsp

I don't want to do that, because then I would have to change a lot of Java code (I don't want to touch the code).

Thanks a lot
ndhai
 
objects Commented:
> - I set request.setAttribute("sMsg", sMsg); in test1.jsp
> - I get (String)request.getAttribute("str1"); in test2.jsp

Yes, but it should be:

- I set request.setAttribute("sMsg", sMsg); in test1.jsp
- I get (String)request.getAttribute("sMsg"); in test2.jsp

> I don't want to do that, because then I would have to change a lot of Java code (I don't want to touch the code).

You're going to have to change something :)
It's worth trying, at least to see if it fixes it.
 
ndhai (Author) Commented:
Sorry, I mistyped. I meant str1 in both lines of code!
Yes, it works :)

But are you sure there is no other solution?
 
objects Commented:
> But are you sure there is no other solution?

Another would be to convert the string. Something like:

     <jsp:param name="str2" value="<%= Utils.encode(str2) %>" />
 
CEHJ Commented:
What does

request.getCharacterEncoding()

in test2.jsp print?
 
ndhai (Author) Commented:
@objects: I don't understand what you mean by Utils.encode. Is it a function that converts \ucode to &#code;, or a standard Java function?

@CEHJ: It printed 'null'!
Is something wrong here? I have already set <%@ page contentType = "text/html;charset=utf-8" %> at the top of test2.jsp.
 
objects Commented:
That was just an example, in case you wanted to write your own conversion utility.

Passing it as a request attribute is a better way to pass it regardless, IMO, and the code changes are minimal.
 
ndhai (Author) Commented:
If I add the code request.setCharacterEncoding("utf-8");, it prints "utf-8", but str2 is still displayed as ?????!
 
CEHJ Commented:
In that case, try

request.setCharacterEncoding("UTF-8");

before using the request.
 
ndhai (Author) Commented:
Nothing has changed! It still displays ????
request.getCharacterEncoding() prints 'UTF-8'.

In test2.jsp I already have:
- at the top: <%@ page contentType = "text/html;charset=UTF-8" %>
- as the first line of Java code: request.setCharacterEncoding("UTF-8");
- in the HTML meta tag: <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
 
objects Commented:
That all looks fine. Why don't you want to pass it as an attribute? Passing as a parameter is intended more for values being passed in externally.
 
ndhai (Author) Commented:
@objects: If there is no other solution, then I will use setAttribute and getAttribute.
But I hope there is another one!
 
CEHJ Commented:
Just as a matter of interest, try

String s = request.getParameter("x"); // (Whatever)
out.println(new String(s.getBytes(), "UTF-8"));
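
A re-decoding probe like the one above can only help when the original UTF-8 bytes are still inside the string and were merely decoded with the wrong charset. The following is a minimal sketch of both cases, assuming the wrong charset was ISO-8859-1; the class and method names are illustrative and do not come from the thread.

```java
import java.nio.charset.StandardCharsets;

public class RedecodeProbe {
    // Re-interprets text that was wrongly decoded as ISO-8859-1 as UTF-8.
    // Assumption: ISO-8859-1 was the wrong charset; the thread does not say.
    static String redecode(String s) {
        return new String(s.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u043A\u043E\u0434"; // three Cyrillic letters taken from str1

        // Case 1: UTF-8 bytes were decoded as ISO-8859-1, so the bytes survive.
        String misdecoded = new String(
                original.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
        System.out.println(redecode(misdecoded).equals(original)); // true

        // Case 2: the letters were already replaced by '?', so nothing is left to repair.
        System.out.println(redecode("???").equals(original)); // false
    }
}
```

Note that s.getBytes() without an argument uses the JVM's default charset, so the result of the probe can differ from one machine to another.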
 
ndhai (Author) Commented:
It printed "??????" :(

CEHJ: Do you think there is something wrong in my code? Or is this just how Java processes Unicode strings? Could you test it yourself?
Thanks in advance
ndhai
 
CEHJ Commented:
>>CEHJ: Do you think there is something wrong in my code?

At the moment I'd guess not. Hardcode that String in test2.jsp and tell me what you see.
 
ndhai (Author) Commented:
If I set str2 = "\u041D\u0435\u043F\u0440\u0430\u0432\u0438\u043B\u044C\u043D\u044B\u0439 \u043A\u043E\u0434 \u0434\u043E\u0441\u0442\u0443\u043F\u0430!";
and print it,
>> the output looks like this: http://nguyenduchai.free.fr/temp/msgRusse.JPG
 
objects Commented:
Yes, that's what I'd expect; it's displaying it fine.
It's the passing as a parameter that's corrupting it.
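
One plausible explanation, not confirmed anywhere in the thread: if the forwarded parameter passes through a charset such as ISO-8859-1 that has no Cyrillic letters, each letter is replaced by '?', while the entity form is plain ASCII and passes through unchanged. Here is a minimal sketch of that effect in plain Java, with illustrative class and method names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ParamCharsetLoss {
    // Encodes to bytes and decodes again with the same charset, as a stand-in
    // for a parameter value passing through that charset (an assumption about
    // the container, not something the thread confirms).
    static String roundTrip(String s, Charset cs) {
        return new String(s.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        String escaped = "\u043A\u043E\u0434";      // real Cyrillic characters
        String entities = "&#1082;&#1086;&#1076;";  // plain ASCII text

        System.out.println(roundTrip(escaped, StandardCharsets.ISO_8859_1));  // ???
        System.out.println(roundTrip(entities, StandardCharsets.ISO_8859_1)); // &#1082;&#1086;&#1076;
        System.out.println(roundTrip(escaped, StandardCharsets.UTF_8).equals(escaped)); // true
    }
}
```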

 
Webstorm Commented:
Hi ndhai,

The main difference between the two encodings is:
   \u<hexa>        is understood only by Java, which translates it directly to the corresponding character. That character may not be displayed correctly by your web browser unless the character encoding is changed to a Unicode encoding (UTF-8, UTF-16, ...).

   &#<deci>;
   &#x<hexa>;       are understood only by your HTML browser.
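
The difference can be checked in plain Java. This is a small sketch with illustrative names: the escape becomes a single character at compile time, while the entity stays seven ASCII characters until something such as a browser decodes it. The decoder below is a simplified stand-in for that browser step and assumes well-formed decimal entities with valid code points.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EscapeVsEntity {
    private static final Pattern DECIMAL_ENTITY = Pattern.compile("&#(\\d+);");

    // Replaces decimal entities such as &#1053; with the characters they name.
    // Hypothetical helper for illustration; assumes valid code points.
    static String decodeDecimalEntities(String s) {
        Matcher m = DECIMAL_ENTITY.matcher(s);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int codePoint = Integer.parseInt(m.group(1));
            String ch = new String(Character.toChars(codePoint));
            m.appendReplacement(out, Matcher.quoteReplacement(ch));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String escaped = "\u041D";  // the compiler turns this into one character
        String entity = "&#1053;";  // Java keeps this as seven ASCII characters

        System.out.println(escaped.length()); // 1
        System.out.println(entity.length());  // 7
        System.out.println(decodeDecimalEntities(entity).equals(escaped)); // true
    }
}
```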
 
CEHJ Commented:
*Are* you passing the parameter as entities rather than Unicode-escaped, ndhai?
 
ndhai (Author) Commented:
@CEHJ: I have to use the Unicode string in that form (\ucode, not &#code;). In fact, the value is read from a resource file (a resource bundle)!
 
CEHJ Commented:
Just call this in a loop, then use the result as the parameter:


      // Returns the hexadecimal HTML entity for a single char
      public static String toHexEntity(char c) {
            StringBuilder sb = new StringBuilder(8);
            return sb.append("&#x").append(Integer.toHexString(c)).append(';')
                        .toString();
      }
 
CEHJ Commented:
      // Converts every char in the array to its hexadecimal HTML entity
      public static String toHexEntities(char[] chars) {
            StringBuilder sb = new StringBuilder(chars.length * 8);
            for (int i = 0; i < chars.length; i++) {
                  sb.append(toHexEntity(chars[i]));
            }
            return sb.toString();
      }
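
For reference, here is a hedged usage sketch of the same idea. The escapeNonAscii method is a variation that is not from the thread: it leaves ASCII characters alone so that only the Cyrillic letters become entities. The class name is illustrative.

```java
public class HexEntities {
    // Same idea as the helper above: one char becomes one hexadecimal entity.
    static String toHexEntity(char c) {
        return "&#x" + Integer.toHexString(c) + ";";
    }

    // Variation (an assumption, not from the thread): escape only chars
    // outside ASCII so that ordinary text stays readable.
    static String escapeNonAscii(String s) {
        StringBuilder sb = new StringBuilder(s.length() * 8);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c > 127) {
                sb.append(toHexEntity(c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String msg = "\u043A\u043E\u0434!";
        System.out.println(escapeNonAscii(msg)); // &#x43a;&#x43e;&#x434;!
    }
}
```

The result is pure ASCII, so it should survive a parameter hop regardless of charset. A real page would still need to escape characters such as & and < separately, and characters outside the Basic Multilingual Plane would need code-point handling rather than a char-by-char loop.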
 
ndhai (Author) Commented:
@CEHJ: I have a question, just to clear up a doubt.
When you said
"In that case, try
request.setCharacterEncoding("UTF-8");
before using the request"

did you mean in test1.jsp, in test2.jsp, or in both?

Thanks
 
CEHJ Commented:
It won't do any harm to do it in both, unless you have something exotic coming into test1.
 
ndhai (Author) Commented:
The key to the answer: we have to call request.setCharacterEncoding("UTF-8"); before using the request!
Thanks, CEHJ and everybody
Regards
ndhai
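
The principle behind this fix can be modelled outside a servlet container: the side that writes the parameter and the side that reads it must agree on the charset. The sketch below uses URLEncoder and URLDecoder as a stand-in; how a real container encodes jsp:param values is container-specific and not shown in the thread, and the class and method names are illustrative.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class ParamEncodingAgreement {
    // Encodes a value the way it might travel in a query string, then decodes
    // it with a possibly different charset (a model, not the container's code).
    static String sendAndRead(String value, String writerCharset, String readerCharset) {
        try {
            String wire = URLEncoder.encode(value, writerCharset);
            return URLDecoder.decode(wire, readerCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset name", e);
        }
    }

    public static void main(String[] args) {
        String value = "\u043A\u043E\u0434";

        // Both sides use UTF-8: the value survives.
        System.out.println(sendAndRead(value, "UTF-8", "UTF-8").equals(value)); // true

        // The reader falls back to ISO-8859-1: the value is garbled.
        System.out.println(sendAndRead(value, "UTF-8", "ISO-8859-1").equals(value)); // false
    }
}
```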
 
CEHJ Commented:
:-)