gramesg

How to re-encode UTF-8 to Russian or another encoding

Hi,

My problem is the following.
I receive a UTF-8 string that was typed in Russian in the following JSP:

<%@page contentType="text/html"%>
<%@page pageEncoding="UTF-8" %>
<%@page language="java"%>
...

In the called JSP I tried to split it into bytes with byte[] bytessplitted = abvaluea.getBytes( "UTF-8" )

Called JSP:

<%@page contentType="text/html"%>
<%@page pageEncoding="UTF-8" %>
<%@page language="java"%>
...

I tried Russian:

I entered: фыва
Convert to bytes: the value from HTTP (UTF-8) is split with byte[] bytessplitted = abvaluea.getBytes( "UTF-8" );
ReEncode: <%= "UTF8 to Russian Cp1251:"+new String(bytessplitted, "Cp1251")+"<br>"%><%
%><%= "UTF8 to Russian ISO8859_5:"+new String(bytessplitted, "ISO8859_5")+"<br>"%><%
%><%= "UTF8 to Russian Cp1025:"+new String(bytessplitted, "Cp1025")+"<br>"%><%
%><%= "UTF8 to Russian Cp855:"+new String(bytessplitted, "Cp855")+"<br>"%><%
%><%= "UTF8 to Russian Cp866:"+new String(bytessplitted, "Cp866")+"<br>"%><%
%><%= "UTF8 to Russian KOI8_R:"+new String(bytessplitted, "KOI8_R")+"<br>"%>

result on the new jsp:
UTF8 to Russian Cp1251:&#1043;‘&#1042;„&#1043;‘&#1042;‹&#1043;&#1106;&#1042;&#1030;&#1043;&#1106;&#1042;°
UTF8 to Russian ISO8859_5:&#1059;‘&#1058;„&#1059;‘&#1058;‹&#1059;&#1058;&#1042;&#1059;&#1058;&#1040;
UTF8 to Russian Cp1025:CjBdCjB&#1077;C&#1081;B&#1079;C&#1081;B&#1100;
UTF8 to Russian Cp855:&#9500;&#1033;&#9516;&#1105;&#9500;&#1033;&#9516;&#1030;&#9500;&#1113;&#9516;&#9619;&#9500;&#1113;&#9516;&#9617;
UTF8 to Russian Cp866:&#9500;&#1057;&#9516;&#1044;&#9500;&#1057;&#9516;&#1051;&#9500;&#1056;&#9516;&#9619;&#9500;&#1056;&#9516;&#9617;
UTF8 to Russian KOI8_R:&#1094;&#9618;&#1073;&#9492;&#1094;&#9618;&#1073;&#9600;&#1094;&#9617;&#1073;&#9569;&#1094;&#9617;&#1073;&#9567;

correct display of the characters:
http://www.codeguru.com/forum/showthread.php?t=316035

When I use it with a German string, e.g. "string", it works!
To display it in a JSP I tried the following:
<%= "UTF8 to German:"+new String(bytessplitted , "ISO8859_1")+"<br>"%>

The result is:
string

So this is fine.


Is my handling of Russian wrong?

Please help me!

Gernot
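
The Cp1251 line in the output above is what you would get if the parameter had already been decoded as ISO-8859-1 before `getBytes( "UTF-8" )` was called. The following is only a minimal sketch under that assumption (the class name `ReencodeDemo` and the method `reinterpret` are made up for illustration). It also shows why the German test proves little: "string" is plain ASCII, and ASCII bytes are the same in all the charsets involved.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ReencodeDemo {
    // The approach from the question: encode as UTF-8, then decode
    // the very same bytes with a different charset.
    static String reinterpret(String s, String charsetName) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        String russian = "\u0444\u044B\u0432\u0430"; // the four letters entered in the form

        // Assumption: the container decoded the UTF-8 request bytes as ISO-8859-1,
        // so the JSP received an already damaged string.
        String asSeenByJsp = new String(
                russian.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);

        // Reproduces the garbled Cp1251 line from the question.
        System.out.println(reinterpret(asSeenByJsp, "windows-1251"));

        // ASCII text survives any of these conversions unchanged.
        System.out.println(reinterpret("string", "ISO-8859-1"));

        // UTF-8 bytes only decode correctly as UTF-8.
        System.out.println(reinterpret(russian, "UTF-8").equals(russian));
    }
}
```

If the assumption holds, no choice of target charset in `new String(bytes, ...)` can repair the text, because the damage happened before the bytes were taken.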
siliconeagle

It is not safe to assume that the characters arrive as UTF-8; in fact they likely do not. This code seems to work on Windows in IE and Mozilla. I would have thought you could at least convert the characters to UTF-16, but I suspect there could be bugs in Tomcat (which I am testing on) and/or the browsers.

<%@ page pageEncoding="UTF-8" %>
<%
// Charset used to re-decode the form data, and charset the response is served in.
String charset_in = "KOI8_R";
String charset_out = "KOI8_R";
response.setHeader("Content-Type", "text/html; charset=" + charset_out);
%>
<html>
<head>
<title>Form page</title>
<meta http-equiv="Content-Type" content="text/html;charset=<%= charset_out %>" >
</head>
<body>
<%
String param = request.getParameter("data");
if (param != null) {
    // getBytes() without an argument uses the platform default charset.
    param = new String(param.getBytes(), charset_in);
}
%>
<%= param %>
<hr/>
<form action="./russian_encode.jsp" method="get" >
<input type="text" name="data" value="<%= param %>"/>
<input type="submit" value="GO"/>
</form>
</body>
</html>
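
One caveat about the `param.getBytes()` call in the page above: without an argument it uses the platform default charset, so the result depends on how the server is configured. A more explicit variant might look like the sketch below. It assumes the container decoded the raw request bytes as ISO-8859-1, which I have not verified for every setup; `ParamRecoder` and `redecode` are hypothetical names, not part of any API.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ParamRecoder {
    // Turn the string back into the raw bytes the container saw,
    // then decode those bytes with the charset the browser really used.
    static String redecode(String param, Charset assumedByContainer, Charset actual) {
        if (param == null) {
            return null;
        }
        return new String(param.getBytes(assumedByContainer), actual);
    }

    public static void main(String[] args) {
        Charset koi8r = Charset.forName("KOI8-R");
        String typed = "\u0444\u044B\u0432\u0430"; // Russian test input

        // Simulate a browser submitting KOI8-R bytes and a container
        // decoding them as ISO-8859-1 (assumption).
        String misdecoded = new String(typed.getBytes(koi8r), StandardCharsets.ISO_8859_1);

        String repaired = redecode(misdecoded, StandardCharsets.ISO_8859_1, koi8r);
        System.out.println(repaired.equals(typed));
    }
}
```

ISO-8859-1 maps every byte value to exactly one character, which is why the round trip through it loses nothing. The same helper would work with UTF-8 as the `actual` charset if the form page is served as UTF-8.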
gramesg

ASKER

Hi,

Thank you for your fast response!

I tried it out and it works, but only when I set my browser's encoding to Cyrillic by hand!
Before that it was set to Automatic and Western European, and the result was that I saw only wrong characters.

How can I do this from the source?

My second point: I now want to display static text in a different language (Arabic, for example) in the same JSP page.
Is this possible?

Thanks Gernot
siliconeagle

Yeah, I'm not sure. It showed up automatically on mine, so I'm not sure why you had to change it. I'm not sure whether there are bugs in the browsers, but it didn't seem to work the first time, only after I typed Russian characters in. I'm not sure why that is, but I imagine all Russians will have their keyboards set for Russian, so it should show up straight away.

Since this doesn't use UTF-8 or UTF-16, the only option seems to be to embed your static text of different character sets in iframes, so that you can change the character set for the document in each iframe.
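
The limitation behind the iframe suggestion can be checked directly: a single-byte charset such as KOI8_R has no code points for Arabic, while UTF-8 covers both scripts. A rough sketch of that check follows; `CharsetCoverage` and `canRepresent` are names invented for this example, and the Arabic sample word is just an arbitrary test string.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetCoverage {
    // True if every character of the text can be encoded in the given charset.
    static boolean canRepresent(String text, Charset cs) {
        return cs.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        Charset koi8r = Charset.forName("KOI8-R");
        String russian = "\u0444\u044B\u0432\u0430";      // Cyrillic sample
        String arabic = "\u0645\u0631\u062D\u0628\u0627"; // Arabic sample

        System.out.println("KOI8-R, Russian: " + canRepresent(russian, koi8r));
        System.out.println("KOI8-R, Arabic:  " + canRepresent(arabic, koi8r));
        System.out.println("UTF-8, both:     "
                + canRepresent(russian + arabic, StandardCharsets.UTF_8));
    }
}
```

So as long as the page is served as KOI8_R, separate documents per script are needed; a page served entirely as UTF-8 would be able to hold both texts at once, provided the request decoding issue is handled.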
gramesg

ASKER

Thanks, that was it!

Have a nice day!

Greetings from Austria!
siliconeagle

No probs. Maybe you could assign the points if you are happy.
gramesg

ASKER

How can I give you the points now?

Sorry, I am new!
ASKER CERTIFIED SOLUTION
siliconeagle
