How to re-encode UTF-8 to Russian or another encoding

Posted on 2004-10-29
Medium Priority
Last Modified: 2008-01-09

My problem is the following:
I receive a UTF-8 string, which was typed in Russian, from the following JSP:

<%@page contentType="text/html"%>
<%@page pageEncoding="UTF-8" %>
<%@page language="java"%>

In the called JSP I tried to split it into bytes with byte[] bytessplitted = abvaluea.getBytes( "UTF-8" )

Called JSP:

<%@page contentType="text/html"%>
<%@page pageEncoding="UTF-8" %>
<%@page language="java"%>

I tried Russian:

I entered: фыва
Convert to bytes: the value from HTTP (UTF-8) is split with byte[] bytessplitted = abvaluea.getBytes( "UTF-8" );
Re-encode:
<%= "UTF8 to Russian Cp1251:" + new String(bytessplitted, "Cp1251") + "<br>" %>
<%= "UTF8 to Russian ISO8859_5:" + new String(bytessplitted, "ISO8859_5") + "<br>" %>
<%= "UTF8 to Russian Cp1025:" + new String(bytessplitted, "Cp1025") + "<br>" %>
<%= "UTF8 to Russian Cp855:" + new String(bytessplitted, "Cp855") + "<br>" %>
<%= "UTF8 to Russian Cp866:" + new String(bytessplitted, "Cp866") + "<br>" %>
<%= "UTF8 to Russian KOI8_R:" + new String(bytessplitted, "KOI8_R") + "<br>" %>

result on the new jsp:
UTF8 to Russian Cp1251:&#1043;‘&#1042;„&#1043;‘&#1042;‹&#1043;&#1106;&#1042;&#1030;&#1043;&#1106;&#1042;°
UTF8 to Russian ISO8859_5:&#1059;‘&#1058;„&#1059;‘&#1058;‹&#1059;&#1058;&#1042;&#1059;&#1058;&#1040;
UTF8 to Russian Cp1025:CjBdCjB&#1077;C&#1081;B&#1079;C&#1081;B&#1100;
UTF8 to Russian Cp855:&#9500;&#1033;&#9516;&#1105;&#9500;&#1033;&#9516;&#1030;&#9500;&#1113;&#9516;&#9619;&#9500;&#1113;&#9516;&#9617;
UTF8 to Russian Cp866:&#9500;&#1057;&#9516;&#1044;&#9500;&#1057;&#9516;&#1051;&#9500;&#1056;&#9516;&#9619;&#9500;&#1056;&#9516;&#9617;
UTF8 to Russian KOI8_R:&#1094;&#9618;&#1073;&#9492;&#1094;&#9618;&#1073;&#9600;&#1094;&#9617;&#1073;&#9569;&#1094;&#9617;&#1073;&#9567;

correct display of the characters:

When I used it with a German character, e.g. "string", it works!
So to display this in a JSP I tried the following:
<%= "UTF8 to German:"+new String(bytessplitted , "ISO8859_1")+"<br>"%>

result is:

So this is fine

Is my handling of Russian wrong?

Please help me!
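
The garbled results listed in the question look like double encoding. Below is a minimal Java sketch that tries to reproduce them, assuming the container decoded the UTF-8 request bytes as ISO-8859-1 (the default the Servlet specification prescribes when the request names no encoding) before getBytes( "UTF-8" ) was called. The class and method names are hypothetical and only serve the illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DoubleEncodingDemo {
    // Simulates the assumed chain: browser sends UTF-8, container reads it as
    // ISO-8859-1, the JSP encodes that string as UTF-8 again and finally
    // decodes the bytes with a Cyrillic charset.
    static String garble(String typed, String displayCharset) {
        byte[] wire = typed.getBytes(StandardCharsets.UTF_8);               // bytes sent by the browser
        String misdecoded = new String(wire, StandardCharsets.ISO_8859_1);  // assumed container decoding
        byte[] bytessplitted = misdecoded.getBytes(StandardCharsets.UTF_8); // second UTF-8 encoding
        return new String(bytessplitted, Charset.forName(displayCharset));
    }

    public static void main(String[] args) {
        String typed = "\u0444\u044B\u0432\u0430"; // the four letters entered in the question
        System.out.println("KOI8_R: " + garble(typed, "KOI8-R"));
        System.out.println("Cp1251: " + garble(typed, "windows-1251"));
    }
}
```

If the assumption holds, the KOI8_R and Cp1251 lines printed here should match the ones in the question, which would mean the string was already damaged before any of the Cyrillic charsets was applied.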

Question by:gramesg

Expert Comment

ID: 12462056
It is not a safe assumption that the characters are coming to you in UTF-8; in fact they are likely not to be. This code seems to work on Windows in IE and Mozilla. I would have thought you could at least convert the characters to UTF-16, but I suspect there could be bugs in Tomcat (which I am testing on) and/or the browsers.

<%@ page pageEncoding="UTF-8" %>
<%
// Charset used to re-decode the parameter, and charset declared for the response
String charset_in = "KOI8_R";
String charset_out = "KOI8_R";
response.setHeader("Content-Type", "text/html; charset=" + charset_out);
%>
<TITLE>Form page</TITLE>
<meta http-equiv="Content-Type" content="text/html;charset=<%= charset_out %>" >
<%
String param = request.getParameter("data");
// getBytes() uses the platform default charset; the bytes are then re-read as charset_in
if (param != null)
    param = new String(param.getBytes(), charset_in);
%>
<%= param %>
<form action="./russian_encode.jsp" method="get" >
<input type="text" name="data" value="<%= param %>"/>
<input  type="submit"  value="GO"/>
</form>
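
The re-decoding step in the code above depends on the platform default charset, because getBytes() is called without an argument. A small sketch of the same idea with both charsets named explicitly follows. ParamRedecoder and redecode are hypothetical names, and the choice of ISO-8859-1 as the charset the container applied is an assumption that has to be checked against the actual server.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ParamRedecoder {
    // Undoes a wrong decode: turns the string back into the original bytes
    // using the charset the container is assumed to have applied, then
    // decodes those bytes with the charset the browser actually used.
    static String redecode(String param, Charset assumedByContainer, Charset sentByBrowser) {
        if (param == null) {
            return null;
        }
        return new String(param.getBytes(assumedByContainer), sentByBrowser);
    }

    public static void main(String[] args) {
        String typed = "\u0444\u044B\u0432\u0430"; // Russian test input
        // What the JSP would see if UTF-8 bytes were read as ISO-8859-1
        String broken = new String(typed.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
        String fixed = redecode(broken, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println(fixed.equals(typed)); // prints true
    }
}
```

ISO-8859-1 maps every byte value to exactly one character, so this round trip loses nothing. With other intermediate charsets some bytes may not survive.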

Author Comment

ID: 12470667

Thank you for your fast response!

I tried it out now and it works, but only when I set the encoding in my browser to Cyrillic by hand!
Before I did this it was set to Automatic and Western European, and the result was that I only saw wrong characters.

How can I do this from the source?

My second point: I now want to display static text in a different language (Arabic, for example) in the same JSP page.
Is this possible?

Thanks Gernot

Expert Comment

ID: 12471078
Yeah, I'm not sure. It showed up automatically on mine, so I'm not sure why you had to change it. I'm not sure whether there are bugs in the browsers, but it didn't seem to work the first time, only after I typed Russian characters in. I'm not sure why that is, but I imagine that all Russians will have their keyboards set for Russian, so it should show up straight away.

Since this doesn't use UTF-8 or UTF-16, the only option seems to be to embed your static text of different character sets in iframes, so that you can change the character set for the document in each iframe.
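
An alternative to iframes that may be worth testing is to serve the whole page as UTF-8, since a single-script charset such as KOI8_R cannot represent Arabic at all. The sketch below only checks which charset can hold a mixed Russian and Arabic string. MixedScriptDemo and fits are hypothetical names, and whether the browser then displays the page correctly still depends on the charset declared in the response.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MixedScriptDemo {
    // True if every character of text can be represented in the given charset.
    static boolean fits(String text, Charset cs) {
        return cs.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String russian = "\u0444\u044B\u0432\u0430";      // Russian test input
        String arabic = "\u0645\u0631\u062D\u0628\u0627"; // Arabic word "marhaba" (hello)
        String page = russian + " / " + arabic;

        System.out.println(fits(page, Charset.forName("KOI8-R"))); // prints false
        System.out.println(fits(page, StandardCharsets.UTF_8));    // prints true

        // Encoding to UTF-8 and decoding again returns the same text
        byte[] utf8 = page.getBytes(StandardCharsets.UTF_8);
        System.out.println(new String(utf8, StandardCharsets.UTF_8).equals(page)); // prints true
    }
}
```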

Author Comment

ID: 12471438
Thanks, that was it!

Have a nice day!

Greetings from Austria!

Expert Comment

ID: 12481480
No probs - maybe you could assign the points if you are happy.

Author Comment

ID: 12481493
How can I give you the points now?

Sorry, I am new!

Accepted Solution

siliconeagle earned 500 total points
ID: 12481553
Hmm, not sure, as I haven't posted a question. Maybe there should be an "accept answer" button or something? I'm guessing that when you are logged in as the asker, there should be some button to indicate the question is resolved.
