gravindrababu

asked on

Unicode to SJIS

How do we get the SJIS equivalent value of a Unicode character in Java?
sgoms

String myString = "<initialize to the string you want>";
byte[] SJISBytes = myString.getBytes("SJIS"); // specify the encoding

// get your SJIS string using:

String SJISString = new String(SJISBytes, "SJIS");

-sgoms

ASKER

sgoms, I have done up to that. But what I actually need is the 'int' value of the SJIS code. I expected that casting each character of the resulting SJISString to int would give me the int equivalent of that character's SJIS code, but it still gives me the Unicode value. Can you please suggest something?
Remember that the length of any conversion is not necessarily the same as the length of the source. For example, when converting the SJIS encoding to Unicode, sometimes one byte will convert into a single Unicode character, and sometimes two bytes will.

-sgoms
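To make the point about lengths concrete, here is a minimal sketch. The class and method names are made up for illustration, and it assumes the JDK ships the Shift_JIS charset (the canonical name of what this thread calls "SJIS"). It compares the number of Java chars in a string with the number of bytes produced when that string is encoded.

```java
import java.nio.charset.Charset;

public class SjisLengthDemo {
    // Hypothetical helper: number of bytes the string occupies in Shift_JIS.
    static int sjisByteLength(String s) {
        return s.getBytes(Charset.forName("Shift_JIS")).length;
    }

    public static void main(String[] args) {
        // 'A' followed by HIRAGANA LETTER A (U+3042), written as an escape
        // so the source file encoding does not matter.
        String text = "A\u3042";
        System.out.println("chars: " + text.length());
        System.out.println("SJIS bytes: " + sjisByteLength(text));
    }
}
```

The ASCII letter takes one byte and the hiragana takes two, so the byte count differs from the char count.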
Can you post your code so that we can take it from there?

-sgoms
sgoms, thanks for the reply. Please go through the code ...

import java.util.StringTokenizer;
import java.io.*;

public class KanjiRangeCheck
{
      public static void kanjiRangeCheck(String str)
      {

            try
            {
                // getBytes() with no argument uses the platform default encoding
                byte[] byteArray = str.getBytes();
                // Decoding the bytes as SJIS produces a Java String, whose chars are Unicode again
                String strSJISString = new String(byteArray,"SJIS");
                for(int i = 0;i < strSJISString.length(); i++)
                {
                    int iChar = (int)strSJISString.charAt(i);
                    System.out.println("Int Equivalent of SJIS Char "+iChar);
                }
            }
            catch(Exception e )
            {
                e.printStackTrace();
            }
      }
    public static void main(String[] args)
    {
        if(args.length == 0)
            {
                  System.err.println("Usage : \n" +
              "java KanjiRangeCheck Kanji  ");
                  return;
          }
        String sKanji = String.valueOf(args[0]);
        KanjiRangeCheck.kanjiRangeCheck(sKanji);
    }

}
What about this?

byte[] byteArray = str.getBytes("SJIS");
for(int i = 0;i < byteArray.length; i++)
{
  System.out.println("Int Equivalent of SJIS Char "+ byteArray[i]);
}
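Building on that suggestion, here is a rough sketch of how the encoded bytes could be combined into one int per character. It is not the accepted solution from this thread (which is not shown here); the class name `SjisCodeDemo` and the helper `sjisCode` are invented for illustration, and it assumes the Shift_JIS charset is available in the JDK. Each byte is masked with 0xFF because Java bytes are signed, so a lead byte such as 0x82 would otherwise print as a negative number.

```java
import java.nio.charset.Charset;

public class SjisCodeDemo {
    private static final Charset SJIS = Charset.forName("Shift_JIS");

    // Hypothetical helper: encodes one char and packs its one or two
    // SJIS bytes into a single int, high byte first.
    static int sjisCode(char c) {
        byte[] bytes = String.valueOf(c).getBytes(SJIS);
        int code = 0;
        for (byte b : bytes) {
            // b & 0xFF turns the signed byte into its unsigned value 0..255
            code = (code << 8) | (b & 0xFF);
        }
        return code;
    }

    public static void main(String[] args) {
        // Default sample: 'A' and HIRAGANA LETTER A (U+3042)
        String text = args.length > 0 ? args[0] : "A\u3042";
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            System.out.printf("U+%04X -> SJIS 0x%X%n", (int) c, sjisCode(c));
        }
    }
}
```

Note that `getBytes` silently substitutes a replacement byte (normally `?`) for characters that have no SJIS mapping, so a result of 0x3F for a non-`?` character means the character could not be encoded.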
               
ASKER CERTIFIED SOLUTION
sgoms

Thanks sgoms, I understand the problem now, and the solution you provided solved it. I had been trying different ways to convert the byte to a hex value, but could not succeed. I am not familiar with bitwise operations and am trying to understand your code. If possible, could you please explain the bitwise operations you are doing?
Let's consider a byte value of 63.
Its binary representation is
0011 1111

If we convert it to hex, it will be

0011 1111
-----  ------
  3       f

So the hex value is 0x3f.

That's the manual conversion. Programmatically, we need to fetch the higher-order 4 bits (i.e. 0011) and get the equivalent hex digit.

So by shifting the byte right by 4 bits, what we are doing is:

0011 1111 >> 4 = 0000 0011

i.e. you move the lower-order 4 bits out of the picture.
>> is a signed shift. So if your byte has a negative value, say -63, it will look like
1100 0001
and when you >> 4 you'll get
1111 1100
You'll notice that the higher-order bits are filled with 1s instead of 0s. That's because >> is a signed shift: it carries the sign along, filling the vacated high-order bits with copies of the sign bit.

OK. Once you've shifted the bits, in order to keep the lower-order 4 bits alone you AND the result with 0000 1111 (0x0f).

So you'll get
0000 0011(&)
0000 1111
------------
0000 0011(3)

If your number is negative, the AND removes the higher-order 1s:
1111 1100(&)
0000 1111
------------
0000 1100(12)

So now, for 63, you've got 3 as the first digit.

In the same way, fetch the lower-order 4 bits of the original byte by simply ANDing it with 0x0f:

0011 1111(&)
0000 1111
------------
0000 1111(15)

This gets you 15, which should be fetched as f from the array.

So now you've got 3 and f.
Concatenate them to get
0x3f.

-sgoms
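For reference, the steps above can be sketched in Java roughly as follows. This is a reconstruction from the explanation, not the accepted solution's actual code (which is not shown in this thread); the class name `ByteToHexDemo`, the method `toHex`, and the `HEX_DIGITS` lookup array are assumed names.

```java
public class ByteToHexDemo {
    // Lookup array: index 0..15 maps to the hex digit character.
    private static final char[] HEX_DIGITS = "0123456789abcdef".toCharArray();

    // Hypothetical helper: converts one byte to a string such as "0x3f".
    static String toHex(byte b) {
        // Signed shift right by 4, then mask with 0x0f to drop any
        // sign-extended 1s and keep only the original high nibble.
        char high = HEX_DIGITS[(b >> 4) & 0x0F];
        // Mask the unshifted byte with 0x0f to keep the low nibble.
        char low = HEX_DIGITS[b & 0x0F];
        return "0x" + high + low;
    }

    public static void main(String[] args) {
        System.out.println(toHex((byte) 63));   // prints 0x3f
        System.out.println(toHex((byte) -63));  // prints 0xc1
    }
}
```

The mask after the shift is what makes negative bytes work: for -63 the shift gives 1111 1100, and the AND with 0000 1111 leaves 1100, which is c.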
Thanks sgoms. Understood the stuff.
great
-sgoms