Solved

Issue to store Chinese character to Oracle9i

Posted on 2006-07-23
12
3,052 Views
Last Modified: 2011-04-14
Hi Friends,

I have a question regarding to storing chinese characters in oracle 9i (rel 2) db, I use unicode data type to implement it instead of a unicode database.

My current db settings for NLS are:
         PARAMETER      VALUE
1      NLS_CHARACTERSET      WE8MSWIN1252
2      NLS_NCHAR_CHARACTERSET      AL16UTF16

My Client setting for NLS_LANG (system environment variables)
SIMPLIFIED CHINESE_CHINA.ZHS16GBK

My operating system is Win2000 and its default locale is Chinese.

I can see chinese characters in SQLPLUS:
In SQLPLUS, I create a table
create table test (mynvarchar2  nvarchar2(10));
then
insert into test values ('[some chinese chars]');
then
select * from test;

It shows ???...

Any advice on the above ?

Best Regards,

Mike Zhou
 

0
Comment
Question by:mikezhou
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
12 Comments
 
LVL 14

Expert Comment

by:sathyagiri
ID: 17165607
Try updating the registry on the client m/c under software->oracle->NLS_LANG variable and see if that works
0
 

Author Comment

by:mikezhou
ID: 17165691
I have tried that (understand that system envrionment variable overwrite registry value) and the result still the same...
0
 
LVL 11

Expert Comment

by:mohammadzahid
ID: 17166093
Question to your answer is in Oracle9i gobalization Guide. Listed below is the link. Please take a look.  

http://download-east.oracle.com/docs/cd/B10500_01/server.920/a96529/toc.htm

0
PeopleSoft Has Never Been Easier

PeopleSoft Adoption Made Smooth & Simple!

On-The-Job Training Is made Intuitive & Easy With WalkMe's On-Screen Guidance Tool.  Claim Your Free WalkMe Account Now

 

Author Comment

by:mikezhou
ID: 17166221
The doc is the exact document I am referencing before trying the above. I feel I follow all the steps stated in the doc...however I can not get the result as expected...I suspect something not obvious missing in my steps...
0
 
LVL 48

Expert Comment

by:schwertner
ID: 17166810
To store Chinese letters your DB has to be created with AL32UTF8 character set.
Or with some of the specific chinese character sets, but WE8MSWIN1252
doesn't support chinese (so far!)
0
 

Author Comment

by:mikezhou
ID: 17181933
I want a solution 'use unicode data type to implement it instead of a unicode database.' as just a very small fraction of my application will use Chinese, Japanese etc.
0
 
LVL 48

Expert Comment

by:schwertner
ID: 17182001
Sorry, this is not possible as far as I am informed!
The character set of a DB is defined at the beginning and cannot be changed.
It has global influence on the created DB and there couldn't be pieces of the DB
using another character sets. Of course applications can use different character sets, but
they should be subsets of the character set of the DB.

For your case I will suggest to create another DB with Unicode. If you have license isuues use
Oracle XE (free!, but only 4 GB user data).
0
 

Author Comment

by:mikezhou
ID: 17182017
Oracle provide this by the solution of unicode data types NCHAR, NVARCHAR2 and national character set.
0
 
LVL 48

Expert Comment

by:schwertner
ID: 17182031
A workaround is to do schema export (only non-systen users), to create another DB with AL32UTF8 character set and to import the schemas there. You can do the experiment running the existing DB and if you are satisfied to do this for production.
The only issue is to investigate all non-system schemas and tablespaces (they need precreation before Import), but sometimes this is a good new knowledge and also can improve the DB. Also a trap could be the value of NLS_LANG registry entry - it dictates the character set of Export and Import utilities.
0
 
LVL 48

Accepted Solution

by:
schwertner earned 250 total points
ID: 17182081
I'm inserting <special character> in a Nchar or Nvarchar2 col but it comes back as ? or ¿ ...

    All the data "embedded" in sql is first converted to the database characterset
    before being parsed, this means that if you have a (for example) a WE8IS8859P1
    database with an UTF8 *national* characterset, you will not be able to insert
    anything else than characters known by WE8ISO8859P1 into Nchar or Nvarchar2 fields.
     
    This does NOT mean that such a database is unable to *store* unicode in an
    N-type field, this is just a (pls)SQL parser restriction.

    Selecting existing data from N-type fields will work fine, it's only a
    problem with the insert.

    The workaround is here to:
    * use sqlldr to load the text from flat files (see point 18)
    * use export / import
    * use a Pro*C application or other language (it a (pls)SQL restriction)
    * use NCHAR_CS "insert into test values (chr(14844588 USING NCHAR_CS));"
    * use (from 9i onwards)
      - the Nchar: "insert into test values (nchr(50050));"
      - the UNISTR function: "insert into test values (UNISTR('\20AC'));"

    This limitation is lifted in 10g R2, under the condition that you prefix your
    NCHAR literals with N 'string literal' and you set the client side parameter
    ORA_NCHAR_LITERAL_REPLACE to TRUE (default=false)
    Then the strings will be preserved in 10g R2, note that you need a
    10g R2 server *and* client.

    This will NOT make a windows sqlplusw.exe a unicode application,
    example: you trying to insert chinese BIG5 in a WE8MSWIN1252 NLS_CHARACTERSET
    / AL16UTF16 NLS_NCHAR_CHARACTERSET database in NCHAR field using
    sqlplusw.exe .

    If you are using a West european windows system (ACP 1252) then you need
    to use a NLS_LANG set to WE8MSWIN1252, even when using N literals you will
    not be able to store chinese (that you copy from a unicode application like
    Microsoft Word for example) because sqlplus does not know chinese,
    it's a ANSI application.

    If you change the LOCALE (ACP) from the client to 963 (ZHS16GBK) and use a NLS_LANG
    set to ZHS16GBK then you will be able to insert chinese in a NCHAR because
    the *client* now knows how to handle chinese trough ZHS16GBK...
    The ORA_NCHAR_LITERAL_REPLACE = True will avoid the ZHS16GBK data coming
    from the Chinese windows client to be first converted to the WE8MSWIN1252
    NLS_CHARACTERSET, nothing more.

    You can also simply change the database characterset to unicode and use non-N types.
   
    This last option might be quite a good solution, even if you don't want
    to touch older applications.
    consider this example:
   
      * you have currently an US7ASCII application and a US7ASCII database
        and now you are trying to implement for a new, browsser based, application
        a broader language support.
        so, your NLS_NCHAR_CHARACTERSET is AL16UTF16, but you run into this limit
        because your database characterset is still US7ASCII.
   
        Now, the main problem by switching the *database* characterset to
        UTF8 is that you have the column width problems with UTF8.
           
        This can be overcome quite easely by using CHAR semantics
                 But it requires a change to the current US7ASCII app never the less.
   
      An other option is:
      * change the datbase characterset to UTF8
      * use the NCHAR for the new languages
      * connect with your old application using a NLS_LANG set to US7ASCII
   
      This has this these advantages:
   
      * seen you NLS_LANG is US7ASCII you clients cannot insert anything
        else then US7ASCII, even if the database is UTF8.
 
       
        You don't need to worry about the column size problem
        because UTF8 is a strick superset of US7ASCII.
        (all 128 US7ASCII characters have the same codepoint in UTF8 as in US7ASCII
        and because of this, also take only one byte for one character in UTF8)
        Basicly you are using UTF8 in a very restricted "US7ASCII" mode when using
        US7ASCII as NLS_LANG on the client side.
     
        The only risk here is that there might be a client that connects
        with a NLS_LANG other then US7ASCII,
        but that's up to you to judge how likly this can be in your environment.
   
      * But now you DO can use the Nchar (or even "normal" CHAR columns, no real need
        anymore to use the National characterset) for the new application,
        seen the database characterset is now Unicode...
        Here you need to use of course a correct NLS_LANG, and typically
        you will be developing directly a Unicode (UTF8) application
        and leave the conversion itself to the browser on the client.
 

0
 

Author Comment

by:mikezhou
ID: 19961474
Schwertner, sorry for the delayed acceptance. I must be either not notice your last answer or busy at other things then. Thanks a lot for your detailed explaination and kind help.
0

Featured Post

Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

From implementing a password expiration date, to datatype conversions and file export options, these are some useful settings I've found in Jasper Server.
When it comes to protecting Oracle Database servers and systems, there are a ton of myths out there. Here are the most common.
This video shows syntax for various backup options while discussing how the different basic backup types work.  It explains how to take full backups, incremental level 0 backups, incremental level 1 backups in both differential and cumulative mode a…
This video shows how to recover a database from a user managed backup
Suggested Courses

632 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question