Issue storing Chinese characters in Oracle9i

Hi Friends,

I have a question about storing Chinese characters in an Oracle 9i (Release 2) database. I am using the Unicode data types (NCHAR/NVARCHAR2) rather than a Unicode database.

My current db settings for NLS are:
         PARAMETER      VALUE

My Client setting for NLS_LANG (system environment variables)

My operating system is Win2000 and its default locale is Chinese.

I can type Chinese characters in SQL*Plus. In SQL*Plus, I create a table:
create table test (mynvarchar2  nvarchar2(10));
insert into test values ('[some chinese chars]');
select * from test;

The SELECT shows ??? instead of the Chinese characters...

Any advice on the above ?

Best Regards,

Mike Zhou

schwertner Commented:
I'm inserting <special character> into an NCHAR or NVARCHAR2 column but it comes back as ? or ¿ ...

    All the data "embedded" in SQL is first converted to the database character set
    before being parsed. This means that if you have, for example, a WE8ISO8859P1
    database with a UTF8 *national* character set, you will not be able to insert
    anything other than characters known to WE8ISO8859P1 into NCHAR or NVARCHAR2 fields.
    This does NOT mean that such a database is unable to *store* Unicode in an
    N-type field; this is just a (PL/)SQL parser restriction.

    Selecting existing data from N-type fields will work fine, it's only a
    problem with the insert.

    The workarounds here are to:
    * use sqlldr to load the text from flat files (see point 18)
    * use export / import
    * use a Pro*C application or another language (it is a (PL/)SQL restriction)
    * use NCHAR_CS: "insert into test values (chr(14844588 USING NCHAR_CS));"
    * use (from 9i onwards)
      - the NCHR function: "insert into test values (nchr(50050));"
      - the UNISTR function: "insert into test values (UNISTR('\20AC'));"
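    As a sketch of those last two workarounds, assuming a test table with a single NVARCHAR2 column (the table name, column name, and width are illustrative):

    ```sql
    -- Illustrative table; name and width are assumptions.
    CREATE TABLE nchar_test (val NVARCHAR2(10));

    -- 9i+: NCHR takes a numeric code point in the national character set.
    INSERT INTO nchar_test VALUES (NCHR(50050));

    -- 9i+: UNISTR decodes \xxxx Unicode escapes, so the literal itself
    -- stays pure ASCII and survives conversion to the database character set.
    INSERT INTO nchar_test VALUES (UNISTR('\4E2D\6587'));  -- U+4E2D U+6587, "Chinese"
    ```

    Because both NCHR and UNISTR arguments are plain ASCII, the (PL/)SQL parser restriction described above never comes into play.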

    This limitation is lifted in 10g R2, under the condition that you write your
    NCHAR literals as N'string literal' and set the client-side parameter
    ORA_NCHAR_LITERAL_REPLACE to TRUE (default FALSE).
    The strings will then be preserved in 10g R2; note that you need a
    10g R2 server *and* client.
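    Under those 10g R2 conditions the insert can be written with a plain N literal (a sketch; the table name is illustrative, and the client locale must be able to represent the characters you type):

    ```sql
    -- Client side, before starting sqlplus:
    --   set ORA_NCHAR_LITERAL_REPLACE=TRUE
    -- N'...' literals then bypass conversion to the database character set.
    INSERT INTO nchar_test VALUES (N'中文');
    ```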

    This will NOT make the Windows sqlplusw.exe a Unicode application.
    Example: you try to insert Chinese BIG5 into an NCHAR field of a
    WE8MSWIN1252 NLS_CHARACTERSET / AL16UTF16 NLS_NCHAR_CHARACTERSET database
    using sqlplusw.exe.

    If you are using a West European Windows system (ACP 1252) then you need
    to use an NLS_LANG set to WE8MSWIN1252. Even when using N literals you will
    not be able to store Chinese (that you copy from a Unicode application like
    Microsoft Word, for example) because sqlplus does not know Chinese;
    it's an ANSI application.

    If you change the locale (ACP) of the client to 936 (ZHS16GBK) and use an NLS_LANG
    set to ZHS16GBK, then you will be able to insert Chinese into an NCHAR column because
    the *client* now knows how to handle Chinese through ZHS16GBK...
    ORA_NCHAR_LITERAL_REPLACE = TRUE merely prevents the ZHS16GBK data coming
    from the Chinese Windows client from first being converted to the WE8MSWIN1252
    NLS_CHARACTERSET; nothing more.
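    On a Windows client switched to the Simplified Chinese locale, the matching client settings might look like this (a sketch; set them as environment variables or under the Oracle registry key):

    ```
    set NLS_LANG=SIMPLIFIED CHINESE_CHINA.ZHS16GBK
    set ORA_NCHAR_LITERAL_REPLACE=TRUE
    ```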

    You can also simply change the database character set to Unicode and use non-N types.
    This last option can be quite a good solution, even if you don't want
    to touch older applications.
    Consider this example:
      * You currently have a US7ASCII application and a US7ASCII database,
        and now you are trying to implement broader language support
        for a new, browser-based application.
        So your NLS_NCHAR_CHARACTERSET is AL16UTF16, but you run into this limit
        because your database character set is still US7ASCII.
        Now, the main problem with switching the *database* character set to
        UTF8 is that you get column-width problems with UTF8.
        This can be overcome quite easily by using CHAR semantics,
        but it requires a change to the current US7ASCII app nevertheless.
      Another option is:
      * change the database character set to UTF8
      * use NCHAR for the new languages
      * connect with your old application using an NLS_LANG set to US7ASCII
      This has these advantages:
      * Since your NLS_LANG is US7ASCII, your clients cannot insert anything
        other than US7ASCII, even if the database is UTF8.
        You don't need to worry about the column-size problem
        because UTF8 is a strict superset of US7ASCII
        (all 128 US7ASCII characters have the same code point in UTF8 as in US7ASCII
        and, because of this, also take only one byte per character in UTF8).
        Basically you are using UTF8 in a very restricted "US7ASCII" mode when using
        US7ASCII as NLS_LANG on the client side.
        The only risk here is that there might be a client that connects
        with an NLS_LANG other than US7ASCII,
        but it's up to you to judge how likely that is in your environment.
      * But now you CAN use NCHAR (or even "normal" CHAR columns; there is no real need
        anymore to use the national character set) for the new application,
        since the database character set is now Unicode...
        Here you of course need to use a correct NLS_LANG, and typically
        you will develop a Unicode (UTF8) application directly
        and leave the conversion itself to the browser on the client.
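    The CHAR-semantics point above can be sketched like this, assuming a UTF8/AL32UTF8 database (table and column names are illustrative):

    ```sql
    -- BYTE semantics (the default): 10 bytes, which may hold as few as
    -- 3 Chinese characters in UTF8 (3 bytes each).
    CREATE TABLE t_byte (name VARCHAR2(10 BYTE));

    -- CHAR semantics: 10 characters, regardless of how many bytes
    -- each character takes in the database character set.
    CREATE TABLE t_char (name VARCHAR2(10 CHAR));
    ```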

Try updating the registry on the client machine under software -> oracle -> NLS_LANG and see if that works.
mikezhou (Author) Commented:
I have tried that (I understand that the system environment variable overrides the registry value) and the result is still the same...
The answer to your question is in the Oracle9i Globalization Support Guide. The link is listed below; please take a look.

mikezhou (Author) Commented:
That is the exact document I was referencing before trying the above. I believe I followed all the steps stated in the doc... however, I cannot get the expected result... I suspect something not obvious is missing from my steps...
To store Chinese characters your DB has to be created with the AL32UTF8 character set,
or with one of the specific Chinese character sets; WE8MSWIN1252
doesn't support Chinese (so far!).
mikezhou (Author) Commented:
I want a solution that 'uses Unicode data types instead of a Unicode database', as only a very small fraction of my application will use Chinese, Japanese, etc.
Sorry, this is not possible as far as I am informed!
The character set of a DB is defined at creation and cannot be changed.
It has global influence on the created DB, and there cannot be pieces of the DB
using other character sets. Of course applications can use different character sets, but
they should be subsets of the character set of the DB.

For your case I would suggest creating another DB with Unicode. If you have license issues, use
Oracle XE (free!, but only 4 GB of user data).
mikezhou (Author) Commented:
Oracle provides this via the Unicode data types NCHAR and NVARCHAR2 and the national character set.
A workaround is to do a schema export (only non-system users), create another DB with the AL32UTF8 character set, and import the schemas there. You can do the experiment while the existing DB keeps running and, if you are satisfied, do the same for production.
The only issue is to investigate all non-system schemas and tablespaces (they need pre-creation before import), but sometimes this brings good new knowledge and can also improve the DB. Another trap could be the value of the NLS_LANG registry entry - it dictates the character set of the Export and Import utilities.
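A sketch of that export/import flow with the classic exp/imp utilities (connect strings, usernames, passwords, and file names are placeholders; make sure NLS_LANG is set consistently for both runs):

```
REM Export the application schema from the old database
exp system/manager@olddb OWNER=appuser FILE=appuser.dmp LOG=exp.log

REM Pre-create the tablespaces and the user in the new AL32UTF8
REM database, then import:
imp system/manager@newdb FROMUSER=appuser TOUSER=appuser FILE=appuser.dmp LOG=imp.log
```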
mikezhou (Author) Commented:
Schwertner, sorry for the delayed acceptance. I must have either not noticed your last answer or been busy with other things at the time. Thanks a lot for your detailed explanation and kind help.
Question has a verified solution.