Trouble with unicode and Oracle 8.0

I am trying to get unicode strings into an Oracle 8.0 database but am having difficulties. If I use SQLPlus and copy and paste the insert statement into the database, any special characters are inserted as ? or something like that. I then try to call a sql file that has been encoded to UTF-8 and the outcome is the same. I wrote a small ColdFusion page that parses through a sql statement and executes it against the database, but I haven't found an ODBC driver that I can connect to Oracle 8.0 that also supports unicode.

I have heard a little bit about having to "prefix" unicode strings with a literal N (see http://www.winnetmag.com/Article/ArticleID/14399/14399.html)  but I have no idea what this means.

Does anyone know how to get unicode data into an Oracle 8.0 database? I'm going to be bald if I don't figure this one out soon!

Many thanks.
SK
KingjavaAsked:
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

x
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

slightwv (䄆 Netminder) Commented:
What is the characterset of the database?

select * from nls_database_parameters;

And you are looking for the following entries:
NLS_CHARACTERSET               WE8MSWIN1252
NLS_NCHAR_CHARACTERSET         AL16UTF16

To properly store this information in fields like varchar2, the charactersets have to match.  If these match, you then need to look at the NLS_LANG client variable.  This tells Oracle how to translate the data between the client and server.

In doing some reading on this in the past, it appears that if you use CLOB datatypes, a lot of this is taken care of for you.  The drawback:  ODBC doesn't support CLOBs.

I would look into using either the OO4O (Oracle objects for OLE) or the OLEDB drivers over ODBC.
0
KingjavaAuthor Commented:
Both NLS_CHARACTERSET and NLS_NCHAR_CHARACTER are UTF8.
0
konektorCommented:
open your sql script in notepad.
change characters you want to store to that one, u want - in insert scripts, some unicodes
save the document, but in save dialog, choose field encoding to unicode (default is ansi, there are 3 unicodes, i don't remember which is correct)
run script in sqlplus ...
0
Big Business Goals? Which KPIs Will Help You

The most successful MSPs rely on metrics – known as key performance indicators (KPIs) – for making informed decisions that help their businesses thrive, rather than just survive. This eBook provides an overview of the most important KPIs used by top MSPs.

slightwv (䄆 Netminder) Commented:
What is the nls_lang on your client set to?
0
KingjavaAuthor Commented:
how do I check the nls_lang on the client? FYI. the client is run on a linux box.
0
slightwv (䄆 Netminder) Commented:
If memory serves me right, it's an environment variable on UNIX (double check this with the docs).

As a side note:  Just because sqlplus displays the character as '?' doesn't necessarily mean it is stored incorrectly.  It may mean that sqlplus just doesn’t know how to display it.

To verify this, try:
select rawtohex(<column>) from <table>; -- or maybe use: rawtohex(substr(<column>,<start_pos>,<length>)) to shorten the output
0
swolickiCommented:

To store Unicode UTF-8 codes in Oracle8's (VAR)CHAR2 variables you have two choices, provided NLS_CHARACTERSET in NLS_DATABASE_PARAMETERS is UTF8:

1. write the SQL script in UTF-8.   You can use Notepad on W2K and transfer the script to Linux or use an UTF-8-capable editor on Linux.  Note, in Notepad use "UTF-8" in Encoding field of the Save dialog box.  Do not use "Unicode" as it would create an UTF-16 file.

Run the script in SQL*Plus with NLS_LANG=.UTF8 set in the environment.

e.g. export NLS_LANG=.UTF8  (or setenv in csh)

You will not be able to see the non-ASCII letters correctly on your Linux terminal unless the terminal is configured for UTF-8.  You can use the SELECT DUMP(column) to verify codes in the database.

2. use the CHR function, that is:

   CHR(<code1>) || CHR(<code 2>) || ....

where <code n> is a decimal number calculated from the UTF-8 code bytes of a character by treating the bytes (1, 2 or 3 bytes depending on the character) as a big endian integer:

for 1-byte codes:  code n = byte value (in decimal)
for 2-byte codes:  code n = 256 * 1st byte value + 2nd byte value
for 3-byte codes:  code n = 65536 * 1st byte + 256 * 2nd byte + 3rd byte

NLS_LANG is irrelevant in such case.

You will not be able to see the non-ASCII letters correctly on your Linux terminal unless the terminal is configured for UTF-8.  You can use the SELECT DUMP(column) to verify codes in the database.

In this option you will not be able to see the non-ASCII letters correctly on your Linux terminal either, unless the terminal is configured for UTF-8.  You can use the SELECT DUMP(column) to verify codes in the database.


Oracle ODBC (Windows) supports Unicode starting from version 8.1.5.5.  This is the support for Unicode (UTF-16) Windows API.


Best regards,

Sergiusz
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Oracle Database

From novice to tech pro — start learning today.