Solved

Use SQL*Loader to load unicode text file

Posted on 2013-10-23
9
1,375 Views
Last Modified: 2013-11-02
I have a large text file that needs to be loaded into Oracle.  I'm planning to use SQL*Loader to load it.  Following is my command.

sqlldr userid=xxx/xxx@xxx control=test.ctl bad=test.bad data=test.data log=test.log

Open in new window


My control file looks like this

OPTIONS (
    ERRORS = 10000,
    PARALLEL = TRUE,
    DIRECT = TRUE,
    readsize=104857600,
    columnarrayrows=100000,
    streamsize=2560000,
    bindsize=2560000)
LOAD DATA APPEND INTO TABLE table
FIELDS TERMINATED BY '^A'
TRAILING NULLCOLS (
    [list_of_columns]
)

Open in new window


The result is there were a small number of rows got rejected.  I looked at the bad file and all of them are unicode text.

I'm just wondering what I should do in the control file so as to be able to load those unicode data as well.

Thanks.
0
Comment
Question by:thomaszhwang
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 4
  • 4
9 Comments
 
LVL 77

Assisted Solution

by:slightwv (䄆 Netminder)
slightwv (䄆 Netminder) earned 400 total points
ID: 39597653
Can you provide some sample data?

Some 'good' and some 'bad'.
0
 

Author Comment

by:thomaszhwang
ID: 39598386
The data is pretty sensitive, so I cannot post them here, but I'm sure my description should be correct.  It's a problem w/ unicode characters.  Thanks.
0
 
LVL 77

Assisted Solution

by:slightwv (䄆 Netminder)
slightwv (䄆 Netminder) earned 400 total points
ID: 39598400
Likely a database characterset conversion issue.

Maybe another Expert has a guess but without data, I don't.

If the database can support Unicode and you are loading this from a remote client, maybe all you need to do is set NLS_LANG to the correct value for the database you are using.

The online docs talk about setting this.
0
Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 

Author Comment

by:thomaszhwang
ID: 39598632
My database actually supports Unicode.  I can actually use another way to load the same data - I can create an external table pointing to the data file and do a INSERT ... SELECT * FROM external_table.  Everything will be loaded fine in this case.  This is why I'm thinking I forget some switch in the control file for SQL*Loader.
0
 
LVL 77

Assisted Solution

by:slightwv (䄆 Netminder)
slightwv (䄆 Netminder) earned 400 total points
ID: 39598653
External tables pretty much use the same sql loader engine.  It should almost be plug and play.

When you posted "userid=xxx/xxx@xxx"  The @xxx shows a remote database connection.

If the file is on the database server and you are using the same Oracle binaries, it shouldn't matter but you never know...

Try setting NLS_LANG.
0
 

Author Comment

by:thomaszhwang
ID: 39598859
Could you please tell me how to set NLS_LANG?  Thanks.
0
 
LVL 1

Assisted Solution

by:hinpong
hinpong earned 100 total points
ID: 39599192
According to SQL Loader manual,  you can specifiy characterset=UTF16 to load unicode
0
 
LVL 77

Accepted Solution

by:
slightwv (䄆 Netminder) earned 400 total points
ID: 39599233
Here is where it is referenced in the sql loader docs:
http://docs.oracle.com/cd/E18283_01/server.112/e16536/ldr_concepts.htm

Stream Record Format
If the character set specified with the NLS_LANG parameter for your session is different from the character set of the data file, then character strings are converted to the character set of the data file. This is done before SQL*Loader checks for the default record terminator.


Setting it:
http://docs.oracle.com/cd/E11882_01/server.112/e10729/ch3globenv.htm#i1006280
0
 

Author Closing Comment

by:thomaszhwang
ID: 39619101
Thanks.
0

Featured Post

Free Tool: Postgres Monitoring System

A PHP and Perl based system to collect and display usage statistics from PostgreSQL databases.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Background In several of the companies I have worked for, I noticed that corporate reporting is off loaded from the production database and done mainly on a clone database which needs to be kept up to date daily by various means, be it a logical…
From implementing a password expiration date, to datatype conversions and file export options, these are some useful settings I've found in Jasper Server.
This video explains at a high level about the four available data types in Oracle and how dates can be manipulated by the user to get data into and out of the database.
This video shows syntax for various backup options while discussing how the different basic backup types work.  It explains how to take full backups, incremental level 0 backups, incremental level 1 backups in both differential and cumulative mode a…

762 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question