Solved

Use SQL*Loader to load unicode text file

Posted on 2013-10-23
9
1,339 Views
Last Modified: 2013-11-02
I have a large text file that needs to be loaded into Oracle.  I'm planning to use SQL*Loader to load it.  Following is my command.

sqlldr userid=xxx/xxx@xxx control=test.ctl bad=test.bad data=test.data log=test.log

Open in new window


My control file looks like this

OPTIONS (
    ERRORS = 10000,
    PARALLEL = TRUE,
    DIRECT = TRUE,
    readsize=104857600,
    columnarrayrows=100000,
    streamsize=2560000,
    bindsize=2560000)
LOAD DATA APPEND INTO TABLE table
FIELDS TERMINATED BY '^A'
TRAILING NULLCOLS (
    [list_of_columns]
)

Open in new window


The result is there were a small number of rows got rejected.  I looked at the bad file and all of them are unicode text.

I'm just wondering what I should do in the control file so as to be able to load those unicode data as well.

Thanks.
0
Comment
Question by:thomaszhwang
  • 4
  • 4
9 Comments
 
LVL 76

Assisted Solution

by:slightwv (䄆 Netminder)
slightwv (䄆 Netminder) earned 400 total points
ID: 39597653
Can you provide some sample data?

Some 'good' and some 'bad'.
0
 

Author Comment

by:thomaszhwang
ID: 39598386
The data is pretty sensitive, so I cannot post them here, but I'm sure my description should be correct.  It's a problem w/ unicode characters.  Thanks.
0
 
LVL 76

Assisted Solution

by:slightwv (䄆 Netminder)
slightwv (䄆 Netminder) earned 400 total points
ID: 39598400
Likely a database characterset conversion issue.

Maybe another Expert has a guess but without data, I don't.

If the database can support Unicode and you are loading this from a remote client, maybe all you need to do is set NLS_LANG to the correct value for the database you are using.

The online docs talk about setting this.
0
PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

 

Author Comment

by:thomaszhwang
ID: 39598632
My database actually supports Unicode.  I can actually use another way to load the same data - I can create an external table pointing to the data file and do a INSERT ... SELECT * FROM external_table.  Everything will be loaded fine in this case.  This is why I'm thinking I forget some switch in the control file for SQL*Loader.
0
 
LVL 76

Assisted Solution

by:slightwv (䄆 Netminder)
slightwv (䄆 Netminder) earned 400 total points
ID: 39598653
External tables pretty much use the same sql loader engine.  It should almost be plug and play.

When you posted "userid=xxx/xxx@xxx"  The @xxx shows a remote database connection.

If the file is on the database server and you are using the same Oracle binaries, it shouldn't matter but you never know...

Try setting NLS_LANG.
0
 

Author Comment

by:thomaszhwang
ID: 39598859
Could you please tell me how to set NLS_LANG?  Thanks.
0
 
LVL 1

Assisted Solution

by:hinpong
hinpong earned 100 total points
ID: 39599192
According to SQL Loader manual,  you can specifiy characterset=UTF16 to load unicode
0
 
LVL 76

Accepted Solution

by:
slightwv (䄆 Netminder) earned 400 total points
ID: 39599233
Here is where it is referenced in the sql loader docs:
http://docs.oracle.com/cd/E18283_01/server.112/e16536/ldr_concepts.htm

Stream Record Format
If the character set specified with the NLS_LANG parameter for your session is different from the character set of the data file, then character strings are converted to the character set of the data file. This is done before SQL*Loader checks for the default record terminator.


Setting it:
http://docs.oracle.com/cd/E11882_01/server.112/e10729/ch3globenv.htm#i1006280
0
 

Author Closing Comment

by:thomaszhwang
ID: 39619101
Thanks.
0

Featured Post

PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Cursors in Oracle: A cursor is used to process individual rows returned by database system for a query. In oracle every SQL statement executed by the oracle server has a private area. This area contains information about the SQL statement and the…
This post first appeared at Oracleinaction  (http://oracleinaction.com/undo-and-redo-in-oracle/)by Anju Garg (Myself). I  will demonstrate that undo for DML’s is stored both in undo tablespace and online redo logs. Then, we will analyze the reaso…
This video explains at a high level with the mandatory Oracle Memory processes are as well as touching on some of the more common optional ones.
This video explains what a user managed backup is and shows how to take one, providing a couple of simple example scripts.

776 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question