Oracle 10g Turn off Table Logging During ETL

Dear Experts,

I have an Oracle 10g database (we'll call it PITB) that serves as a reporting database. A change data capture solution keeps the data in PITB up to date. Data is passed to PITB from an Informix database running on AIX RS6000 64-bit.

On occasion, we need to run an ETL from the Informix database to PITB, which takes 22 hours. This ETL time will only increase as the data continues to grow.

I've been trying to think of a way to speed up the ETL and have decided that it may help to turn off table logging for the specific tables being loaded. The thought is that if no redo log switching and no archiving occur during the ETL, it will finish sooner. Table logging would then be turned back on immediately after the ETL completes.

The commands I was thinking to do are:

Turn off logging: alter table xxxxx nologging;
(run the ETL)
Turn on logging: alter table xxxxx logging;
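For 231 tables, toggling logging by hand would be tedious; one common approach is to generate the statements from the data dictionary. A minimal sketch, assuming the tables live in a schema called ETL_OWNER (a placeholder, not from the original question):

```sql
-- Generate NOLOGGING statements for every table in the ETL schema
-- (ETL_OWNER is a hypothetical schema name; substitute your own)
SELECT 'ALTER TABLE ' || owner || '.' || table_name || ' NOLOGGING;'
  FROM all_tables
 WHERE owner = 'ETL_OWNER';

-- Spool and run the output, perform the ETL, then generate the reverse:
SELECT 'ALTER TABLE ' || owner || '.' || table_name || ' LOGGING;'
  FROM all_tables
 WHERE owner = 'ETL_OWNER';
```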

The PITB database also captures data changes and streams them to another Oracle database. Additionally, PITB uses Data Guard to populate a DR copy of itself.

I realize that by turning off table logging, I would need to re-instantiate the downstream database, as well as the DR database, after the ETL completes and table logging is turned back on. Although not appealing, this is acceptable if I can make significant time gains in the ETL process.

I'm concerned that I may not be considering all the pros and cons of turning off table logging for the ETL'd tables.

Can anyone shoot some holes in this idea and save me from a foolish act?
Can anyone offer alternatives to speeding up the ETL process?

The host server for the PITB database is an AIX RS6000 64-bit.
The source database/server is an Informix db on an AIX RS6000 64-bit.

The ETL process is being handled by iWay Data Migrator via a Windows 2003 32-bit server.

Julie Kurpa, Sr. Systems Programmer, asked:

Here are the questions I would be asking.

1. Which database is running the ETL, Informix or Oracle? There might be implicit data conversion going on.

2. What is the ETL doing for 22 hours? Figure out the time spent in each area (network, disk I/O, CPU). If disk I/O turns out to be the biggest culprit, then turn off logging for the identified tables. Make sure to take a backup after the ETL is finished.

3. If the ETL runs during low system usage, check whether DML parallelism is being utilized for faster processing.

4. See if you can drop the indexes on the Oracle side and rebuild them when done (a better option than turning off logging).

For comparison, my own ETL processes about 100 GB of data in 2 hours across a database link (Oracle to Oracle):

1. Drop all indexes before starting.
2. Run the ETL (inserts, updates, deletes, transformations, etc.).
3. Validate the data (build constraints).
4. Build the indexes.
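That workflow could be sketched in SQL roughly as follows. The table, index, and database-link names here are hypothetical placeholders, not taken from the thread:

```sql
-- 1. Drop the indexes before the load (names are placeholders)
DROP INDEX sales_fact_ix1;

-- 2. Run the ETL, e.g. a set-based copy across a database link
INSERT INTO sales_fact
SELECT * FROM sales_fact@source_link;
COMMIT;

-- 3. Validate the data by (re)building constraints
ALTER TABLE sales_fact ADD CONSTRAINT sales_fact_pk PRIMARY KEY (sale_id);

-- 4. Rebuild the remaining indexes
CREATE INDEX sales_fact_ix1 ON sales_fact (sale_date);
```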
Julie Kurpa, Sr. Systems Programmer (Author), commented:
Thanks Bajawa.

The ETL is being executed on a Windows 2003 32-bit server through a tool called iWay Data Migrator. In Data Migrator we've set up things called data flows that link to both the Informix and Oracle databases to create a data mapping. That's where the conversion takes place, and it would certainly add to run time.

The ETL is being run for 231 tables at this time. Each table has its own data flow process that is executed. We could have several data flows running at the same time (parallelism?) or only one.

The largest table, at 41 million records, is the one that takes 22 hours regardless of whether other data flows are running.

I will definitely look into dropping the indexes before the next ETL we perform (which may be tomorrow) and rebuilding them afterward.

Very excited to try this approach.
Wasim Akram Shaik commented:
If you go with the approach expert Bajwa has suggested, I want to add a few points you can consider.

In your ETL operations, specify INSERT /*+ APPEND */ instead of normal INSERT statements. With the APPEND hint, Oracle bypasses the buffer cache and writes data directly to the datafiles, which saves the time the background processes would otherwise spend reading and writing through the buffer cache.

Instead of dropping the indexes, you can mark the non-unique indexes unusable. This also saves time, since you don't have to create the indexes again, only rebuild them:

alter index index_name unusable;

alter session set skip_unusable_indexes = true;

However, this does not work for unique indexes; those you have to drop..!!
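Put together, a load for one table might look like the following sketch. The table and index names (target_table, target_tab_ix1, staging_table) are hypothetical:

```sql
-- Let this session skip maintenance of unusable indexes
ALTER SESSION SET skip_unusable_indexes = TRUE;

-- Mark the non-unique indexes unusable (placeholder name)
ALTER INDEX target_tab_ix1 UNUSABLE;

-- Direct-path insert: bypasses the buffer cache and, with the
-- table in NOLOGGING mode, generates minimal redo
INSERT /*+ APPEND */ INTO target_table
SELECT * FROM staging_table;
COMMIT;

-- Rebuild the indexes after the load
ALTER INDEX target_tab_ix1 REBUILD;
```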

Not to mention, there may be some staging tables involved (I believe those tables are truncated before and after use).


Christoffer Swanström, Partner, commented:
If you do ALTER TABLE ... NOLOGGING, it will only affect direct-path INSERTs, i.e. those with the /*+ APPEND */ hint. UPDATEs, DELETEs, and regular INSERTs will not be affected. For direct-path INSERTs the redo will be minimized, but there will always be some.

However, I think you will not gain much by minimizing redo logging; your problem probably lies elsewhere. Could you give some more details on the table structure and on what exactly the ETL process is doing?
Julie Kurpa, Sr. Systems Programmer (Author), commented:
hmmm you are giving me lots to think about.  

For the ETL process, we truncate the target tables so we are loading onto a clean slate. I assume the ETL process reads from the first record of each source table and writes it to the target table. The tool we use, Data Migrator, is a GUI-based interface, and I don't know the underlying command/process it uses to perform the ETL.

The Load Options available to us in Data Migrator are:

Load Type Options:  
     Key Matching Logic (we've selected this one)
     Insert records from memory (tried this one and doesn't run faster)
     Bulk load utility via a disk file (we're reading from a database not a flat file)
     Enable Change Data Capture
     Slowly Changing Dimensions

If the record exists options:
    Include the record (we've selected this one)
    Reject the record
    Update the existing record
    Delete the existing record

If the record does not exist: Include the record (this is the only option)

Commit every 1000 rows.

There are no constraints on the target tables other than the primary key which exactly matches the source tables.  There are indexes on many of the tables.

I'm thinking I may leave the primary key indexes enabled and make all other indexes unusable instead of dropping them. Is that a viable approach?
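One way to do that across many tables is to generate the statements from the data dictionary, leaving the unique (primary key) indexes alone. A sketch assuming the hypothetical schema name ETL_OWNER:

```sql
-- Generate UNUSABLE statements for every non-unique index in the
-- schema; unique/PK indexes are excluded, since a load cannot
-- proceed with a unique index in UNUSABLE state (ORA-26026)
SELECT 'ALTER INDEX ' || owner || '.' || index_name || ' UNUSABLE;'
  FROM all_indexes
 WHERE owner = 'ETL_OWNER'
   AND uniqueness = 'NONUNIQUE';
```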
Julie Kurpa, Sr. Systems Programmer (Author), commented:
BTW, a comment for Tosse:
I know that during our ETL process Oracle generates hundreds of archive logs, so I'm not sure how to interpret your comment.
Julie Kurpa, Sr. Systems Programmer (Author), commented:
One other thing I just realized: Data Migrator requires that the source and target tables have a primary key. So it must be reading in primary key order and doing a sort of append to the target table. Sorry, I don't have a way to see the code being used.
Wasim Akram Shaik commented:
Yes, that's what I said: instead of dropping the non-unique indexes, you can make them unusable, which saves some of the time you would spend rebuilding them.
Julie Kurpa, Sr. Systems Programmer (Author), commented:
Thanks everyone.
I've gotten the green light to run the ETL at the end of the day. I'm planning to set all indexes to unusable before the run.
I will report the results on Monday.
Wasim Akram Shaik commented:
More points to consider before you go ahead with this activity.
--- "Planning to set all indexes to 'unusable' before the run."

As I mentioned in my earlier comment, this should be applied only to non-unique indexes.

You cannot load into a segment with a UNIQUE index defined that is in UNUSABLE state; you'll get the error "ORA-26026: unique index ... initially in unusable state".

If you want a UNIQUE index not to be maintained during the data load, you have to drop it and re-create it after the load.

There is, however, a way to circumvent this: you can support a UNIQUE or PRIMARY KEY constraint with a non-unique index. That way you can either make the constraint deferrable, or disable the constraint before the load while keeping the index in unusable state, then rebuild the index and re-enable the constraint afterwards.
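A rough sketch of the disable/re-enable variant of that technique; the table, column, and index names are hypothetical:

```sql
-- Back the primary key with a NON-unique index, so the constraint
-- can be disabled around the load without dropping the index
CREATE INDEX target_pk_ix ON target_table (id);

ALTER TABLE target_table
  ADD CONSTRAINT target_pk PRIMARY KEY (id)
  USING INDEX target_pk_ix;

-- Before the load: disable the constraint but keep its index,
-- then mark the index unusable
ALTER TABLE target_table DISABLE CONSTRAINT target_pk KEEP INDEX;
ALTER INDEX target_pk_ix UNUSABLE;

-- ...run the load...

-- After the load: rebuild the index, then re-enable (and validate)
ALTER INDEX target_pk_ix REBUILD;
ALTER TABLE target_table ENABLE CONSTRAINT target_pk;
```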

Also note this point,

A truncate automatically makes unusable indexes usable again.

So if, while loading, you run

TRUNCATE TABLE <table_name>

after marking the indexes unusable, the indexes will automatically become usable again and your earlier ALTER statement will have had no effect.

So you should remember to order your DDL statements accordingly: truncate the table first, and only then mark the indexes unusable.
Hope it helps..!!!
Julie Kurpa, Sr. Systems Programmer (Author), commented:
With the indexes set to unusable, the ETL time went from 22 hours to 20.5 hours.

I very much wish it could go faster, like Bajawa's ETL (100 GB in 2 hours? brag brag brag.. :)).
But it seems that going from Informix to Oracle, with the Windows iWay Data Migrator as the middleman, may be the bottleneck.

Any other thoughts from anyone?

Maybe Tosse could expand more on the comment about archiving and direct-path inserts?
Wasim Akram Shaik commented:
Have you tried the suggestion from my comment http:#a37728781? I have already told you about direct-path inserts...
Julie Kurpa, Sr. Systems Programmer (Author), commented:
Thanks wasimibm,  
In my comment above (#A37729107) I tried to explain that I don't have a SQL statement I can modify to do direct-path inserts. I'm using a tool called iWay Data Migrator that uses ODBC to connect to both the Informix and Oracle databases and lets me map two tables together for an ETL. My ONLY experience with ETL is through this tool.

The only thing close to a script that I can grab is the attached script obtained from inside Data Migrator. It is supposed to be a text version of what the interface built for me, based on the GUI mappings, for one of the 231 tables being ETL'd.
Wasim Akram Shaik commented:
Understood, author. OK, then you can't control the SQL statements being fired, since they're generated by the tool..!!!

Can you at least control the truncation of the tables before the activity?
I think the tool should do this task as part of its operation; if not, add that step before starting and check whether it helps.
Julie Kurpa, Sr. Systems Programmer (Author), commented:
I did truncate the tables before running the ETL; I have always done that. Sorry if I wasn't clear on that.
The only new thing I've done is make the indexes unusable, which gained me a valuable 1.5 hours. I'm very happy about that.
I'm wondering if there is anything else I could try, since the SQL code is obviously not available to me.

Perhaps something with the ODBC configuration?
Julie Kurpa, Sr. Systems Programmer (Author), commented:
I've been digging around in the manuals for Data Migrator, and there is indeed an option to use a direct-load approach when ETL'ing the data.
I've set up a test to run tonight after hours on one of the larger tables (not the 20-hour one, but a 4-hour one). Hopefully I will have something positive to report in the morning.

Thanks for your help.
Wasim Akram Shaik commented:
Oh, that's nice to hear. Just to mention something (I hope you are already doing it):

The order of the truncate and the ALTER INDEX ... UNUSABLE should be as follows:

First truncate the table, and only after that mark the indexes unusable.

Let's see how much time you save this time..!!!!
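That ordering, sketched with a hypothetical table and index name:

```sql
-- Truncate FIRST: a truncate resets unusable indexes back to USABLE,
-- so marking them unusable before the truncate would be undone
TRUNCATE TABLE target_table;
ALTER INDEX target_tab_ix1 UNUSABLE;

-- ...run the load...

ALTER INDEX target_tab_ix1 REBUILD;
```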
Julie Kurpa, Sr. Systems Programmer (Author), commented:
Thanks. Yup, I was doing the truncate first and then setting the indexes unusable. It worked very well.

Here's the report on the direct-load approach:
I tested it on a table that took 4 hours last week. This time it took 43 minutes. Wow!!

One thing that wasn't good: right after I started the direct-load script, I couldn't do anything on the server, like opening other windows. I couldn't even open the task bar. The server response was severely degraded.

I will look into other settings to see if I can limit how much memory the direct-load approach consumes, and then try another test.

This is very good progress.  :)
Wasim Akram Shaik commented:
Wow, that's good news, vocogov. But I'm afraid I can't help you much with the server response issue, because it pertains entirely to the hardware configuration of the system: how much memory is available, what the paging looks like, the load on the CPU, and so on.

Well, try out the things you mentioned. If you don't get far, you can always open a related question; maybe some capacity gurus will come along and help you with the configuration settings..!!!
Julie Kurpa, Sr. Systems Programmer (Author), commented:
I tried another test, this time setting the commit rate to 1000 records instead of the default (I'm not sure what the default is). It still ran in 43 minutes, and the server showed no strain or poor response. Everything worked beautifully. I'm floored and can't wait to show the boss!
Thanks for your help!
Wasim Akram Shaik commented:
Cool... I am glad that it worked...!!!

Please close this question if you think you got the answer you were looking for..!!