- For individual users
- Instant access to solutions
- Ask your tech questions
- Start your 30-day Free Trial
Main Topics
Browse All TopicsHi experts,
I'm currently facing an issue with a large temporary table (let's call it T_AGG_TEMP).
I'm working on a datawarehouse DB, this denormalized table is used to build new aggregate tables that share the same scope of analysis :
the temp table avoids repeating several filters and joins in all the subsequent aggregate tables.
The table is 4 Gigs in size, and has a regular index on COMPANY_CODE. It is truncated and reloaded each an every time.
Lately the team has been trying to parallelize the aggreagte tables computing.
The parallelised table creation scripts use a loop on COMPANY_CODE, and use INSERT INTO ... SELECT ... statements.
This works rather well, except that from time to time the create as select statements freeze and tend to execute in sequence rather than in parallel, due to heavy I/O activity (disk usage).
To optimize disk usage, my first idea was to partition the temp table, hence addressing a named partition and a list (or range) of COMPANY_CODEs rather than the whole table.
Unfortunately the customer does not have the Enterprise version (and hence no partitioning, no BITMAP indexes, ... :/ ), so I'll have to cope with the abilities of the basic version.
As I tend to think that segregating the data from the different companies is a serious candidate to reduce processing time in the second phase ( building aggregate tables based on T_AGG_TEMP),
I made an attempt by creating the T_AGG_TEMP table in a cluster (sized rougly 20% bigger than the T_AGG_TEMP table) indexed on COMPANY_CODE.
The problem is that loading the clustered version of the table takes at least 3 hours (I've canceled the creation statement each time, as the estimated remaining time in TOAD kept growing),
instead of about 45 minutes for the normal indexed version (the index is not dropped upon loading and recreated afterwards, for the moment).
I'm using the exact same script (an "insert into ... select ... ).
COMPANY_CODE has about 100 different values, which makes on average 40 Mo per COMPANY_CODE.
Does this disqualify the temp table for clustering ?
I tried several values for the SIZE parameter in my CREATE CLUSTER statement, but the books online say that this parameter is maxed with the OS block size (maybe 8k on this Win32 version ? )
Did I miss something ?
How should I try to segregate my data based on COMPANY_CODE ?
Any advice welcome ...
Thanks
Hilaire
This Question has been solved and asker verified All Experts Exchange premium technology solutions are available to subscription members.
Experts Exchange has been collecting answers to technology questions since 1996…3 million and counting! If you have a question, chances are we already have your answer.
If you can't find the exact answer you're looking for, ask our exclusive community of 50,000 experts. You’ll get a personalized answer from a trusted professional.
Thousands of free tech tips, tricks, how-to’s and tutorials are available in our peer reviewed articles section. See for yourself how smart our experts are, no login required.
Access the answers to your technology questions today.
30-day free trial. Register in 60 seconds.
Members of the expert community talk about why the experience at Experts Exchange is different than what you will find anywhere else.

Try it out and discover for yourself.
30-day free trial. Register in 60 seconds.
Join the community of experts here and help other tech pros by answering question in your area of expertise. You can earn FREE access to all Experts Exchange's premium features and resources.
Business Accounts
Answer for Membership
by: MikeOM_DBAPosted on 2008-08-18 at 08:44:54ID: 22252923
Segregate the data based on COMPANY_CODE.
And/Or try CREATE TABLE ...NOLOGGING AS SELECT instead of INSERT INTO ... SELECT ... statements.