select taking too long

I have 2 tables with 100 million and 50 million rows and its taking forver to select below
is the explain plan please suggest what i can do to increase performance.

Plan hash value: 3589824144

| Id  | Operation              | Name          | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT       |               |       |       |       |  2105K(100)|          |
|*  1 |  HASH JOIN RIGHT OUTER |               |   339M|   137G|       |  2105K  (2)| 07:01:06 |
|   2 |   TABLE ACCESS FULL    | S_CONTRACT    |   365 | 34675 |       |     4   (0)| 00:00:01 |
|*  3 |   HASH JOIN RIGHT OUTER|               |    92M|    29G|  2811M|  2101K  (2)| 07:00:22 |
|   4 |    TABLE ACCESS FULL   | S_TXN_HDR_ACT |    31M|  2448M|       | 89953   (3)| 00:18:00 |
|   5 |    TABLE ACCESS FULL   | S_TXN_ACTUAL  |    92M|    22G|       |   682K  (3)| 02:16:33 |

Predicate Information (identified by operation id):

   1 - access(TRIM("A"."CONTRACT_NUM")=TRIM("C"."CONTRACT_NUM"))
   3 - access("A"."DOC_YEAR"="B"."DOC_YEAR" AND "A"."DOC_NUM"="B"."DOC_NUM")
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

slightwv (䄆 Netminder) Commented:
Indexes on DOC_YEAR and DOC_NUM.

Possibly a function based index on: TRIM("A"."CONTRACT_NUM") and TRIM("C"."CONTRACT_NUM")
sam2929Author Commented:
Indexes are already there
what are your indexes?  are they normal indexes or the function-based ones suggested by slightwv?

from the plan you can see they aren't being used.  What is the clustering factor on them?

What is the actual query?
Ultimate Tool Kit for Technology Solution Provider

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy now.

Are the statistics gathered ?
Sorry, wrong question : I see the number of records mentioned.
slightwv (䄆 Netminder) Commented:
>>Are the statistics gathered ?

Corrected question:  Are the up to date?
sam2929Author Commented:
>>Are the statistics gathered ?

sam2929Author Commented:
Will composite index help?
slightwv (䄆 Netminder) Commented:
>>Will composite index help?

Probably not if the regular indexes will not unless both columns together produce enough cardinality.

Can you create a representative sample of your data in a test system that reproduces the fullt table scans that you have?  Then you can test a composit index.
Franck PachotCommented:

I'll suppose the estimated statistics are the right ones (Rows column).

So you join two big tables (about 20GB with a 2GB table)
The execution plan is the right one for that: the smaller table is hashed and the join is done by probing this hashed table while reading the large table.

But 2GB does not fit in memory so you probably see a lot of i/o to and from tempfiles.

Indexes will not help when you get millions of rows.

If you can partition both tables on DOC_YEAR then Oracle can do partition-wise joins: joining on smaller chunks that fit in memory. I suppose you have several years. Or you can subpartition on hash (DOC_NUM).

Then once partitioned you may improve even more the response time by using parallel query. You can try parallel query even when not partitioned, but you will have 2GB to send among processes...

If you can't change the design, maybe use manual workareas with a very big hash_area_size, but that is not easy. Or put tempfiles on the fastest disks you have.

But partitioning is the key to scalability here.

DavidSenior Oracle Database AdministratorCommented:
Not necessarily.  The full table scan is due to the function, so until that special index is created,  and used, you will not see much improvement.

Perhaps obvious, perhaps already addressed:
--  reduce the volume of data by moving anything stale (non essential) to partitions or archive storage
--  look into moving the BIG files to BIG tablespaces, where the I/O grab is optimized
--  look into making your query read-only,  or have the database up in noarchivelog mode
--  increase your session ARRAYSIZE parameter
-- increase your session SORT_AREA_SIZE system parameter

slightwv (䄆 Netminder) Commented:
I agree partioning might help if your data/system lends itself to it.

I just wanted to add that partitioning is not free.  You must purchase it.
what is the query?  what are the actual indexes you have in place?
sam2929Author Commented:
I will try doing partition by year can you please tell me who can i do partition by year

   (      "CLIENT_NUM" VARCHAR2(4 BYTE),
      "DOC_NUM" VARCHAR2(10 BYTE),
      "DOC_YEAR" NUMBER(4,0),
      "LINE_ITEM" NUMBER(3,0)
slightwv (䄆 Netminder) Commented:
Check the docs.  They have examples.

I would first look at a range partition (just use your number and don't worry about converting it to a date):

You might also look at a list partition.
PLEASE provide some more details about your query and your existing indexes

From the plan we can deduce it looks something like this....

  FROM s_txn_hdr_act a, s_txn_actual b, s_contract c
 WHERE     a.doc_year = b.doc_year
       AND a.doc_num = b.doc_num
       AND TRIM(a.contract_num) = TRIM(c.contract_num)

is contract_num distinct for s_contract?
is *  correct? If not, what columns are you actually pulling from each table?

again, what indexes do you have.  Partitioning may not be needed at all.

are you testing the real query, or are you testing a query that will be put into a view?

I ask this because you have no conditions on doc_year, which says to me you're pulling all records for all time.    Unless your data is all new, most people aren't generally interested in all of it, only recent stuff.
sam2929Author Commented:
Stuber : i wil paste full sql and indexes in few minutes

doing partition iike below as doc_year is numeric its not liking it

slightwv (䄆 Netminder) Commented:
>>doing partition iike below as doc_year is numeric its not liking it

It should work.

There are examples here:

What is the error?  Did you install partitioning?
sam2929Author Commented:
Indexes on table are
DavidSenior Oracle Database AdministratorCommented:
Remember that partitioning allows non-relevant data to be filtered out, or masked.  So IF you have the feature installed, moving data into partitions will involve a database model change, and more fun stuff down stream.

I suggest we step back to the point about creating a functional index on the column being trimmed.
sam2929Author Commented:
Error at Command Line:3 Column:0
Error report:
SQL Error: ORA-00439: feature not enabled: Partitioning
00439. 00000 -  "feature not enabled: %s"
*Cause:    The specified feature is not enabled.
*Action:   Do not attempt to use this feature.

Any idea who can i enable it?
what about s_contract?  is trim(contract_num ) a unique key for that table?
slightwv (䄆 Netminder) Commented:
>>Any idea who can i enable it?

Let's hold off on partitioning until we determine if simple indexes will solve your problem.

Simple answer: It needs to be installed with the server software then selected as an option when you build the database.  I'm sure it can be 'activated' after database creation but I would have to look it up in the docs.

Also remember, it is not free.  If you haven't paid to use it, you should not...
sam2929Author Commented:
try this...


and rewrite the query like this...
I know, I'm ignoring your large tables and addressing the small one, but if you're hitting that small one many times for results on the large ones this could help a lot.

SELECT TRIM(a.company_tcd),
       (SELECT TRIM(c.contract_desc)
          FROM s_contract c
         WHERE TRIM(a.contract_num) = TRIM(c.contract_num)),
  FROM s_txn_actual a
       LEFT OUTER JOIN s_txn_hdr_act b
           ON a.doc_num = b.doc_num AND a.doc_year = b.doc_year AND a.company_tcd = b.company_tcd
Franck PachotCommented:

>> SQL Error: ORA-00439: feature not enabled: Partitioning
So you don't have partitioning option. Maybe you are on Standard Edition.

I don't see how indexes can help to get millions of rows. But maybe the estimated cardinality is wrong. So let's check the right numbers and exactly where the bottleneck is.

Please run:
alter session set statistics_level=all;

Open in new window

then your query
select * from table(dbms_xplan.display_cursor('','','ALLSTATS LAST'))

Open in new window

That will show the actual execution plan statistics.

sam2929Author Commented:
Problem is here

if i do select select count(*) from S_TXN_ACTUAL takes 2-3 minutes
Now if i do below count taking forever

select count(*) FROM s_txn_actual a
       LEFT OUTER JOIN s_txn_hdr_act b
           ON a.doc_num = b.doc_num AND a.doc_year = b.doc_year AND a.company_tcd = b.company_tcd
sam2929Author Commented:
The query which you send me is even taking more longer then original one
did you create the index to go with it?
sam2929Author Commented:
Javier MoralesOracle DBACommented:
Well, looks like these two tables fit to the "cluster segment" scenario.

At a glance, if doc_num, doc_year and company_tcd are the main "primary/foreign key" of the join, recovering the entire rows from both tables combined will take that much surely. 50 millions crossed with 100 millions with outer join means what explain plan is approaching, with an additional table in between that makes it harder to tune.

I mean, considering the following volumes:

1.- table S_TXN_HDR_ACT recovers 31Million (estimated)
2.- table S_TXN_ACTUAL recovers 92Million (estimated)
crossing them by outer join results 92 million rows (step 3 of the plan).
3.- for each result, crossing with S_CONTRACT would result 365 rows each access, and a total estimated 339 million rows.

sdstuber guessed that a correlation subquery to make these "365 rows" estimated in memory would be better solved like a "hash join", linking the number of rows from S_CONTRACT that matches with that 92millions row set, and it should have work. Why it didn't?

Well, my question is:
Does table B has CONTRACT_NUM column? Why joining (by outer join) table C with A about CONTRACT_NUM? does not table A,B and C should match the same CONTRACT_NUM?

Oracle CBO does not guess it, and assumes the following (because maybe that was your intention coding it that way) that:

Table A and table B join with two collumns. DOC_YEAR and DOC_NUM by outer join
(maybe table B does not have CONTRACT_NUM)
and Table A and table C join with CONTRACT_NUM column, by outer join too, and nothing ensures that DOC_NUM and DOC_YEAR in the same CONTRACT_NUM do match, and so, the CBO considers this path of join as the optimal way to get the rows.

If table B does have the column CONTRACT_NUM (or, if it's not, a DOC_YEAR and DOC_NUM belongs to the same CONTRACT_NUM, please add the column to table B), please include it in the join clauses, to make:

Table A joins to table B by

and Table A/B joins to table C by

if not, that's why Oracle ignore indexes based on CONTRACT_NUM column.

And, after that, if joining all that million rows isn't fast enough, you should consider to create a INDEXED CLUSTER with S_TXN_HDR_ACT and S_TXN_ACTUAL, where DOC_NUM, DOC_YEAR (and probably CONTRACT_NUM) be the cluster key to "join them physically" and make the joined rows come out immediately.

Take a look at it:
Oracle Doc. about Indexed Clusters.

Hope this helps, kind regards,

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
Franck PachotCommented:

To summarize, the possibility to improbe the join over million of rows are:
- cluster (as given by Javier) but difficult for such big tables where you don't know in advance  the size in front of each key.
- partitions. Ok it's an option to by, but you can't manage such big tables at free
- maybe IOT having the unused columns in overflow segment. In case Oracle can do a merge join without having to sort.

well... any way to have rows physically grouped on the partition key.

Did you tried a large hash_area_size in workarea_size_policy=manual ? Hash join may be ok if you don't have to do several passes to tempfiles. Parallel query may help as well.

Now, what is the requirement behind that ? what do you do with that huge amount of data ?

It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Oracle Database

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.