Solved

select taking too long

Posted on 2013-01-17
564 Views
Last Modified: 2013-02-05
Hi,
I have 2 tables with 100 million and 50 million rows and it's taking forever to select. Below
is the explain plan; please suggest what I can do to increase performance.


Plan hash value: 3589824144

------------------------------------------------------------------------------------------------
| Id  | Operation              | Name          | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |               |       |       |       |  2105K(100)|          |
|*  1 |  HASH JOIN RIGHT OUTER |               |   339M|   137G|       |  2105K  (2)| 07:01:06 |
|   2 |   TABLE ACCESS FULL    | S_CONTRACT    |   365 | 34675 |       |     4   (0)| 00:00:01 |
|*  3 |   HASH JOIN RIGHT OUTER|               |    92M|    29G|  2811M|  2101K  (2)| 07:00:22 |
|   4 |    TABLE ACCESS FULL   | S_TXN_HDR_ACT |    31M|  2448M|       | 89953   (3)| 00:18:00 |
|   5 |    TABLE ACCESS FULL   | S_TXN_ACTUAL  |    92M|    22G|       |   682K  (3)| 02:16:33 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access(TRIM("A"."CONTRACT_NUM")=TRIM("C"."CONTRACT_NUM"))
   3 - access("A"."DOC_YEAR"="B"."DOC_YEAR" AND "A"."DOC_NUM"="B"."DOC_NUM")
0
Comment
Question by:sam2929
31 Comments
 
LVL 76

Expert Comment

by:slightwv (Netminder)
ID: 38787826
Indexes on DOC_YEAR and DOC_NUM.

Possibly a function based index on: TRIM("A"."CONTRACT_NUM") and TRIM("C"."CONTRACT_NUM")
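
For example, something along these lines (only a sketch; the index names are made up, and A/C are S_TXN_ACTUAL/S_CONTRACT from the plan):

CREATE INDEX s_txn_hdr_act_doc_i ON s_txn_hdr_act (doc_year, doc_num);
CREATE INDEX s_txn_actual_doc_i ON s_txn_actual (doc_year, doc_num);

CREATE INDEX s_txn_actual_trim_cnum_i ON s_txn_actual (TRIM(contract_num));
CREATE INDEX s_contract_trim_cnum_i ON s_contract (TRIM(contract_num));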
0
 

Author Comment

by:sam2929
ID: 38787871
Indexes are already there
0
 
LVL 73

Expert Comment

by:sdstuber
ID: 38787906
what are your indexes?  are they normal indexes or the function-based ones suggested by slightwv?

from the plan you can see they aren't being used.  What is the clustering factor on them?

What is the actual query?
0
 
LVL 20

Expert Comment

by:flow01
ID: 38787933
Are the statistics gathered ?
Sorry, wrong question: I see the row estimates in the plan, so they have been gathered.
0
 
LVL 76

Expert Comment

by:slightwv (Netminder)
ID: 38787945
>>Are the statistics gathered ?

Corrected question: Are they up to date?
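
A quick way to check (just a sketch; table names taken from the plan):

SELECT table_name, num_rows, last_analyzed
  FROM user_tables
 WHERE table_name IN ('S_TXN_ACTUAL', 'S_TXN_HDR_ACT', 'S_CONTRACT');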
0
 

Author Comment

by:sam2929
ID: 38788028
>>Are the statistics gathered ?

 Yes
0
 

Author Comment

by:sam2929
ID: 38788030
Will composite index help?
0
 
LVL 76

Expert Comment

by:slightwv (Netminder)
ID: 38788049
>>Will composite index help?

Probably not, if the regular indexes don't help, unless both columns together produce enough cardinality.

Can you create a representative sample of your data in a test system that reproduces the full table scans that you have?  Then you can test a composite index.
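
One rough way to gauge whether the two columns together are selective enough (a sketch; names taken from the plan):

SELECT COUNT(*) AS total_rows,
       COUNT(DISTINCT doc_year || '~' || doc_num) AS distinct_join_keys
  FROM s_txn_actual;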
0
 
LVL 15

Expert Comment

by:Franck Pachot
ID: 38788050
Hi,

I'll suppose the estimated statistics are the right ones (Rows column).

So you join two big tables (about 20GB with a 2GB table)
The execution plan is the right one for that: the smaller table is hashed and the join is done by probing this hashed table while reading the large table.

But 2GB does not fit in memory so you probably see a lot of i/o to and from tempfiles.

Indexes will not help when you get millions of rows.

If you can partition both tables on DOC_YEAR then Oracle can do partition-wise joins: joining on smaller chunks that fit in memory. I suppose you have several years. Or you can subpartition on hash (DOC_NUM).

Then once partitioned you may improve even more the response time by using parallel query. You can try parallel query even when not partitioned, but you will have 2GB to send among processes...

If you can't change the design, maybe use manual workareas with a very big hash_area_size, but that is not easy. Or put tempfiles on the fastest disks you have.

But partitioning is the key to scalability here.
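
A rough sketch of the idea (the column list is cut down and the partition bounds are invented for the example):

create table s_txn_actual_part (
  doc_num  varchar2(10),
  doc_year number(4),
  amount   number          -- illustrative only, the real table has many more columns
)
partition by range (doc_year)
subpartition by hash (doc_num) subpartitions 8
( partition p2010 values less than (2011),
  partition p2011 values less than (2012),
  partition pmax  values less than (maxvalue)
);

S_TXN_HDR_ACT would be partitioned and subpartitioned the same way, so the join on DOC_YEAR, DOC_NUM can be done partition-wise.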

Regards,
Franck.
0
 
LVL 23

Expert Comment

by:David
ID: 38788074
Not necessarily. The full table scan is due to the function, so until that function-based index is created, and used, you will not see much improvement.

Perhaps obvious, perhaps already addressed:
--  reduce the volume of data by moving anything stale (non-essential) to partitions or archive storage
--  look into moving the BIG files to BIG tablespaces, where the I/O grab is optimized
--  look into making your query read-only, or have the database up in noarchivelog mode
--  increase your session ARRAYSIZE parameter
--  increase your session SORT_AREA_SIZE parameter (see the sketch below for the last two)
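
A minimal sketch of the last two, assuming SQL*Plus and manual PGA management:

SET ARRAYSIZE 500                               -- SQL*Plus fetch size, fewer round trips
ALTER SESSION SET workarea_size_policy = MANUAL;
ALTER SESSION SET sort_area_size = 104857600;   -- 100 MB; only honoured with manual workareas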

HTH
0
 
LVL 76

Expert Comment

by:slightwv (Netminder)
ID: 38788077
I agree partitioning might help if your data/system lends itself to it.

I just wanted to add that partitioning is not free.  You must purchase it.
0
 
LVL 73

Expert Comment

by:sdstuber
ID: 38788095
what is the query?  what are the actual indexes you have in place?
0
 

Author Comment

by:sam2929
ID: 38788475
I will try doing partitioning by year. Can you please tell me how I can partition by year?

CREATE TABLE "AA"
   (      "CLIENT_NUM" VARCHAR2(4 BYTE),
      "COMPANY_TCD" VARCHAR2(4 BYTE),
      "DOC_NUM" VARCHAR2(10 BYTE),
      "DOC_YEAR" NUMBER(4,0),
      "LINE_ITEM" NUMBER(3,0)
   )
0
 
LVL 76

Expert Comment

by:slightwv (Netminder)
ID: 38788506
Check the docs.  They have examples.

I would first look at a range partition (just use your number and don't worry about converting it to a date):
http://docs.oracle.com/cd/E11882_01/server.112/e25789/schemaob.htm#CNCPT88862

You might also look at a list partition.
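
For example, a range partition on your numeric DOC_YEAR could look something like this (only a sketch based on the skeleton you posted; pick boundaries that match your data and keep a MAXVALUE catch-all):

CREATE TABLE aa (
  client_num  VARCHAR2(4 BYTE),
  company_tcd VARCHAR2(4 BYTE),
  doc_num     VARCHAR2(10 BYTE),
  doc_year    NUMBER(4,0),
  line_item   NUMBER(3,0)
)
PARTITION BY RANGE (doc_year)
( PARTITION doc_year_2010 VALUES LESS THAN (2011),
  PARTITION doc_year_2011 VALUES LESS THAN (2012),
  PARTITION doc_year_max  VALUES LESS THAN (MAXVALUE)
);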
0
 
LVL 73

Expert Comment

by:sdstuber
ID: 38788564
PLEASE provide some more details about your query and your existing indexes


From the plan we can deduce it looks something like this....


SELECT *
  FROM s_txn_hdr_act a, s_txn_actual b, s_contract c
 WHERE     a.doc_year = b.doc_year
       AND a.doc_num = b.doc_num
       AND TRIM(a.contract_num) = TRIM(c.contract_num)

is contract_num distinct for s_contract?
is *  correct? If not, what columns are you actually pulling from each table?

again, what indexes do you have.  Partitioning may not be needed at all.


are you testing the real query, or are you testing a query that will be put into a view?

I ask this because you have no conditions on doc_year, which says to me you're pulling all records for all time.    Unless your data is all new, most people aren't generally interested in all of it, only recent stuff.
0

 

Author Comment

by:sam2929
ID: 38788691
sdstuber: I will paste the full SQL and indexes in a few minutes.

I am doing the partition like below; as doc_year is numeric, it's not liking it:

PARTITION BY RANGE (DOC_YEAR)
(PARTITION DOC_YEAR_2010 VALUES LESS THAN (2010)  
 );
0
 
LVL 76

Expert Comment

by:slightwv (Netminder)
ID: 38788718
>>as doc_year is numeric, it's not liking it

It should work.

There are examples here:
http://www.orafaq.com/wiki/Range_partitioning

What is the error?  Did you install partitioning?
0
 

Author Comment

by:sam2929
ID: 38788744
Indexes on the tables are:
ALTER TABLE "S_TXN_ACTUAL" ADD CONSTRAINT "S_TXN_ACTUAL_PK" PRIMARY KEY ("CLIENT_NUM", "COMPANY_TCD", "DOC_NUM", "DOC_YEAR", "LINE_ITEM")
ALTER TABLE  "S_TXN_HDR_ACT" ADD CONSTRAINT "S_TXN_HDR_ACT_PK" PRIMARY KEY ("CLIENT_NUM", "COMPANY_TCD", "DOC_NUM", "DOC_YEAR")
query.txt
0
 
LVL 23

Expert Comment

by:David
ID: 38788749
Remember that partitioning allows non-relevant data to be filtered out, or masked. So IF you have the feature installed, moving data into partitions will involve a database model change, and more fun stuff downstream.

I suggest we step back to the point about creating a functional index on the column being trimmed.
0
 

Author Comment

by:sam2929
ID: 38788770
Error at Command Line:3 Column:0
Error report:
SQL Error: ORA-00439: feature not enabled: Partitioning
00439. 00000 -  "feature not enabled: %s"
*Cause:    The specified feature is not enabled.
*Action:   Do not attempt to use this feature.

Any idea how I can enable it?
0
 
LVL 73

Expert Comment

by:sdstuber
ID: 38788774
what about s_contract?  is trim(contract_num) a unique key for that table?
0
 
LVL 76

Expert Comment

by:slightwv (Netminder)
ID: 38788840
>>Any idea how I can enable it?

Let's hold off on partitioning until we determine if simple indexes will solve your problem.

Simple answer: It needs to be installed with the server software then selected as an option when you build the database.  I'm sure it can be 'activated' after database creation but I would have to look it up in the docs.

Also remember, it is not free.  If you haven't paid to use it, you should not...
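
You can check whether the option is even available in your installation with:

SELECT parameter, value
  FROM v$option
 WHERE parameter = 'Partitioning';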
0
 

Author Comment

by:sam2929
ID: 38788868
yes
CREATE UNIQUE INDEX "FINSTAGEADM"."S_CONTRACT_PK" ON "FINSTAGEADM"."S_CONTRACT" ("CONTRACT_NUM")
0
 
LVL 73

Expert Comment

by:sdstuber
ID: 38789096
try this...

create unique index  "FINSTAGEADM"."S_CONTRACT_UK_FTRIM" ON "FINSTAGEADM"."S_CONTRACT" (TRIM("CONTRACT_NUM"))

and rewrite the query like this...
I know, I'm ignoring your large tables and addressing the small one, but if you're hitting that small one many times for results on the large ones this could help a lot.



SELECT TRIM(a.company_tcd),
       TRIM(a.doc_num),
       a.doc_year,
       a.line_item,
       TRIM(a.deb_cred_ind),
       a.amount,
       TRIM(a.re_code),
       TRIM(a.text),
       TRIM(a.payment_terms),
       TRIM(a.gl_acct_tcd),
       TRIM(a.valuation_type),
       TRIM(a.assignment),
       a.quantity,
       TRIM(a.unit_of_msr_tcd),
       TRIM(a.tax_jurist),
       TRIM(a.busn_area_tcd),
       a.clrng_date,
       a.clrng_ent_date,
       a.clrng_document,
       TRIM(a.tax_cd),
       TRIM(a.purch_doc_tcd),
       TRIM(a.pdoc_itm_num),
       TRIM(a.billng_doc),
       TRIM(a.valuatn_area),
       TRIM(a.paymnt_ref),
       TRIM(a.ref_key_1),
       TRIM(a.ref_key_2),
       TRIM(a.ref_key_3),
       TRIM(a.sales_doc),
       TRIM(a.sales_doc_itm),
       TRIM(a.txn_type),
       TRIM(a.prtnr_busn_area),
       TRIM(a.prtnr_profit_cntr),
       TRIM(a.profit_cntr_tcd),
       TRIM(a.post_key),
       TRIM(a.post_cat_tcd),
       TRIM(a.post_cat_desc),
       TRIM(a.post_typ_tcd),
       TRIM(a.post_typ_desc),
       TRIM(a.actv_typ_tcd),
       TRIM(a.personnel_num),
       TRIM(a.order_tcd),
       TRIM(a.cost_cntr_tcd),
       TRIM(a.post_stat),
       TRIM(b.doc_type),
       b.post_date,
       b.post_period,
       TRIM(b.REF),
       TRIM(b.ref_key),
       TRIM(b.doc_hdr_text),
       TRIM(b.revrsd_with),
       TRIM(b.revrsd_rsn),
       TRIM(b.doc_status),
       TRIM(b.doc_status_desc),
       TRIM(b.doc_type_desc),
       TRIM(b.revrsd_rsn_desc),
       TRIM(b.doc_year),
       TRIM(a.asset_txn_type),
       a.cheque_num,
       (SELECT TRIM(c.contract_desc)
          FROM s_contract c
         WHERE TRIM(a.contract_num) = TRIM(c.contract_num)),
       TRIM(a.contract_num),
       TRIM(a.post_key_desc),
       TRIM(a.ref_org_unit),
       TRIM(a.tax_cd_desc),
       TRIM(a.asset_num_tcd),
       TRIM(a.customer_tcd),
       TRIM(a.material_tcd),
       TRIM(a.plant_tcd),
       TRIM(a.project_tcd),
       TRIM(a.vendor_tcd),
       a.network_tcd,
       a.ps_activity_tcd,
       a.wbs_elem_tcd,
       TRIM(a.building_tcd),
       TRIM(a.busn_entity_tcd),
       TRIM(a.property_tcd),
       TRIM(a.re_code),
       TRIM(a.rental_unit_tcd),
       TRIM(a.contrllng_area)
  FROM s_txn_actual a
       LEFT OUTER JOIN s_txn_hdr_act b
           ON a.doc_num = b.doc_num AND a.doc_year = b.doc_year AND a.company_tcd = b.company_tcd
0
 
LVL 15

Expert Comment

by:Franck Pachot
ID: 38789220
Hi,

>> SQL Error: ORA-00439: feature not enabled: Partitioning
So you don't have the partitioning option. Maybe you are on Standard Edition.

I don't see how indexes can help to get millions of rows. But maybe the estimated cardinality is wrong. So let's check the right numbers and exactly where the bottleneck is.

Please run:
alter session set statistics_level=all;


then your query
then:
select * from table(dbms_xplan.display_cursor('','','ALLSTATS LAST'));



That will show the actual execution plan statistics.

thanks,
Franck.
0
 

Author Comment

by:sam2929
ID: 38789258
The problem is here:

If I do select count(*) from S_TXN_ACTUAL it takes 2-3 minutes.
Now if I do the count below, it takes forever:

select count(*) FROM s_txn_actual a
       LEFT OUTER JOIN s_txn_hdr_act b
           ON a.doc_num = b.doc_num AND a.doc_year = b.doc_year AND a.company_tcd = b.company_tcd
0
 

Author Comment

by:sam2929
ID: 38789329
The query which you sent me is taking even longer than the original one.
0
 
LVL 73

Expert Comment

by:sdstuber
ID: 38789395
did you create the index to go with it?
0
 

Author Comment

by:sam2929
ID: 38790017
yes
0
 
LVL 6

Accepted Solution

by:
Javier Morales earned 500 total points
ID: 38792428
Well, it looks like these two tables fit the "cluster segment" scenario.

At a glance, if doc_num, doc_year and company_tcd are the main "primary/foreign key" of the join, retrieving every row from both tables combined will surely take that long. 50 million rows crossed with 100 million by an outer join is exactly what the explain plan is approaching, with an additional table in between that makes it harder to tune.

I mean, considering the following volumes:

1.- table S_TXN_HDR_ACT returns 31 million rows (estimated)
2.- table S_TXN_ACTUAL returns 92 million rows (estimated)
crossing them by outer join results in 92 million rows (step 3 of the plan).
3.- for each result, crossing with S_CONTRACT returns an estimated 365 rows per access, for a total estimated 339 million rows.

sdstuber guessed that a correlated subquery, keeping those estimated "365 rows" in memory, would perform better than the hash join linking the matching rows from S_CONTRACT to that 92 million row set, and it should have worked. Why didn't it?

Well, my question is:
Does table B have a CONTRACT_NUM column? Why join (by outer join) table C with A on CONTRACT_NUM? Shouldn't tables A, B and C all match on the same CONTRACT_NUM?

The Oracle CBO does not guess that, and assumes the following (because maybe that was your intention in coding it that way):

Table A and table B join on two columns, DOC_YEAR and DOC_NUM, by outer join
(maybe table B does not have CONTRACT_NUM),
and table A and table C join on the CONTRACT_NUM column, by outer join too. Nothing ensures that the DOC_NUM and DOC_YEAR rows match within the same CONTRACT_NUM, and so the CBO considers this join path the optimal way to get the rows.

If table B does have the CONTRACT_NUM column (or, if it doesn't, but a given DOC_YEAR and DOC_NUM always belong to the same CONTRACT_NUM, please add the column to table B), please include it in the join clauses, to make:

Table A joins to table B by
"A"."DOC_YEAR"="B"."DOC_YEAR"
AND "A"."DOC_NUM"="B"."DOC_NUM"
AND "A"."CONTRACT_NUM"="B"."CONTRACT_NUM"

and Table A/B joins to table C by
(TRIM("A"."CONTRACT_NUM")=TRIM("C"."CONTRACT_NUM"))

Are you using FUNCTION BASED INDEXES for the column TRIM(CONTRACT_NUM)?
If not, that's why Oracle ignores the indexes based on the CONTRACT_NUM column.

And, after that, if joining all those millions of rows still isn't fast enough, you should consider creating an INDEXED CLUSTER with S_TXN_HDR_ACT and S_TXN_ACTUAL, where DOC_NUM, DOC_YEAR (and probably CONTRACT_NUM) are the cluster key, to "join them physically" and make the joined rows come out immediately.
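
A rough sketch of the cluster idea (the column lists here are illustrative only; the real tables have far more columns):

CREATE CLUSTER txn_cluster (doc_num VARCHAR2(10), doc_year NUMBER(4,0))
  SIZE 8192;

CREATE INDEX txn_cluster_idx ON CLUSTER txn_cluster;

CREATE TABLE s_txn_hdr_act_clu (
  doc_num  VARCHAR2(10),
  doc_year NUMBER(4,0),
  doc_type VARCHAR2(4)          -- illustrative column only
) CLUSTER txn_cluster (doc_num, doc_year);

CREATE TABLE s_txn_actual_clu (
  doc_num   VARCHAR2(10),
  doc_year  NUMBER(4,0),
  line_item NUMBER(3,0)         -- illustrative column only
) CLUSTER txn_cluster (doc_num, doc_year);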

Take a look at it:
Oracle Doc. about Indexed Clusters.

Hope this helps, kind regards,
Javier
0
 
LVL 15

Expert Comment

by:Franck Pachot
ID: 38793603
Hi,

To summarize, the possibilities to improve a join over millions of rows are:
- a cluster (as given by Javier), but difficult for such big tables where you don't know in advance the size behind each key.
- partitions. OK, it's an option to buy, but you can't manage such big tables for free.
- maybe an IOT, with the unused columns in the overflow segment. In that case Oracle can do a merge join without having to sort.

Well... any way to have the rows physically grouped on the join key.

Did you try a large hash_area_size with workarea_size_policy=manual? The hash join may be OK if you don't have to do several passes to the tempfiles. Parallel query may help as well.
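
For the manual workarea test, something like this (only a sketch, for the test session only; size it to what your PGA can stand):

alter session set workarea_size_policy = manual;
alter session set hash_area_size = 1073741824;   -- 1 GB for this session

-- and/or a parallel attempt on the same join:
select /*+ parallel(a 8) parallel(b 8) */ count(*)
  from s_txn_actual a
       left outer join s_txn_hdr_act b
         on a.doc_num = b.doc_num and a.doc_year = b.doc_year and a.company_tcd = b.company_tcd;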

Now, what is the requirement behind that? What do you do with that huge amount of data?

Regards,
Franck.
0

