sam2929
asked on
select taking too long
Hi,
I have 2 tables with 100 million and 50 million rows, and it's taking forever to select. Below
is the explain plan; please suggest what I can do to improve performance.
Plan hash value: 3589824144

---------------------------------------------------------------------------------------------------
| Id | Operation              | Name          | Rows | Bytes |TempSpc| Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT       |               |      |       |       |  2105K(100)|          |
|* 1 |  HASH JOIN RIGHT OUTER |               | 339M |  137G |       |  2105K  (2)| 07:01:06 |
|  2 |   TABLE ACCESS FULL    | S_CONTRACT    |  365 | 34675 |       |      4  (0)| 00:00:01 |
|* 3 |   HASH JOIN RIGHT OUTER|               |  92M |   29G | 2811M |  2101K  (2)| 07:00:22 |
|  4 |    TABLE ACCESS FULL   | S_TXN_HDR_ACT |  31M | 2448M |       |  89953  (3)| 00:18:00 |
|  5 |    TABLE ACCESS FULL   | S_TXN_ACTUAL  |  92M |   22G |       |   682K  (3)| 02:16:33 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access(TRIM("A"."CONTRACT_NUM")=TRIM("C"."CONTRACT_NUM"))
   3 - access("A"."DOC_YEAR"="B"."DOC_YEAR" AND "A"."DOC_NUM"="B"."DOC_NUM")
ASKER
Indexes are already there
What are your indexes? Are they normal indexes or the function-based ones suggested by slightwv?
From the plan you can see they aren't being used. What is the clustering factor on them?
What is the actual query?
Are the statistics gathered?
Sorry, wrong question: I see the number of records mentioned.
>>Are the statistics gathered?
Corrected question: Are they up to date?
ASKER
>>Are the statistics gathered?
Yes
ASKER
Will composite index help?
>>Will composite index help?
Probably not, if the regular indexes will not, unless both columns together produce enough cardinality.
Can you create a representative sample of your data in a test system that reproduces the full table scans that you have? Then you can test a composite index.
Hi,
I'll assume the estimated statistics are right (the Rows column).
So you join two big tables (about 20GB with a 2GB table).
The execution plan is the right one for that: the smaller table is hashed, and the join is done by probing this hashed table while reading the large table.
But 2GB does not fit in memory, so you probably see a lot of I/O to and from tempfiles.
Indexes will not help when you get millions of rows.
If you can partition both tables on DOC_YEAR, then Oracle can do partition-wise joins: joining on smaller chunks that fit in memory. I suppose you have several years. Or you can subpartition on HASH (DOC_NUM).
Once partitioned, you may improve the response time even more by using parallel query. You can try parallel query even when not partitioned, but you will have 2GB to send among processes...
If you can't change the design, maybe use manual workareas with a very big hash_area_size, but that is not easy. Or put tempfiles on the fastest disks you have.
But partitioning is the key to scalability here.
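A hedged sketch of what that partitioning could look like (the copy table, partition bounds, and subpartition count are illustrative assumptions, not from this thread):

```sql
-- Illustrative only: range-partition on DOC_YEAR and hash-subpartition on
-- DOC_NUM so that both big tables are equipartitioned the same way,
-- which enables a full partition-wise join. Requires the Partitioning option.
CREATE TABLE s_txn_actual_part
PARTITION BY RANGE (doc_year)
SUBPARTITION BY HASH (doc_num) SUBPARTITIONS 8
( PARTITION p2010 VALUES LESS THAN (2011),
  PARTITION p2011 VALUES LESS THAN (2012),
  PARTITION pmax  VALUES LESS THAN (MAXVALUE)
)
AS SELECT * FROM s_txn_actual;
```

The same partitioning scheme would be applied to S_TXN_HDR_ACT so the two tables' partitions line up.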
Regards,
Franck.
Not necessarily. The full table scan is due to the function, so until that special index is created and used, you will not see much improvement.
Perhaps obvious, perhaps already addressed:
-- reduce the volume of data by moving anything stale (non essential) to partitions or archive storage
-- look into moving the BIG files to BIG tablespaces, where the I/O grab is optimized
-- look into making your query read-only, or have the database up in noarchivelog mode
-- increase your session ARRAYSIZE parameter
-- increase your session SORT_AREA_SIZE system parameter
HTH
I agree partitioning might help if your data/system lends itself to it.
I just wanted to add that partitioning is not free. You must purchase it.
what is the query? what are the actual indexes you have in place?
ASKER
I will try doing partitioning by year. Can you please tell me how I can do partitioning by year?
CREATE TABLE "AA"
( "CLIENT_NUM" VARCHAR2(4 BYTE),
"COMPANY_TCD" VARCHAR2(4 BYTE),
"DOC_NUM" VARCHAR2(10 BYTE),
"DOC_YEAR" NUMBER(4,0),
"LINE_ITEM" NUMBER(3,0)
)
Check the docs. They have examples.
I would first look at a range partition (just use your number and don't worry about converting it to a date):
http://docs.oracle.com/cd/E11882_01/server.112/e25789/schemaob.htm#CNCPT88862
You might also look at a list partition.
PLEASE provide some more details about your query and your existing indexes
From the plan we can deduce it looks something like this....
SELECT *
FROM s_txn_hdr_act a, s_txn_actual b, s_contract c
WHERE a.doc_year = b.doc_year
AND a.doc_num = b.doc_num
AND TRIM(a.contract_num) = TRIM(c.contract_num)
is contract_num distinct for s_contract?
is * correct? If not, what columns are you actually pulling from each table?
again, what indexes do you have. Partitioning may not be needed at all.
are you testing the real query, or are you testing a query that will be put into a view?
I ask this because you have no conditions on doc_year, which says to me you're pulling all records for all time. Unless your data is all new, most people aren't generally interested in all of it, only recent stuff.
ASKER
Stuber: I will paste the full SQL and indexes in a few minutes.
Doing the partitioning like below; as doc_year is numeric, it's not liking it:
PARTITION BY RANGE (DOC_YEAR)
(PARTITION DOC_YEAR_2010 VALUES LESS THAN (2010)
);
>>doing partitioning like below as doc_year is numeric it's not liking it
It should work.
There are examples here:
http://www.orafaq.com/wiki/Range_partitioning
What is the error? Did you install partitioning?
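For reference, a minimal version of that DDL which should compile once the Partitioning option is available (a numeric doc_year is fine as a range key; partition names and bounds are illustrative):

```sql
-- Range partitioning on a NUMBER column; requires the Partitioning option.
-- The catch-all MAXVALUE partition avoids ORA-14400 for years >= 2010.
CREATE TABLE aa (
  client_num  VARCHAR2(4 BYTE),
  company_tcd VARCHAR2(4 BYTE),
  doc_num     VARCHAR2(10 BYTE),
  doc_year    NUMBER(4,0),
  line_item   NUMBER(3,0)
)
PARTITION BY RANGE (doc_year)
( PARTITION doc_year_2010 VALUES LESS THAN (2010),
  PARTITION doc_year_max  VALUES LESS THAN (MAXVALUE)
);
```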
ASKER
Indexes on table are
ALTER TABLE "S_TXN_ACTUAL" ADD CONSTRAINT "S_TXN_ACTUAL_PK" PRIMARY KEY ("CLIENT_NUM", "COMPANY_TCD", "DOC_NUM", "DOC_YEAR", "LINE_ITEM")
ALTER TABLE "S_TXN_HDR_ACT" ADD CONSTRAINT "S_TXN_HDR_ACT_PK" PRIMARY KEY ("CLIENT_NUM", "COMPANY_TCD", "DOC_NUM", "DOC_YEAR")
query.txt
Remember that partitioning allows non-relevant data to be filtered out, or masked. So IF you have the feature installed, moving data into partitions will involve a database model change, and more fun stuff downstream.
I suggest we step back to the point about creating a functional index on the column being trimmed.
ASKER
Error at Command Line:3 Column:0
Error report:
SQL Error: ORA-00439: feature not enabled: Partitioning
00439. 00000 - "feature not enabled: %s"
*Cause: The specified feature is not enabled.
*Action: Do not attempt to use this feature.
Any idea how I can enable it?
What about s_contract? Is trim(contract_num) a unique key for that table?
>>Any idea who can i enable it?
Let's hold off on partitioning until we determine if simple indexes will solve your problem.
Simple answer: It needs to be installed with the server software then selected as an option when you build the database. I'm sure it can be 'activated' after database creation but I would have to look it up in the docs.
Also remember, it is not free. If you haven't paid to use it, you should not...
ASKER
yes
CREATE UNIQUE INDEX "FINSTAGEADM"."S_CONTRACT_PK" ON "FINSTAGEADM"."S_CONTRACT" ("CONTRACT_NUM")
try this...
create unique index "FINSTAGEADM"."S_CONTRACT_UK_FTRIM" ON "FINSTAGEADM"."S_CONTRACT" (TRIM("CONTRACT_NUM"))
and rewrite the query like this...
I know, I'm ignoring your large tables and addressing the small one, but if you're hitting that small one many times for results on the large ones this could help a lot.
SELECT TRIM(a.company_tcd),
TRIM(a.doc_num),
a.doc_year,
a.line_item,
TRIM(a.deb_cred_ind),
a.amount,
TRIM(a.re_code),
TRIM(a.text),
TRIM(a.payment_terms),
TRIM(a.gl_acct_tcd),
TRIM(a.valuation_type),
TRIM(a.assignment),
a.quantity,
TRIM(a.unit_of_msr_tcd),
TRIM(a.tax_jurist),
TRIM(a.busn_area_tcd),
a.clrng_date,
a.clrng_ent_date,
a.clrng_document,
TRIM(a.tax_cd),
TRIM(a.purch_doc_tcd),
TRIM(a.pdoc_itm_num),
TRIM(a.billng_doc),
TRIM(a.valuatn_area),
TRIM(a.paymnt_ref),
TRIM(a.ref_key_1),
TRIM(a.ref_key_2),
TRIM(a.ref_key_3),
TRIM(a.sales_doc),
TRIM(a.sales_doc_itm),
TRIM(a.txn_type),
TRIM(a.prtnr_busn_area),
TRIM(a.prtnr_profit_cntr),
TRIM(a.profit_cntr_tcd),
TRIM(a.post_key),
TRIM(a.post_cat_tcd),
TRIM(a.post_cat_desc),
TRIM(a.post_typ_tcd),
TRIM(a.post_typ_desc),
TRIM(a.actv_typ_tcd),
TRIM(a.personnel_num),
TRIM(a.order_tcd),
TRIM(a.cost_cntr_tcd),
TRIM(a.post_stat),
TRIM(b.doc_type),
b.post_date,
b.post_period,
TRIM(b.REF),
TRIM(b.ref_key),
TRIM(b.doc_hdr_text),
TRIM(b.revrsd_with),
TRIM(b.revrsd_rsn),
TRIM(b.doc_status),
TRIM(b.doc_status_desc),
TRIM(b.doc_type_desc),
TRIM(b.revrsd_rsn_desc),
TRIM(b.doc_year),
TRIM(a.asset_txn_type),
a.cheque_num,
(SELECT TRIM(c.contract_desc)
FROM s_contract c
WHERE TRIM(a.contract_num) = TRIM(c.contract_num)),
TRIM(a.contract_num),
TRIM(a.post_key_desc),
TRIM(a.ref_org_unit),
TRIM(a.tax_cd_desc),
TRIM(a.asset_num_tcd),
TRIM(a.customer_tcd),
TRIM(a.material_tcd),
TRIM(a.plant_tcd),
TRIM(a.project_tcd),
TRIM(a.vendor_tcd),
a.network_tcd,
a.ps_activity_tcd,
a.wbs_elem_tcd,
TRIM(a.building_tcd),
TRIM(a.busn_entity_tcd),
TRIM(a.property_tcd),
TRIM(a.re_code),
TRIM(a.rental_unit_tcd),
TRIM(a.contrllng_area)
FROM s_txn_actual a
LEFT OUTER JOIN s_txn_hdr_act b
ON a.doc_num = b.doc_num AND a.doc_year = b.doc_year AND a.company_tcd = b.company_tcd
Hi,
>> SQL Error: ORA-00439: feature not enabled: Partitioning
So you don't have partitioning option. Maybe you are on Standard Edition.
I don't see how indexes can help to get millions of rows. But maybe the estimated cardinality is wrong. So let's check the right numbers and exactly where the bottleneck is.
Please run:

alter session set statistics_level=all;

then your query, then:

select * from table(dbms_xplan.display_cursor('','','ALLSTATS LAST'));

That will show the actual execution plan statistics.
Thanks,
Franck.
ASKER
Problem is here
If I do select count(*) from S_TXN_ACTUAL it takes 2-3 minutes.
Now if I do the count below, it's taking forever:
select count(*) FROM s_txn_actual a
LEFT OUTER JOIN s_txn_hdr_act b
ON a.doc_num = b.doc_num AND a.doc_year = b.doc_year AND a.company_tcd = b.company_tcd
ASKER
The query which you sent me is taking even longer than the original one.
did you create the index to go with it?
ASKER
yes
Hi,
To summarize, the possibilities to improve the join over millions of rows are:
- a cluster (as given by Javier), but that is difficult for such big tables where you don't know in advance the size in front of each key.
- partitions. OK, it's an option to buy, but you can't manage such big tables for free.
- maybe an IOT, having the unused columns in the overflow segment. In that case Oracle can do a merge join without having to sort.
Well... any way to have rows physically grouped on the partition key.
Did you try a large hash_area_size with workarea_size_policy=manual? The hash join may be OK if you don't have to do several passes to tempfiles. Parallel query may help as well.
Now, what is the requirement behind that? What do you do with that huge amount of data?
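The manual-workarea settings mentioned would look something like this (session-level only; the 1GB size is an illustrative assumption, not a recommendation):

```sql
-- HASH_AREA_SIZE is in bytes and is only honored when the session's
-- workarea policy is MANUAL (it is ignored under AUTO).
ALTER SESSION SET workarea_size_policy = MANUAL;
ALTER SESSION SET hash_area_size = 1073741824;  -- ~1GB, example value
```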
Regards,
Franck.
Possibly a function-based index on TRIM("A"."CONTRACT_NUM") and TRIM("C"."CONTRACT_NUM").
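A sketch of those function-based indexes (the index names are made up for illustration; statistics would need regathering afterwards for the optimizer to consider them):

```sql
-- Function-based indexes matching the TRIM() join predicate exactly,
-- so the optimizer can use an index instead of trimming on the fly.
CREATE INDEX s_txn_actual_ctrim_fbi ON s_txn_actual (TRIM(contract_num));
CREATE INDEX s_contract_ctrim_fbi   ON s_contract   (TRIM(contract_num));
```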