sam2929 asked:
select taking too long

Hi,
I have two tables with 100 million and 50 million rows, and the select is taking forever. Below
is the explain plan; please suggest what I can do to improve performance.


Plan hash value: 3589824144

------------------------------------------------------------------------------------------------
| Id  | Operation              | Name          | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |               |       |       |       |  2105K(100)|          |
|*  1 |  HASH JOIN RIGHT OUTER |               |   339M|   137G|       |  2105K  (2)| 07:01:06 |
|   2 |   TABLE ACCESS FULL    | S_CONTRACT    |   365 | 34675 |       |     4   (0)| 00:00:01 |
|*  3 |   HASH JOIN RIGHT OUTER|               |    92M|    29G|  2811M|  2101K  (2)| 07:00:22 |
|   4 |    TABLE ACCESS FULL   | S_TXN_HDR_ACT |    31M|  2448M|       | 89953   (3)| 00:18:00 |
|   5 |    TABLE ACCESS FULL   | S_TXN_ACTUAL  |    92M|    22G|       |   682K  (3)| 02:16:33 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access(TRIM("A"."CONTRACT_NUM")=TRIM("C"."CONTRACT_NUM"))
   3 - access("A"."DOC_YEAR"="B"."DOC_YEAR" AND "A"."DOC_NUM"="B"."DOC_NUM")
slightwv (䄆 Netminder)

Indexes on DOC_YEAR and DOC_NUM.

Possibly a function based index on: TRIM("A"."CONTRACT_NUM") and TRIM("C"."CONTRACT_NUM")
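A sketch of that DDL (the index names are my assumptions; the columns come from the plan's predicate section):

```sql
-- Composite index on the join columns used at step 3 of the plan
CREATE INDEX s_txn_actual_doc_ix ON s_txn_actual (doc_year, doc_num);

-- Function-based indexes so TRIM(contract_num) can be resolved via an index on each side
CREATE INDEX s_txn_actual_trim_fbi ON s_txn_actual (TRIM(contract_num));
CREATE INDEX s_contract_trim_fbi   ON s_contract   (TRIM(contract_num));
```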
sam2929 (ASKER)

Indexes are already there
what are your indexes?  are they normal indexes or the function-based ones suggested by slightwv?

from the plan you can see they aren't being used.  What is the clustering factor on them?

What is the actual query?
Are the statistics gathered ?
Sorry, wrong question: I see the number of records mentioned.
>>Are the statistics gathered ?

Corrected question: Are they up to date?
sam2929 (ASKER)

>>Are the statistics gathered ?

 Yes
sam2929 (ASKER)

Will composite index help?
>>Will composite index help?

Probably not if the regular indexes don't, unless both columns together produce enough cardinality.

Can you create a representative sample of your data in a test system that reproduces the full table scans you are seeing?  Then you can test a composite index.
Hi,

I'll assume the estimated statistics (the Rows column) are correct.

So you are joining two big tables (a roughly 20GB table with a 2GB one).
The execution plan is the right one for that: the smaller table is hashed, and the join is done by probing this hashed table while reading the large table.

But 2GB does not fit in memory, so you probably see a lot of I/O to and from the tempfiles.

Indexes will not help when you get millions of rows.

If you can partition both tables on DOC_YEAR then Oracle can do partition-wise joins: joining on smaller chunks that fit in memory. I suppose you have several years. Or you can subpartition on hash (DOC_NUM).
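For illustration only, a range-hash composite layout along those lines (the partition names and subpartition count are assumptions, not recommendations):

```sql
CREATE TABLE s_txn_actual_part
PARTITION BY RANGE (doc_year)
SUBPARTITION BY HASH (doc_num) SUBPARTITIONS 8
( PARTITION p2011 VALUES LESS THAN (2012)
, PARTITION p2012 VALUES LESS THAN (2013)
, PARTITION pmax  VALUES LESS THAN (MAXVALUE)
)
AS SELECT * FROM s_txn_actual;
```

With both tables partitioned the same way on DOC_YEAR, the hash join can proceed one matching partition pair at a time, each small enough to fit in memory.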

Then, once partitioned, you may improve the response time even more by using parallel query. You can try parallel query even when not partitioned, but you will have 2GB to send among the processes...

If you can't change the design, maybe use manual workareas with a very big hash_area_size, but that is not easy. Or put tempfiles on the fastest disks you have.
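A minimal sketch of the manual-workarea route (the 1GB value is an example for this session, not a tuned recommendation):

```sql
ALTER SESSION SET workarea_size_policy = MANUAL;
ALTER SESSION SET hash_area_size = 1073741824;  -- 1GB per hash join workarea in this session
```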

But partitioning is the key to scalability here.

Regards,
Franck.
Not necessarily.  The full table scan is due to the function, so until that special index is created and used, you will not see much improvement.

Perhaps obvious, perhaps already addressed:
--  reduce the volume of data by moving anything stale (non essential) to partitions or archive storage
--  look into moving the BIG files to BIG tablespaces, where the I/O grab is optimized
--  look into making your query read-only,  or have the database up in noarchivelog mode
--  increase your session ARRAYSIZE parameter
-- increase your session SORT_AREA_SIZE system parameter
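For the last two points, at the session level in SQL*Plus (the values are illustrative only):

```sql
SET ARRAYSIZE 500
ALTER SESSION SET sort_area_size = 104857600;  -- 100MB; only honored when workarea_size_policy = MANUAL
```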

HTH
I agree partitioning might help if your data/system lends itself to it.

I just wanted to add that partitioning is not free.  You must purchase it.
what is the query?  what are the actual indexes you have in place?
sam2929 (ASKER)

I will try doing partition by year. Can you please tell me how I can do partition by year?

CREATE TABLE "AA"
   (      "CLIENT_NUM" VARCHAR2(4 BYTE),
      "COMPANY_TCD" VARCHAR2(4 BYTE),
      "DOC_NUM" VARCHAR2(10 BYTE),
      "DOC_YEAR" NUMBER(4,0),
      "LINE_ITEM" NUMBER(3,0)
   )
Check the docs.  They have examples.

I would first look at a range partition (just use your number and don't worry about converting it to a date):
http://docs.oracle.com/cd/E11882_01/server.112/e25789/schemaob.htm#CNCPT88862

You might also look at a list partition.
PLEASE provide some more details about your query and your existing indexes


From the plan we can deduce it looks something like this....


SELECT *
  FROM s_txn_hdr_act a, s_txn_actual b, s_contract c
 WHERE     a.doc_year = b.doc_year
       AND a.doc_num = b.doc_num
       AND TRIM(a.contract_num) = TRIM(c.contract_num)

is contract_num distinct for s_contract?
is *  correct? If not, what columns are you actually pulling from each table?

again, what indexes do you have.  Partitioning may not be needed at all.


are you testing the real query, or are you testing a query that will be put into a view?

I ask this because you have no conditions on doc_year, which says to me you're pulling all records for all time.    Unless your data is all new, most people aren't generally interested in all of it, only recent stuff.
sam2929 (ASKER)

Stuber: I will paste the full SQL and indexes in a few minutes.

I'm doing the partition like below; as doc_year is numeric, it's not liking it

PARTITION BY RANGE (DOC_YEAR)
(PARTITION DOC_YEAR_2010 VALUES LESS THAN (2010)  
 );
>>doing the partition like below; as doc_year is numeric, it's not liking it

It should work.

There are examples here:
http://www.orafaq.com/wiki/Range_partitioning

What is the error?  Did you install partitioning?
sam2929 (ASKER)

Indexes on table are
ALTER TABLE "S_TXN_ACTUAL" ADD CONSTRAINT "S_TXN_ACTUAL_PK" PRIMARY KEY ("CLIENT_NUM", "COMPANY_TCD", "DOC_NUM", "DOC_YEAR", "LINE_ITEM")
ALTER TABLE  "S_TXN_HDR_ACT" ADD CONSTRAINT "S_TXN_HDR_ACT_PK" PRIMARY KEY ("CLIENT_NUM", "COMPANY_TCD", "DOC_NUM", "DOC_YEAR")
query.txt
Remember that partitioning allows non-relevant data to be filtered out, or masked.  So IF you have the feature installed, moving data into partitions will involve a database model change, and more fun stuff downstream.

I suggest we step back to the point about creating a functional index on the column being trimmed.
sam2929 (ASKER)

Error at Command Line:3 Column:0
Error report:
SQL Error: ORA-00439: feature not enabled: Partitioning
00439. 00000 -  "feature not enabled: %s"
*Cause:    The specified feature is not enabled.
*Action:   Do not attempt to use this feature.

Any idea how I can enable it?
what about s_contract?  is trim(contract_num ) a unique key for that table?
>>Any idea how I can enable it?

Let's hold off on partitioning until we determine if simple indexes will solve your problem.

Simple answer: It needs to be installed with the server software then selected as an option when you build the database.  I'm sure it can be 'activated' after database creation but I would have to look it up in the docs.
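For reference, on 11.2 the options are relinked into the server binary with the chopt tool (shut down the instances first; licensing still applies regardless):

```
cd $ORACLE_HOME/bin
./chopt enable partitioning
```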

Also remember, it is not free.  If you haven't paid to use it, you should not...
sam2929 (ASKER)

yes
CREATE UNIQUE INDEX "FINSTAGEADM"."S_CONTRACT_PK" ON "FINSTAGEADM"."S_CONTRACT" ("CONTRACT_NUM")
try this...

create unique index  "FINSTAGEADM"."S_CONTRACT_UK_FTRIM" ON "FINSTAGEADM"."S_CONTRACT" (TRIM("CONTRACT_NUM"))

and rewrite the query like this...
I know, I'm ignoring your large tables and addressing the small one, but if you're hitting that small one many times for results on the large ones this could help a lot.



SELECT TRIM(a.company_tcd),
       TRIM(a.doc_num),
       a.doc_year,
       a.line_item,
       TRIM(a.deb_cred_ind),
       a.amount,
       TRIM(a.re_code),
       TRIM(a.text),
       TRIM(a.payment_terms),
       TRIM(a.gl_acct_tcd),
       TRIM(a.valuation_type),
       TRIM(a.assignment),
       a.quantity,
       TRIM(a.unit_of_msr_tcd),
       TRIM(a.tax_jurist),
       TRIM(a.busn_area_tcd),
       a.clrng_date,
       a.clrng_ent_date,
       a.clrng_document,
       TRIM(a.tax_cd),
       TRIM(a.purch_doc_tcd),
       TRIM(a.pdoc_itm_num),
       TRIM(a.billng_doc),
       TRIM(a.valuatn_area),
       TRIM(a.paymnt_ref),
       TRIM(a.ref_key_1),
       TRIM(a.ref_key_2),
       TRIM(a.ref_key_3),
       TRIM(a.sales_doc),
       TRIM(a.sales_doc_itm),
       TRIM(a.txn_type),
       TRIM(a.prtnr_busn_area),
       TRIM(a.prtnr_profit_cntr),
       TRIM(a.profit_cntr_tcd),
       TRIM(a.post_key),
       TRIM(a.post_cat_tcd),
       TRIM(a.post_cat_desc),
       TRIM(a.post_typ_tcd),
       TRIM(a.post_typ_desc),
       TRIM(a.actv_typ_tcd),
       TRIM(a.personnel_num),
       TRIM(a.order_tcd),
       TRIM(a.cost_cntr_tcd),
       TRIM(a.post_stat),
       TRIM(b.doc_type),
       b.post_date,
       b.post_period,
       TRIM(b.REF),
       TRIM(b.ref_key),
       TRIM(b.doc_hdr_text),
       TRIM(b.revrsd_with),
       TRIM(b.revrsd_rsn),
       TRIM(b.doc_status),
       TRIM(b.doc_status_desc),
       TRIM(b.doc_type_desc),
       TRIM(b.revrsd_rsn_desc),
       TRIM(b.doc_year),
       TRIM(a.asset_txn_type),
       a.cheque_num,
       (SELECT TRIM(c.contract_desc)
          FROM s_contract c
         WHERE TRIM(a.contract_num) = TRIM(c.contract_num)),
       TRIM(a.contract_num),
       TRIM(a.post_key_desc),
       TRIM(a.ref_org_unit),
       TRIM(a.tax_cd_desc),
       TRIM(a.asset_num_tcd),
       TRIM(a.customer_tcd),
       TRIM(a.material_tcd),
       TRIM(a.plant_tcd),
       TRIM(a.project_tcd),
       TRIM(a.vendor_tcd),
       a.network_tcd,
       a.ps_activity_tcd,
       a.wbs_elem_tcd,
       TRIM(a.building_tcd),
       TRIM(a.busn_entity_tcd),
       TRIM(a.property_tcd),
       TRIM(a.re_code),
       TRIM(a.rental_unit_tcd),
       TRIM(a.contrllng_area)
  FROM s_txn_actual a
       LEFT OUTER JOIN s_txn_hdr_act b
           ON a.doc_num = b.doc_num AND a.doc_year = b.doc_year AND a.company_tcd = b.company_tcd
Hi,

>> SQL Error: ORA-00439: feature not enabled: Partitioning
So you don't have partitioning option. Maybe you are on Standard Edition.

I don't see how indexes can help when you are getting millions of rows. But maybe the estimated cardinality is wrong, so let's check the real numbers and exactly where the bottleneck is.

Please run:
alter session set statistics_level=all;


then your query
then:
select * from table(dbms_xplan.display_cursor('','','ALLSTATS LAST'));



That will show the actual execution plan statistics.

thanks,
Franck.
sam2929 (ASKER)

The problem is here:

If I do select count(*) from S_TXN_ACTUAL it takes 2-3 minutes.
Now if I do the count below it takes forever:

select count(*) FROM s_txn_actual a
       LEFT OUTER JOIN s_txn_hdr_act b
           ON a.doc_num = b.doc_num AND a.doc_year = b.doc_year AND a.company_tcd = b.company_tcd
sam2929 (ASKER)

The query you sent me is taking even longer than the original one.
did you create the index to go with it?
sam2929 (ASKER)

yes
ASKER CERTIFIED SOLUTION
Javier Morales (Spain)

This solution is only available to Experts Exchange members.
Hi,

To summarize, the possibilities to improve the join over millions of rows are:
- a cluster (as given by Javier), but difficult for such big tables where you don't know in advance the size in front of each key.
- partitions. OK, it's an option to buy, but you can't manage such big tables for free.
- maybe an IOT, with the unused columns in the overflow segment. In that case Oracle can do a merge join without having to sort.
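A rough shape for the IOT idea (the column list is abbreviated and hypothetical; INCLUDING marks where the overflow segment begins):

```sql
CREATE TABLE s_txn_actual_iot (
  client_num   VARCHAR2(4),
  company_tcd  VARCHAR2(4),
  doc_num      VARCHAR2(10),
  doc_year     NUMBER(4),
  line_item    NUMBER(3),
  amount       NUMBER,
  CONSTRAINT s_txn_actual_iot_pk
    PRIMARY KEY (client_num, company_tcd, doc_num, doc_year, line_item)
)
ORGANIZATION INDEX
INCLUDING line_item
OVERFLOW;
```

Columns after LINE_ITEM (here, AMOUNT) are stored in the overflow segment, keeping the index rows compact and physically ordered on the key.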

well... any way to have rows physically grouped on the partition key.

Did you try a large hash_area_size with workarea_size_policy=manual? The hash join may be OK if you don't have to do several passes to the tempfiles. Parallel query may help as well.

Now, what is the requirement behind this? What do you do with that huge amount of data?

Regards,
Franck.