asked on

Information about CBO

I have a select query which currently has the RULE hint. It runs fast. I am trying to move to the CBO and
I have run dbms_stats on the tables in this query.
With the RULE hint, the optimiser uses indexes. Without the RULE hint ( and so presumably using the statistics) it is doing 2 full table scans.
Anyone any ideas? Or a stategy of where to go now?
I'll' try attaching the code and also the two explain plans.

select  /*+ RULE  */ 
l.nominal_region,l.nominal_county,
s.weekno,s.year,s.obs_dt,s.data_source
from sub s,location l
where s.loc_id=l.loc_id
and s.proj_id = 'GBW'
and s.weekno is not null
and l.nominal_region is not null
Plan
SELECT STATEMENT  HINT: RULE			
	5 NESTED LOOPS  		
		2 TABLE ACCESS BY INDEX ROWID BTOWATCH.SUB 	
			1 INDEX RANGE SCAN NON-UNIQUE BTOWATCH.INDX_SUB_PROJID 
		4 TABLE ACCESS BY INDEX ROWID BTOWATCH.LOCATION 	
			3 INDEX UNIQUE SCAN UNIQUE BTOWATCH.PK_LOCATION 
Without RULE hint:
Plan
SELECT STATEMENT  CHOOSECost: 12,885  Bytes: 196,571,034  Cardinality: 2,978,349  		
	3 HASH JOIN  Cost: 12,885  Bytes: 196,571,034  Cardinality: 2,978,349  	
		1 TABLE ACCESS FULL BTOWATCH.LOCATION Cost: 518  Bytes: 10,589,376  Cardinality: 378,192  
		2 TABLE ACCESS FULL BTOWATCH.SUB Cost: 9,886  Bytes: 112,586,514  Cardinality: 2,962,803

Open in new window

Milleniumaire

Are you sure the table and indexes have been analyzed or statistics gathered using DBMS_STATS and that they are up-to-date?

The key to the access is the sub.proj_id column. What proportion of the table has the value 'GBW' and other values? Assuming this is the only column in the BTOWATCH.INDX_SUB_PROJID then it maybe that the data is skewed in this column thereby preventing the CBO from producing a reasonable cost.

If this is the case you could look into creating a histogram for this column which would then provide the CBO with more detailed information about how to handle the data.

Alternatively, if the code can be modified you could specify hints that direct the CBO such as which index to user, or to use nested loops or access the tables in a specific sequence. These are only "hints" and so the CBO may still chose to ignore them!

The Oracle 9i Tuning manual will help and can be accessed at:

http://download-uk.oracle.com/docs/cd/B10501_01/server.920/a96533/toc.htm

COMPSUPP

ASKER

Hi
Thanks for reply.

The tables were analysed yesterday with the method_opt => 'FOR ALL COLUMNS SIZE SKEWONLY'
which I thought would create histograms as I knew the data was fairly skewed in the PROJ_ID column.

The s.proj_id column : 60% of the rows are GBW, the rest are split over 5 other projects.

Tuning seems to be quite a tricky bizz . Some of my other queries have just flown compared to RULE . It's just hard to get a handle on what will go well and what won't.

Thanks

Milleniumaire

How many rows do the tables contain? What are the timings compared to running with the RULE hint and without (just to give an idea of the scale of the problem)? If running as part of an online application where you want a fast response, but it isn't necessary to retrieve all rows immediately, then you might consider using the FIRST_ROWS hint. If you remove some of the column from the select statement does the query use the index? I suspect the CBO is using full table scans due to the amount of data it needs to retrieve (60% of rows). This would make sense. It may also be that the CBO identifies that "lots" of data needs to be retrieved so it decides to do a full table scan. Sometimes, adding more columns to the index (or create a new index) can resolve the problem.

I have encountered similar problems in the passed and have successfully used the INDEX, ORDERED and USE_NL hints to "duplicate" the RULE based access.

Give the following a try (if you haven't already):

select /*+ INDEX(s indx_sub_projid) USE_NL(s l) */
l.nominal_region,l.nominal_county,
s.weekno,s.year,s.obs_dt,s.data_source
from sub s,location l
where s.loc_id=l.loc_id
and s.proj_id = 'GBW'
and s.weekno is not null
and l.nominal_region is not null

Just a few ideas.

COMPSUPP

ASKER

Hi again,
LOCATION 400,000 rows
SUB 8 million rows.

Query returns 5 million rows
RULE based query takes 1 s. Cost takes 45 s

I hadn't realised that the number of columns returned would affect the join path. In fact what I sent to you is a cut down version. The actual statement wants several more columns.

But wow!
Putting your hint on, /*+ INDEX(s indx_sub_projid) USE_NL(s l) */

and the data's ack in 31 ms! The first time took 78ms , thereaftetr 31ms!
I'm impressed.

This is an on-line db but this particular query is run at night as a batch job for some static pages.

But now if I'm 'duplicating' RULE based behaviour - how come it's faster?

many thanks

ASKER CERTIFIED SOLUTION

Milleniumaire

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

COMPSUPP

ASKER

you're quite right. It was just the time to return the first pane of results. Querying for the whole dataset was much slower.
Many thanks for your helpful advice. There's an awful lot to this tuning lark!