Solved

Optimizing IN-ists and OR queries in various Oracle releases.

Posted on 2001-08-31
15
972 Views
Last Modified: 2008-02-01
Oracle introduced IN-list iteration against an index and OR list iteration against an index some time ago.  It showed up in execution plans without any documentation.

Oracle 8.1.7 documentation states that IN-lists will respect an /*+ INDEX(table column) */ hint.  It also states that  OR clauses may be resolved using two separate indexes on the same table and an implicit UNION.

With what release did these features first become operational?
I have a situation where I need the following statement to execute efficiently

SELECT something FROM mytable
WHERE columna IN (1,2,3) OR columnb = 'X';

I would like the optimizer to iterate on the index on columna then combine the results with a set derived from searching the index on column b.

Will the following assure the right strategy in 8.1.7?

SELECT /*+ index (mytable idx_columna idx_columnb */ something FROM mytable
WHERE columna IN (1,2,3) OR columnb = 'X';

Will this or any other syntax work in 8.1.6 or any earlier release?

0
Comment
Question by:tiborhorvath
  • 7
  • 5
  • 3
15 Comments
 
LVL 3

Expert Comment

by:Wadhwa
ID: 6446931
Hi,

The optimizer chooses a Full Table Scan over an index scan of multiple
OR'd values (or an extended IN list), if the total cost of the multiple
OR predicates is greater than the Full Table Scan cost.  This behavior
is acceptable unless large numbers of records are involved which cause
the cost of a Full Table Scan to become restricted.

The use of the index hint (/*+ INDEX(table1 ind1) */) does not help in
this case as the index hint is applied to each of the OR predicates rather
than to the query as a whole.  When the optimizer adds together the cost
of all the OR predicates, it finds that the total cost is greater than
the cost of a Full Table Scan and chooses this instead.

You must use the following hint rather than index hint in case of IN.

      SELECT /*+ USE_CONCAT */ * FROM table1 WHERE col1 IN (1,2,3,4);

Query Plan
------------------------------------------------------------------------
1.4 SELECT STATEMENT   Cost = 4
  2.1 CONCATENATION
    3.1 INDEX RANGE SCAN IND1
    3.2 INDEX RANGE SCAN IND1
    3.3 INDEX RANGE SCAN IND1
    3.4 INDEX RANGE SCAN IND1

The USE_CONCAT hint is available in all verison of oracle 8.x

Hope this helps

Sam..
0
 
LVL 2

Expert Comment

by:noriegaa
ID: 6447135
With what release did these features first become operational?
Will this or any other syntax work in 8.1.6 or any earlier release?

I have worked with similar quereis with version 8.0.5 through versions 8.1.7.

SELECT /*+ index (mytable idx_columna idx_columnb */ something FROM mytable
WHERE columna IN (1,2,3) OR columnb = 'X';

I tested a similar query with scott's demo on employee and there was always a full access to the table even when I specified the indexes each time.  I also tried use_contact with similar results.

explain plan for  
 select --+index(employee ndx_job_id )
        --+index(employee ndx_dept_id)
 *
 from employee
 where job_id IN (667,668,669) OR department_id = 10.

I had created indexes on job_id and department_id.  The demo is created with the bdemobld.sql script in 8.1.7.  Like I said the result is similar for the use_concat hint.

However, if I use the union operator, the indexes are used as hinted.

SQL> explain plan set statement_id = 'TST1' INTO PLAN_TABLE for
  2     select --+index(employee ndx_job_id )
  3      *
  4    from employee
  5   where job_id IN (667,668,669)
  6  UNION
  7   select --+index(employee ndx_dept_id)
  8   *
  9   from employee
 10  where department_id = 10
 11  /

Explained.

  1  SELECT operation, options, object_name
  2  from PLAN_TABLE
  3  WHERE STATEMENT_ID = 'TST1'
  4* ORDER BY ID
SQL> /

OPERATION          OPTIONS         OBJECT_NAME
------------------ --------------- ----------------
SELECT STATEMENT
SORT               UNIQUE
UNION-ALL
INLIST ITERATOR
TABLE ACCESS       BY INDEX ROWID  EMPLOYEE
INDEX              RANGE SCAN      NDX_JOB_ID
TABLE ACCESS       BY INDEX ROWID  EMPLOYEE
INDEX              RANGE SCAN      NDX_DEPT_ID

8 rows selected.

However, there is a need to take a look at the cost vs. rule analysis more in detail with more realistic production size tables, and their implications beyond the analysis.  Could you try using tkrprof and SQL Trace.
0
 
LVL 3

Expert Comment

by:Wadhwa
ID: 6447204
here is the simple test

TEST 1:-
--------
SQL> select /*+ INDEX(emp EMPIDX) */ empno,deptno from emp where deptno in ('30','20','10');

14 rows selected.


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=582 Card=10 Bytes=50
          )

   1    0   INLIST ITERATOR
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'EMP' (Cost=582 Card=10
           Bytes=50)

   3    2       INDEX (RANGE SCAN) OF 'EMPIDX' (NON-UNIQUE) (Cost=19 C
          ard=10)





Statistics
----------------------------------------------------------
          8  recursive calls
          0  db block gets
         18  consistent gets
          0  physical reads
          0  redo size
        955  bytes sent via SQL*Net to client
        425  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
         14  rows processed

TEST 2
-------

SQL> select /*+ USE_CONCAT */ empno,deptno from emp where deptno in ('30','20','10');

14 rows selected.


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=3 Card=15 Bytes=75)
   1    0   CONCATENATION
   2    1     TABLE ACCESS (FULL) OF 'EMP' (Cost=1 Card=5 Bytes=25)
   3    1     TABLE ACCESS (FULL) OF 'EMP' (Cost=1 Card=5 Bytes=25)
   4    1     TABLE ACCESS (FULL) OF 'EMP' (Cost=1 Card=5 Bytes=25)




Statistics
----------------------------------------------------------
          0  recursive calls
         12  db block gets
          4  consistent gets
          0  physical reads
          0  redo size
        955  bytes sent via SQL*Net to client
        425  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
         10  sorts (memory)
          0  sorts (disk)
         14  rows processed

SQL>

Here i did two test , test1 and test2  ,The table emp is analyzed with compute statistics

Comapre the cost of TEST1 and TEST2

Cost in TEST2 is lower then COST in test 1 , It indicates test2 is the better result ,as the rules says lower the cost the better is the plan.

Now you will have a question test2 are having a full table scan then why is it better. Since the table  has a very less number of rows and hence full table scan is faster then index scan.

The optimizer chooses a Full Table Scan over an index scan of multiple
OR'd values (or an extended IN list), if the total cost of the multiple
OR predicates is greater than the Full Table Scan cost.  This behavior
is acceptable unless large numbers of records are involved which cause
the cost of a Full Table Scan to become restricted.

AS i already said that the use of the index hint (/*+ INDEX(table1 ind1) */) does not help in
this case as the index hint is applied to each of the OR predicates rather
than to the query as a whole.  When the optimizer adds together the cost
of all the OR predicates, it finds that the total cost is greater than
the cost of a Full Table Scan and chooses this instead.


Hope this will clear now
Sam


 
0
 
LVL 3

Expert Comment

by:Wadhwa
ID: 6447216
Hi noriegaa,

 You should not post your COMMENT as an ANSWER untill you are 100% sure you are right.  If you put your comments as an proposed answer , the question is locked and people can not put a comments untill your answer is rejected. This will delay the response of a correct and desirable answer. If your comments is ok  then questioniare can accept it as an answer and rank you accordingly.

Sorry to pointing you ,

Sam
0
 
LVL 2

Expert Comment

by:noriegaa
ID: 6448456
Thanks for your comment on the answer rules, Sam (Wadhwa).  I think that my answer is right because it covers the specific case proposed, namely, an index hint utilized over a column evaluated against a list and an index hint utilized over an OR statement in another column. We agree that the optimizer chooses the full scan if both statements appear in the same SQL statement.  We also agree on the fact that tuning a SQL statement is not a precise science, but you have not considered the case specifically proposed in the question, i.e., both a list and and OR, both being evaluated via an index.  When the two clauses are separated, then indexes are used, if these queries can be run in parallel, then performance could be best optimized.

Anthony
0
 
LVL 3

Expert Comment

by:Wadhwa
ID: 6448497
Hi Noriegaa, Tiborhorvath


I am not going out of line of the question. It not only use of index with OR but it is also a question of performance.

To summarize what i said before ,let us again look through the query
SELECT /*+ index (mytable idx_columna idx_columnb */ something FROM mytable
WHERE columna IN (1,2,3) OR columnb = 'X';

In the query as above , in any version of oracle and in with any amount of data reterieved the index hint will not be used at all. and explain plain will always shows you FULL TABLE scan.

INDEX WILL USE IN TWO CASES WITH THE ABOVE QUERY
 (1) if you use a rule hint and you have a index on columnA and columnB then INDEX WILL USE.
SELECT /*+ RULE */ something FROM mytable
WHERE columna IN (1,2,3) OR columnb = 'X';
 
   In this case it does not matter wheather you are reteriving 10 rows or 10000 rows  from a total say 12000.

(2)If you put a USE_CONCAT hint. In this case it matters the selectivity of the columns in the table. and hence it goes for INDEX scan in case of OR when the selectivity is good means you are reteriving less numbers of rows compare to total number , otherwise it will go for FULL Table SCAN  and hence optimizer tooks the right decision with the  use of this hint.

Hope it will helps

Goodluck
Sam


 



0
 
LVL 2

Expert Comment

by:noriegaa
ID: 6451872
In the solution that I proposed, the index is always used, and no full table scan is required.  The two queries run in parallel and the real cost will be ultimately driven by UNION and SORT UNIQUE operations.  On the other hand, the LIST and OR comparisons cause a lengthy full table scan -with multiple or comparisons-, regardless of the hint used, which can be avoided with the solution proposed.
0
PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

 

Author Comment

by:tiborhorvath
ID: 6454245
This is not helpful if Wadhwa's Test 1 is real.

I am talking about two tables with a million rows each, looking for half a dozen hits.

My question about Oracle Version # is critical.  We are running 8.1.6 and must have the behavior fron Wadhwa's answer.  If that does not work in 8.1.6, we must upgrade to 8.1.7.
0
 
LVL 3

Expert Comment

by:Wadhwa
ID: 6454283
I had tested what i told  in oracle version 8.1.6.

 
0
 
LVL 2

Expert Comment

by:noriegaa
ID: 6455095
I tested Sam's solution (use_concat) under Oracle 8.1.7 and got full table access, but got a good index rowid access under Oracle8 Version 8.0.5 a few minutes ago.  I am willing to double check if I neglected any issues.  In my opinion that there is a threshold in the table cardinality for which one query will better than the other and viceversa. With Oracle8 version 8.0.5 use_concat is as effective as my proposed solution and probably better for certain thresholds. A cost analysis for a true cardinality is then essential, in the production Oracle version.
0
 

Author Comment

by:tiborhorvath
ID: 6455174
You can't use scott/tiger for tuning index usage.  Oracle has a small table rule that overrides hints, I believe the threshold is something like 6 blocks before indexes will be considered.

With large tables, 8.1.6 does use indexes on both legs of the query.
The simplest way to skew nested loops as the choice is to set the init parameter
OPTIMIZER_INDEX_CACHING=100

This is global to the database and only works well if you have no use for hash joins.
I have a strict OLTP, so that is my solution in lieu of hints.

Thanks for your trouble, hope you got some value from this.
0
 
LVL 3

Expert Comment

by:Wadhwa
ID: 6455264
Hi tiborhorvath ,
You can also OPTIMIZER_INDEX_COST_ADJ to lower value say 10 or 15 . The defualt value is 50. By making  less value index is less expensive to CBO. Hence it would not favour for full table scan.
You can also set these parameters dynamically

SQL> alter session set optimizer_index_cost_adj=10;

Session altered.

Hope this helps
Sam



0
 
LVL 2

Accepted Solution

by:
noriegaa earned 200 total points
ID: 6455453
I succesfully tested the options with use_contact in release 3 (version 8.1.7), where somehow I had initially obtained a full table scan.  I was able to obtain desirables index rowid access via both use_concat.  Because AUTOTRACE ON does not imply a high cost difference between the two solutions, and clearly associates a high percent of the cost SORT UNIQUE operation, it is possible that the UNION operator perform betters when one set is not intersectable with the other, i.e., when UNION and UNION ALL operations will produces identical results -ideal case- or the number of duplicated rows is rather very small. I use the following parameter values related to these operations:

hash_area_size = 1048576
optimizer_index_caching = 50
optimizer_index_cost_adj = 25

and

sort_area_size = 262144
sort_area_retained_size = 65536

related to the UNION operator sort.
0
 
LVL 3

Expert Comment

by:Wadhwa
ID: 6461885
Hi tiborhorvath,

Do you want to close the question and rank it accordingly

Thanks
Sam
0
 

Author Comment

by:tiborhorvath
ID: 6462683
My final conclusion is that the optimizer will optimize the statement by comparing estimated costs on

select /*+ FULL*/ ... the whole query as originally composed

against

select /*+ CHOOSE */ something from (
select something from mytable where columna in(1,2,3)
and columnb <> 'X'
union all
select something from mytable where columnb ='X';

0

Featured Post

PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

Join & Write a Comment

How to Create User-Defined Aggregates in Oracle Before we begin creating these things, what are user-defined aggregates?  They are a feature introduced in Oracle 9i that allows a developer to create his or her own functions like "SUM", "AVG", and…
From implementing a password expiration date, to datatype conversions and file export options, these are some useful settings I've found in Jasper Server.
This video shows setup options and the basic steps and syntax for duplicating (cloning) a database from one instance to another. Examples are given for duplicating to the same machine and to different machines
This video explains what a user managed backup is and shows how to take one, providing a couple of simple example scripts.

705 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

19 Experts available now in Live!

Get 1:1 Help Now