indexing small lookup(dimension tables)

We have an etl process which looks up with a couple of small dimension tables before loading a fact table. Basically looks means left outer join. The source table will do a left outer join with dimension tables. The two dimension tables are a very small look tables and each may contain not more than 20 records.

basically the query looks like below:

select
*
from src, dim1,dim2
where src.id1 = dim1.id1(+)
and src.id2 = dim2.id2(+)


For above kind of query, is good to have indexes created on lookup keys of the dim tables..

We are looking performances issue of some etl process. I think creating indexes on small table causes more harm than improvising..

please let me know what your thoughts are..

Thanks
gs79Asked:
Who is Participating?
 
gs79Author Commented:
Thanks for the response..

That's what we all think that indexing a small table would do more harm than benefit.

But here's an interesting blog where the author  says other wise and explains in depth how we can benefit from indexing even a smaller table

http://richardfoote.wordpress.com/2009/04/16/indexes-on-small-tables-part-i-one-of-the-few/
0
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
you could create the lookup tables immediately as index organized tables.
but anyhow, the lookup tables, being so small, usually are on 1 or 2 pages of data, so a index (adding 1 more page) would indeed rather harm.

better to eventually pin the lookup table into memory pool ... but if it's used all the time, that will not change anything either.
0
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
if the index on the small table is a covering index, the db engine will actually and only use th e index data, and not the table data.
while I know that each db has to read not only the table/index data (blocks), but also the metadata to know what it is actually handling, I wonder about how that is relevant to the performance question, as this is something internal and has to be done anyhow.
also, the first page shows that there are 4 consistent gets, which are not actually related to fetching the data, but the metadata.
1) the metadata has to be fetched anyhow
2) shortly after the table being created, the table data is still in memory, so indeed no consistent get is needed there.
0
 
gs79Author Commented:
My small experiment with real time data showed that indexing the small table actually improvised the performance at-least in Oracle as shown below. I agree to some of the arguments in the series of blog by above author..

Thank you for your response.


With Index on TAB_DIM(20 records:


SQL> ed
Wrote file afiedt.buf

  1  insert into test_table
  2  Select a.id
  3  From
  4  SRC_TAB A, TAB_DIM B
  5* Where A.Id = B.Id(+)
SQL> /

212087 rows created.


Execution Plan
----------------------------------------------------------

----------------------------------------------------------------------
| Id  | Operation          | Name       | Rows  | Bytes | Cost (%CPU)|
----------------------------------------------------------------------
|   0 | INSERT STATEMENT   |            |   166K|  3408K|  1562   (1)|
|   1 |  NESTED LOOPS OUTER|            |   166K|  3408K|  1562   (1)|
|   2 |   TABLE ACCESS FULL| STUB_TRANS |   166K|  1298K|  1561   (1)|
|   3 |   INDEX RANGE SCAN | IDX_FR_ID  |     1 |    13 |     1   (0)|
----------------------------------------------------------------------

Note
-----
   - 'PLAN_TABLE' is old version


Statistics
----------------------------------------------------------
        105  recursive calls
       2335  db block gets
       6950  consistent gets
          0  physical reads
    3473784  redo size
        387  bytes sent via SQL*Net to client
        420  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
     212087  rows processed

SQL> commit;


--Without Index



SQL> drop index idx_id;

Index dropped.

SQL> ed
Wrote file afiedt.buf

 1  insert into test_table
  2  Select a.id
  3  From
  4  SRC_TAB A, TAB_DIM B
  5* Where A.Id = B.Id(+)


SQL> /

212087 rows created.


Execution Plan
----------------------------------------------------------

--------------------------------------------------------------------------------------
| Id  | Operation             | Name                    | Rows  | Bytes | Cost (%CPU)|
--------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT      |                         |   166K|  3408K|  1566   (2)|
|   1 |  HASH JOIN RIGHT OUTER|                         |   166K|  3408K|  1566   (2)|
|   2 |   TABLE ACCESS FULL   | DW_FRAUD_RESOLUTION_DIM |    30 |   390 |     3   (0)|
|   3 |   TABLE ACCESS FULL   | STUB_TRANS              |   166K|  1298K|  1561   (1)|
--------------------------------------------------------------------------------------

Note
-----
   - 'PLAN_TABLE' is old version


Statistics
----------------------------------------------------------
        407  recursive calls
       2455  db block gets
       7106  consistent gets
          0  physical reads
    3482452  redo size
        388  bytes sent via SQL*Net to client
        420  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          7  sorts (memory)
          0  sorts (disk)
     212087  rows processed
0
 
gs79Author Commented:
It can be argued that indexing a small table is actually helpful in improvising the performance which is supported by a small experiment I conducted though theoretically oracle says not to index small tables. Oracle again contradicts saying use whatever works for you!
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.