Solved

# Concept of indexing

Posted on 2009-04-16
1,226 Views
Hi Experts,
I want to know  the practical usage of doing indexing on a table and how it improves the excuation time of a query.
We were having a table which was taking a considerable amount of time to load, so my colleague put an index over a column called EVENT DATE and he also truncated this EVENT DATE.
I just want to know the basics of indexing, i could have googled it, but I want to know the usage and working in real time scenario from you experts.
0
Question by:aman0711

LVL 12

Assisted Solution

0

LVL 73

Accepted Solution

indexing is very similar to using the A-Z tabs on an english dictionary, or, if you remember them, the card catalogs in a library

or, even better analogy,  like using the "index" in the back of a book.

indexes are a sorted, compact list of pointers to real information you're looking for.

0

LVL 10

Author Comment

Thanks Sean, I was waiting for your response :)

tell me one thing Sean, how truncating the EVENT date column, made the execution faster?
0

LVL 16

Assisted Solution

Rows of data in a table are stored in the database in blocks.  The block size and row size (and other table storage parameters) will determine how many rows can be stored in each block.  Large tables with many thousands or millions of rows therefore require many blocks to store the data.

When you perform a query on a table that doesn't have an index it is necessary for the database to read each block in turn that contains rows for that table.  For each block read it will scan the rows looking for data with the search criteria you specified in your Where clause.  This is known as a FULL TABLE SCAN as all blocks must be read to find the rows you want to retrieve.  This can be very time consuming for large tables as it requires a lot of disk IO.

When you create an index on a column, the index contains a data value and a rowid for each non-null value for that column in the table.  The rowid is a special pointer that identifies the block in the table in which which the row containing the data resides.  The index is therefore a bunch of values with pointers to the main table.  The structure of the index is also important as it is constructed in such a way as to improve the searching of values in the index.

Now, when your query runs and it is looking for a specific value, instead of scanning ALL the blocks in the table, it searches the blocks in the index and when it finds a match it uses the rowid to navigate directly to the required block in the table that contains the required row.

This works well providing you aren't searching for a large proportion of the data in the table as there comes a point were it is more efficient to simply scan the table blocks rather than accessing the table blocks via the index.  Statistics help the Oracle optimiser determine the best access path for a given query as they record information such as the number of rows in the table and the number of distinct values in the index amongst many other things.
0

LVL 73

Expert Comment

can you post the table, index and sql used?  I'm not sure what you're asking with your description

0

LVL 47

Assisted Solution

I will write about B-Tree indexes, based on my PowerPoint Presentation at the Unversity and numerous courses in the industry.

The DB has no idea how the records are placed in the hard disk structures. It can only go through all records sequentially and to try to find the searched record.

To avoid the unpredictable, chaotic and time and resource consuming execution of the statement the Data Bases uses components called Optimizer.

The Cost based Optimizer is able to choose the best execution plan after making and evaluating of hundreds different plans in regard the same statement.

The possibility to speed up the DB operation basically depends on:

Number of hard Disks and placement of the objects on them
Availability of relevant Statistics about the objects  number of
rows, tree structure of the indexes, etc
Availability of supplemental objects  like indexes, partitioned
tables

All this is helping to speed up the data processing!

Binary Search Tree

A Binary Search Tree is a tree each node of which satisfies the conditions:
all child nodes on the left subtree of the node have smaller
key values than the key of the node or the subtree doesnt
exist (leave node)
all child nodes on the right subtree of the node have bigger
key values than the key of the node or the subtree doesnt
exist (leave node)

Why DB use Search Trees?

The DB use search trees to speed up the search of particular records in the Data Base.

Instead to read all records of a table the DB looks in the Search Tree and decides which rows of to table to read from the disk.

The Binary Search Tree is a theoretical notion and can be used only if the nodes are placed in the RAM.

For rows placed on Hard Disks another solution is used  B-Trees. The B-Trees are based on Binary Search Trees but are enhanced and more complex in order to fit to the specific of the Hard Disk functionality and operations.

B-tree.doc
0

LVL 40

Expert Comment

>>We were having a table which was taking a considerable amount of time to load

What do you mean by "load" ? Loading data into a table will not be improved with an index. I assume maybe you meant "load" in terms of a query loading results from the database.
0

LVL 47

Expert Comment

Indexes speed up only SELECT statements and significantly slow down INSERT, DELETE and UPDATE statements.
Often to speed up these statements the indexes get droped, primary and unique key constraints get disabled. After the heavy load all indexes and constraints get rebuild and enabled.
0

LVL 10

Author Comment

@ mrjoltcola,
Yes I mean query loading result. the guy who put ont he indexes told me that because one of the column of the table was EVENT_DATE, he has to truncate the timestamp value from it , to make it faster.
Does this concept really works?
0

LVL 40

Expert Comment

Truncating a column does not make it faster, in itself, but you may have misunderstood his explanation and/or full reasons.

The index should be the same as provided in your SQL query. If not, the index is useless. Does your SQL also include the TRUNC() function on the same criteria as the index? Are you querying based on a specific date? If so, he has probably created a function based index using the TRUNC() function.

Truncating a date field is not required for an index, but may be required to make your SQL perform well by making the index usable for your particular query.

0

LVL 73

Assisted Solution

there could be some side affects to truncating a date.

If the index is compressed (which is not the same thing as "zipping" or other file compression) and you have dates like

2009-04-16 03:12:01
2009-04-16 04:11:32
2009-04-16 04:29:18
2009-04-16 09:24:10
2009-04-16 13:56:22

it's hard for the index to compress those values.

but, if you truncate them then they all become 2009-04-16  which can compress them.

the value itself isn't important, but because they are now all the same then compression can put just one value on the block instead of 5 (in my example, but it could be many more)
this makes the index have fewer blocks to read and fewer leaf nodes, so it becomes faster to scan through.

0

LVL 47

Assisted Solution

Index is like a CAT.
It is a hard task to force an index to be used.
First at all it should be analyzed, the table(s) -also and
the statistics should be delivered.
Second - it will be used if only 2-4% of the rows will be selected
for the particular query. If more rows will be selected, the index is "useless"
for the particular query and Full Table Scann will be choosen.
0

LVL 73

Expert Comment

small but important amendment to above post

"only 2-4% of the rows"

the number of rows is not really important, it's the number of blocks.

of course, if you read very few blocks you'll be reading few rows and if you read many blocks you'll read many rows.
0

LVL 10

Author Closing Comment

Thank you folks :-)
0

LVL 73

Expert Comment

0

## Featured Post

### Suggested Solutions

.csv files 5 62
Salary Amount Format 13 51
Oracle Pivot 2 20
Linking dictionary and columns together 2 36
This post first appeared at Oracleinaction  (http://oracleinaction.com/undo-and-redo-in-oracle/)by Anju Garg (Myself). I  will demonstrate that undo for DML’s is stored both in undo tablespace and online redo logs. Then, we will analyze the reaso…
I annotated my article on ransomware somewhat extensively, but I keep adding new references and wanted to put a link to the reference library.  Despite all the reference tools I have on hand, it was not easy to find a way to do this easily. I finall…
Via a live example show how to connect to RMAN, make basic configuration settings changes and then take a backup of a demo database
This video shows syntax for various backup options while discussing how the different basic backup types work.  It explains how to take full backups, incremental level 0 backups, incremental level 1 backups in both differential and cumulative mode a…