?
Solved

Concept of indexing

Posted on 2009-04-16
15
Medium Priority
?
1,237 Views
Last Modified: 2013-12-18
Hi Experts,
                    I want to know  the practical usage of doing indexing on a table and how it improves the excuation time of a query.
                    We were having a table which was taking a considerable amount of time to load, so my colleague put an index over a column called EVENT DATE and he also truncated this EVENT DATE.
                   I just want to know the basics of indexing, i could have googled it, but I want to know the usage and working in real time scenario from you experts.
                  Thanks in advance.
0
Comment
Question by:aman0711
  • 5
  • 3
  • 3
  • +3
15 Comments
 
LVL 12

Assisted Solution

by:Gibu George
Gibu George earned 200 total points
ID: 24159597
0
 
LVL 74

Accepted Solution

by:
sdstuber earned 800 total points
ID: 24159630
indexing is very similar to using the A-Z tabs on an english dictionary, or, if you remember them, the card catalogs in a library

or, even better analogy,  like using the "index" in the back of a book.

indexes are a sorted, compact list of pointers to real information you're looking for.

0
 
LVL 10

Author Comment

by:aman0711
ID: 24159674
Thanks Sean, I was waiting for your response :)

 tell me one thing Sean, how truncating the EVENT date column, made the execution faster?
0
Visualize your virtual and backup environments

Create well-organized and polished visualizations of your virtual and backup environments when planning VMware vSphere, Microsoft Hyper-V or Veeam deployments. It helps you to gain better visibility and valuable business insights.

 
LVL 16

Assisted Solution

by:Milleniumaire
Milleniumaire earned 400 total points
ID: 24159733
Rows of data in a table are stored in the database in blocks.  The block size and row size (and other table storage parameters) will determine how many rows can be stored in each block.  Large tables with many thousands or millions of rows therefore require many blocks to store the data.

When you perform a query on a table that doesn't have an index it is necessary for the database to read each block in turn that contains rows for that table.  For each block read it will scan the rows looking for data with the search criteria you specified in your Where clause.  This is known as a FULL TABLE SCAN as all blocks must be read to find the rows you want to retrieve.  This can be very time consuming for large tables as it requires a lot of disk IO.

When you create an index on a column, the index contains a data value and a rowid for each non-null value for that column in the table.  The rowid is a special pointer that identifies the block in the table in which which the row containing the data resides.  The index is therefore a bunch of values with pointers to the main table.  The structure of the index is also important as it is constructed in such a way as to improve the searching of values in the index.

Now, when your query runs and it is looking for a specific value, instead of scanning ALL the blocks in the table, it searches the blocks in the index and when it finds a match it uses the rowid to navigate directly to the required block in the table that contains the required row.

This works well providing you aren't searching for a large proportion of the data in the table as there comes a point were it is more efficient to simply scan the table blocks rather than accessing the table blocks via the index.  Statistics help the Oracle optimiser determine the best access path for a given query as they record information such as the number of rows in the table and the number of distinct values in the index amongst many other things.
0
 
LVL 74

Expert Comment

by:sdstuber
ID: 24159767
can you post the table, index and sql used?  I'm not sure what you're asking with your description

0
 
LVL 48

Assisted Solution

by:schwertner
schwertner earned 600 total points
ID: 24160141
I will write about B-Tree indexes, based on my PowerPoint Presentation at the Unversity and numerous courses in the industry.

The DB has no idea how the records are placed in the hard disk structures. It can only go through all records sequentially and to try to find the searched record.
Every records read operation accessed the hard disk and causes hard disk read operation, memory loading and delay.

To avoid the unpredictable, chaotic and time and resource consuming execution of the statement the Data Bases uses components called Optimizer.

The Cost based Optimizer is able to choose the best execution plan after making and evaluating of hundreds different plans in regard the same statement.

The possibility to speed up the DB operation basically depends on:

 Number of hard Disks and placement of the objects on them
 Availability of relevant Statistics about the objects  number of
  rows, tree structure of the indexes, etc
 Availability of supplemental objects  like indexes, partitioned
  tables

All this is helping to speed up the data processing!

Binary Search Tree

A Binary Search Tree is a tree each node of which satisfies the conditions:
 all child nodes on the left subtree of the node have smaller
  key values than the key of the node or the subtree doesnt
  exist (leave node)
 all child nodes on the right subtree of the node have bigger
  key values than the key of the node or the subtree doesnt
  exist (leave node)

Why DB use Search Trees?

The DB use search trees to speed up the search of particular records in the Data Base.

Instead to read all records of a table the DB looks in the Search Tree and decides which rows of to table to read from the disk.

The Binary Search Tree is a theoretical notion and can be used only if the nodes are placed in the RAM.

For rows placed on Hard Disks another solution is used  B-Trees. The B-Trees are based on Binary Search Trees but are enhanced and more complex in order to fit to the specific of the Hard Disk functionality and operations.



B-tree.doc
0
 
LVL 40

Expert Comment

by:mrjoltcola
ID: 24160588
>>We were having a table which was taking a considerable amount of time to load

What do you mean by "load" ? Loading data into a table will not be improved with an index. I assume maybe you meant "load" in terms of a query loading results from the database.
0
 
LVL 48

Expert Comment

by:schwertner
ID: 24161169
Indexes speed up only SELECT statements and significantly slow down INSERT, DELETE and UPDATE statements.
Often to speed up these statements the indexes get droped, primary and unique key constraints get disabled. After the heavy load all indexes and constraints get rebuild and enabled.
0
 
LVL 10

Author Comment

by:aman0711
ID: 24161223
@ mrjoltcola,
                           Yes I mean query loading result. the guy who put ont he indexes told me that because one of the column of the table was EVENT_DATE, he has to truncate the timestamp value from it , to make it faster.
                          Does this concept really works?
0
 
LVL 40

Expert Comment

by:mrjoltcola
ID: 24161418
Truncating a column does not make it faster, in itself, but you may have misunderstood his explanation and/or full reasons.

The index should be the same as provided in your SQL query. If not, the index is useless. Does your SQL also include the TRUNC() function on the same criteria as the index? Are you querying based on a specific date? If so, he has probably created a function based index using the TRUNC() function.

Truncating a date field is not required for an index, but may be required to make your SQL perform well by making the index usable for your particular query.

0
 
LVL 74

Assisted Solution

by:sdstuber
sdstuber earned 800 total points
ID: 24161519
there could be some side affects to truncating a date.


If the index is compressed (which is not the same thing as "zipping" or other file compression) and you have dates like

2009-04-16 03:12:01
2009-04-16 04:11:32
2009-04-16 04:29:18
2009-04-16 09:24:10
2009-04-16 13:56:22

it's hard for the index to compress those values.

but, if you truncate them then they all become 2009-04-16  which can compress them.

the value itself isn't important, but because they are now all the same then compression can put just one value on the block instead of 5 (in my example, but it could be many more)
this makes the index have fewer blocks to read and fewer leaf nodes, so it becomes faster to scan through.




0
 
LVL 48

Assisted Solution

by:schwertner
schwertner earned 600 total points
ID: 24165335
Index is like a CAT.
It is a hard task to force an index to be used.
First at all it should be analyzed, the table(s) -also and
the statistics should be delivered.
Second - it will be used if only 2-4% of the rows will be selected
for the particular query. If more rows will be selected, the index is "useless"
for the particular query and Full Table Scann will be choosen.
0
 
LVL 74

Expert Comment

by:sdstuber
ID: 24177090
small but important amendment to above post

"only 2-4% of the rows"

the number of rows is not really important, it's the number of blocks.

of course, if you read very few blocks you'll be reading few rows and if you read many blocks you'll read many rows.
0
 
LVL 10

Author Closing Comment

by:aman0711
ID: 31571073
Thank you folks :-)
0
 
LVL 74

Expert Comment

by:sdstuber
ID: 24245998
glad we could help
0

Featured Post

Concerto Cloud for Software Providers & ISVs

Can Concerto Cloud Services help you focus on evolving your application offerings, while delivering the best cloud experience to your customers? From DevOps to revenue models and customer support, the answer is yes!

Learn how Concerto can help you.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Your data is at risk. Probably more today that at any other time in history. There are simply more people with more access to the Web with bad intentions.
In today's business world, data is more important than ever for informing marketing campaigns. Accessing and using data, however, may not come naturally to some creative marketing professionals. Here are four tips for adapting to wield data for insi…
This video shows how to Export data from an Oracle database using the Datapump Export Utility.  The corresponding Datapump Import utility is also discussed and demonstrated.
This is a high-level webinar that covers the history of enterprise open source database use. It addresses both the advantages companies see in using open source database technologies, as well as the fears and reservations they might have. In this…
Suggested Courses

850 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question