[Last Call] Learn about multicloud storage options and how to improve your company's cloud strategy. Register Now



Posted on 2004-04-10
Medium Priority
Last Modified: 2007-12-19

I have an overall view about db2 index and its usage. I read this article http://www.dbazine.com/mullins_db2indexing.shtml about db2 index and usage, where i have few doubts.

i am using db2 v7.2 on aix/win2000 with fickpack 11

1)How to decide which index to use (btree or cluster index or any) under what cirsumstances. By looking at a query can we decide or any other information is required to decide this. some example's of query and how to decide this."few points are also mentioned in this site".

2)For all columns in GROUP BY and ORDER BY can i create a cluster index or i need to decide this based on some factors. if ,so what factors should i look and recommend for cluster index.

3)what does "If indexing a table, explicitly create a clustering index" mean ?

please also post some web reference.


Question by:banumaran
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
LVL 18

Assisted Solution

BigSchmuh earned 220 total points
ID: 10804812
1/ Some definitions may help
BTree index is an efficient way to store a tree. This means you can browse it from lowest to highest and back.
Other kind of index generally does not allow to browse but are faster to retrieve one single value. Hash index are good examples.
A Clustered index is an index which is although used to sort the data pages.
Data pages should be kept very dense to avoid fragmentation.
Indexes based on non selective columns are never used because query optimizer detects it will retrieve more than 12-15% of the overall data pages and start with a full scan (which is faster in this case because it is faster to read straight forward a 1Go file than reading 15% of it by small chuncks).

2/ What and when to use indexes
a) Depend on your query
Looking at a specific query, you will find that joining, criteria, grouping or sorting columns are good candidates.
==> The problem is you must "guess" how many rows are to be retrieved at every step of your query to create the best indexes...
Some rules are:
-Criteria columns are to be considered only if they are restrictive enough (<15% of the overall data rows)
-You may look at joining columns as criteria column for Inner join
-Create an index using only the most restrictive columns
-Create a cluster index on the columns you use most of the time to sort your data
-Your Grouping columns should although be ordered from the most restrictive to the least one
-A "criteria" index may be sufixed with Grouping columns and prefixed by Sorting columns
==> Always keep in mind an index is unusefull if not restrictive enough

b) Depend on how many users and updates
To avoid contention, you may use a different clustered index if too many users modified rows on the same data pages than the sorting one.
To allow for faster modifications, you may keep only the necessary indexes.

Hope this helps.

Accepted Solution

LordSilk earned 80 total points
ID: 10807531
to make it complete (btw. nice and clear explanation BigSchmuh) :

3/explicitly create a clustering index
If you don't define one of your indexes as clustering (the index by which your data is ordered on disk), DB2 will decide the clustering index for you. It will take the oldest (first created) index as clustering index.
Problem with this is, when you drop your first index and recreate it, it is no longer the oldest. DB2 will then change the clustering index to (again) the oldest index, which can have a big effect on performance.
To avoid this, you can create 1 index as clustering index so you are always in control of which index is determining the sorting order of your data on disk.



Author Comment

ID: 10807708

Thank you BigSchmuh & silk for your valuable guidence

Featured Post

NEW Veeam Agent for Microsoft Windows

Backup and recover physical and cloud-based servers and workstations, as well as endpoint devices that belong to remote users. Avoid downtime and data loss quickly and easily for Windows-based physical or public cloud-based workloads!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Recursive SQL in UDB/LUW (you can use 'recursive' and 'SQL' in the same sentence) A growing number of database queries lend themselves to recursive solutions.  It's not always easy to spot when recursion is called for, especially for people una…
Recursive SQL in UDB/LUW (it really isn't that hard to do) Recursive SQL is most often used to convert columns to rows or rows to columns.  A previous article described the process of converting rows to columns.  This article will build off of th…
In this video, Percona Solution Engineer Dimitri Vanoverbeke discusses why you want to use at least three nodes in a database cluster. To discuss how Percona Consulting can help with your design and architecture needs for your database and infras…
Want to learn how to record your desktop screen without having to use an outside camera. Click on this video and learn how to use the cool google extension called "Screencastify"! Step 1: Open a new google tab Step 2: Go to the left hand upper corn…

650 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question