Solved

Query Tuning: CBO chooses a Full table Scan over the available indexes and doesn't recommend a new index to be built

Posted on 2003-11-24
15
503 Views
Last Modified: 2008-03-10
I have a query that performs an inner join on 2 large files and has a condition or two in the where clause.  When I run this query the optimizer checks the applicable indexes and choose to perform a Full Table Scan over using any of these indexes.  For some reason or another it doesn't suggest an index to be built to avoid the table scan.

Any ideas on how I can get the optimizer to use an index?  Or even how I can get it to suggest a index to be built?
0
Comment
Question by:Jim_McLean
  • 3
  • 3
  • 3
  • +1
15 Comments
 
LVL 2

Expert Comment

by:askanivg
ID: 9812085
Most cases we have a composite index and in the where clause we have a part of the index. This could be one reason.  Also check the order in which indexes are defined and try to have them in the same way in the query.

Second reason which could be happening in your case is that you are using distinct clause which makes the table to have a table scan.

Can you please send the table structure along with indexes and also the query.. Any ways you can try having indexes on the columns present in the where clause and check for some improvement.

HtH
0
 
LVL 2

Expert Comment

by:askanivg
ID: 9812104
you can use the index advisor to suggest the indexes
If you are using UDB then use the
db2advis command.

$db2advis -d <database_name> -i <file_name.sql> -t 1

Above 1 stands for time.

HtH
0
 

Author Comment

by:Jim_McLean
ID: 9812407
Bingo: "Second reason which could be happening in your case is that you are using distinct clause which makes the table to have a table scan."

Is there no way around this?  I have muchos data (several millon rows) in these files and the table scan on my DISTINCT query is a real killer.
0
 
LVL 50

Expert Comment

by:Lowfatspread
ID: 9812669
post the query...

with a brief explanation as to what your trying to achieve..
0
 
LVL 2

Expert Comment

by:askanivg
ID: 9812705
In the join try having the table having less number of records as the first table and then the table with more number of records as the second table. Try having a left outer join or an inner join and see if  you are getting the desired results.

Or think of any other table which is giving the same output of column1 but you do not need to have a distinct clause. Might be you can use that table as dummy but that would definitely avoid using the distinct on millions of records. Check these possibilities..

HtH
0
 
LVL 18

Expert Comment

by:BigSchmuh
ID: 9816351
Having a Table FullScan is not always bad if your criteria are not restrictive enough.
Generally, db optimizers starts a table FullScan instead of an IndexScan when they estimates they will retrieve more than 15-20% of the table because it is faster to read 1 Go straight than to read 150 Mo 4K by 4K pages.

Other facts:
-Without up to date Statistics (RUNSTATS...), DB2 optimizer is totally blind
-Build your index so that the columns are ordered from most restrictive criteria to the least one
-Keep your indexes small by removing your non restrictive columns (Flags excepted)
-You may add columns from your ORDER BY clause in your index, this is very important if you sort your data (GROUP BY, DISTINCT, ORDER BY)
-Order columns are to be added before the first indexed columns that is not used as an equality criteria (LIKE, OR, BETWEEN,<, >...) otherwise DB2 can not use your index to directly search and sort through the data
-Statistics can be modified to force the DB2 optimizer to choose a specific index
-Rarely, you may build a specific index for a query which contains every selected columns at the end of the previously defined "search and sort" index
-I never "Try" a query, I first think of what should be a good AccessPlan for it and I verify that DB2 does it well

By the way, you probably does NOT need the DISTINCT keyword but I can not investigate with no data...
==> I agree with Lowfatspread, please post the SQL query...

Hope this helps.
0
How your wiki can always stay up-to-date

Quip doubles as a “living” wiki and a project management tool that evolves with your organization. As you finish projects in Quip, the work remains, easily accessible to all team members, new and old.
- Increase transparency
- Onboard new hires faster
- Access from mobile/offline

 
LVL 50

Expert Comment

by:Lowfatspread
ID: 9816569
agree with Bigschmuh
I'm sceptical that you need the distinct until we see your actual query..
0
 

Author Comment

by:Jim_McLean
ID: 9817626
FYI,
I am running this query against a iSeries(AS/400) machine.  From my reading I see that there is now way (or need) to update optimizer stats because the OS automatically manages that.

Here is the query:

SELECT DISTINCT b3hdr.bhacct, b3hdr.bhinam, b3sub.bsvnum, b3sub.bsvnm1, b3sub.bsvnm2, b3sub.bscity, b3sub.bsstat, b3sub.bszip
FROM raff01.b3hdr b3hdr INNER JOIN raff01.b3sub b3sub ON bhref=bsref
WHERE bhacct=? AND UPPER(bsvnm1) LIKE '%?%'
ORDER BY bsvnm1;

The b3hdr table is a header table that has sub header records in the b3sub table.  So obviously I have a 1 to many in b3hdr to b3sub.  Now what I am trying to do is get vendor information from the b3sub table, but the problem is that vendors can be repeated over sub headers.  For instance for a single header I could have 10 subheaders.  In these subheaders I could have only 2 unique vendors.  I'm not quite sure how I could do this without a distinct, but I guess thats why I in here asking questions. :)

Thanks for all those tips BigSchmuh!  They'll come in handy for sure.
0
 
LVL 18

Accepted Solution

by:
BigSchmuh earned 250 total points
ID: 9819185
It looks like your are searching for specific Vendors only and use an Uppercase LIKE match.
==> Optimizers will always estimate this kind of criteria not to be very restrictive

Some more questions:
-Is "b3hdr.bhacct" column a not-too-bad criteria (means less than 10% of the b3hdr rows)
-What is your release of DB2 (DB2-UDB v8 may let you create summary table for this kind of query)

Assuming bhacct is a fair criteria and already has a simple index defined [b3hdr(bhacct) without any more columns], I think DB2 can be helped with an index on b3sub(bsref, bsvnm1).

Anyway, you need a DISTINCT (or GROUP BY) clause so DB2 needs to sort your data and that is now where the performance problem is...
==> We can investigate this as well but we need many info about :
-the Temporary tablespaces created and their associated bufferpools and disk sub-systems
-the SORTHEAP and SHEAPTHRES (DB and DBM) parameters
-the use of your db (Mainly transactional with some huge queries like yours or Datamining only)

Hope this helps.
0
 
LVL 18

Expert Comment

by:BigSchmuh
ID: 9819208
I forgot to mention that I doubt that the statistics are up to date but I am not very fluent in DB2/OS390 (Same name but definitely not the same product than DB2/UDB)
==> Please check that SYSIBM.SYSTABLES has its CARD and NPAGES filled in
0
 
LVL 50

Assisted Solution

by:Lowfatspread
Lowfatspread earned 250 total points
ID: 9822964
Ok it looks like you need the distinct....

BigSchmuh
I think the iSeries(AS/400)  uses another DB2 variant... and isn't quite the same as the OS/390 version either..
http://www-1.ibm.com/servers/eserver/iseries/db2/

Jim
If your DB2 automatically keeps the statistics up to date then fine, however this isn't the behaviour I'm used to (DB2/OS390)...

what type of indexes do you have ?
(binary radix or vector encoded)

please post the Table and Index DDL (structures)
any chance you could store the name on the row in upper case (as well) to allow for easier index searching?
any chance you can normalise the design to have the vendor information on a separate table and just referenced?




0
 

Author Comment

by:Jim_McLean
ID: 9851176
Sorry, I've gotten hung up on some other issue... I'll get back to this within a week or two (Hopefully)
0

Featured Post

Highfive Gives IT Their Time Back

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

Join & Write a Comment

Recursive SQL in UDB/LUW (you can use 'recursive' and 'SQL' in the same sentence) A growing number of database queries lend themselves to recursive solutions.  It's not always easy to spot when recursion is called for, especially for people una…
Recursive SQL in UDB/LUW (it really isn't that hard to do) Recursive SQL is most often used to convert columns to rows or rows to columns.  A previous article described the process of converting rows to columns.  This article will build off of th…
When you create an app prototype with Adobe XD, you can insert system screens -- sharing or Control Center, for example -- with just a few clicks. This video shows you how. You can take the full course on Experts Exchange at http://bit.ly/XDcourse.
You have products, that come in variants and want to set different prices for them? Watch this micro tutorial that describes how to configure prices for Magento super attributes. Assigning simple products to configurable: We assigned simple products…

707 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

15 Experts available now in Live!

Get 1:1 Help Now