DB2 Performance - SQL select taking 1 hr 10minutes to get a single valu(row). Can I get some helP?

Hi all,

I have a table with 460+ million plus rows. This table is read "once" a day to retrieve a single maxrownum used for downstream processing. The problem I am having is that this select SQL is taking about 1 hr 10-20 minutes to retrieve this one value! How can that be? I have only three indexes one of which is keyed off of the maxrownum.

Can anyone suggest/recommend a performance tip/strategy that I can use to improve my retrieval process? This may not be the best plan, but I was thinking perhaps I could create a tiny whinny table to store the maxrownum value and read from the small table for theis value. But, then I have to introduce another table, additional I/O, extra storage and maintenance, etc, etc in to the equation...So, what does anyone think about that?

Please do send in your comments, suggestion and recommendations. Articles and other pointers are very much welcomed.
EnyinnayaAsked:
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

x
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Kent OlsenDBACommented:
Hi Enyinnaya,

Several questions.  :)

Is there a primary key?  
Is there an identity column?  Is there an index on the identity column?
Are the statistics on the table current?
What column(s) are indexed?


Kent
Dave FordSoftware Developer / Database AdministratorCommented:
Greetings, Enyinnaya

Let me see if I understand. You have a table with over 460 million rows, and you’re doing a SELECT max(someColumn) on that table, and it takes over an hour for the result to come back, right?

If you have an index on that table over the specified column, then the performance you’re seeing doesn’t make sense.

Perhaps the query doesn’t see the index. I know on DB2 for i, the sort-sequence of the index has to match the sort-sequence of the SQL session. Is there a similar concept in DB2 for Unix?

-- DaveSlash
Kent OlsenDBACommented:
Hi Dave,

There's no equivalent on LUW.

My guess is that there's an identity column that's not indexed.  Most people don't realize that an identity column in DB2 isn't automatically indexed.  (Nor is there a constraint against duplicate values unless there is a unique index on the column, but that's another discussion.)


Kent
Become a CompTIA Certified Healthcare IT Tech

This course will help prep you to earn the CompTIA Healthcare IT Technician certification showing that you have the knowledge and skills needed to succeed in installing, managing, and troubleshooting IT systems in medical and clinical settings.

EnyinnayaAuthor Commented:
Hello kdo, all

Some info about my table definition:

"ROW_NUM" DECIMAL(15,0) NOT NULL GENERATED BY DEFAULT AS IDENTITY (
                    START WITH +1
                    INCREMENT BY +1
                    MINVALUE +1
                    MAXVALUE +999999999999999
                    NO CYCLE
                    CACHE 20
                    NO ORDER ) ,

Here is what I have for indices:

CREATE INDEX "SIEBEL  "."RTL_HIST_IDX" ON "SIEBEL  "."RTL_TRAN_HIST"
                ("ROW_NUM" ASC,
                 "TR_REC_TYPE" ASC,
                 "TR_TR" ASC,
                 "TR_FIELD_NO" ASC,
                 "TR_KEY" ASC)
                COMPRESS NO DISALLOW REVERSE SCANS;


CREATE INDEX "SIEBEL  "."RTL_HIST_J1" ON "SIEBEL  "."RTL_TRAN_HIST"
                ("ROW_NUM" ASC)
                COMPRESS NO DISALLOW REVERSE SCANS;


CREATE INDEX "SIEBEL  "."RTL_TRAN_WK" ON "SIEBEL  "."RTL_TRAN_HIST"
                ("TR_KEY" ASC)
                COMPRESS NO DISALLOW REVERSE SCANS;

ALTER TABLE "SIEBEL  "."RTL_TRAN_HIST" ALTER COLUMN "ROW_NUM" RESTART WITH 461899799;

Observe none is unique or primary key.
Dave FordSoftware Developer / Database AdministratorCommented:
Just for kicks, try adding a primary key on the identity column and see if it makes your query faster.

e.g.
alter table MyTable
add constraint MyTable0
primary key ("ROW_NUM")

Open in new window

EnyinnayaAuthor Commented:
...and here is the insert, select statement that references the big table with row_num as identity.

insert into maxrownum (trgt_tbl, run_dt, start_row) select 'BIG_TABLE', current date, coalesce(max(row_num),0) from BIG_TABLE"
AngocADBACommented:
Hi,

There are several options you can try to improve your performance:
You can change your isolation level, probably you are using a high level, and there are many locks in the table, making the process slower: Select .. From ... WITH ur
Put a lock at table level, instead of putting locks in several rows.
Create a Materialized Query Table to compute that value asynchronically.
Check the disks where the table is stored, more spindles is better to retrieve the data.
If you have to read the whole table make you are retrieving the values at a good rate. You should check the Extent size, the prefetch size (according to the strip), and even the page size.
Make sure the agent that is attending your application is not performing a lot of IO directly, and all rows are retrieved asynchronically with IO servers (Prefetchers)
The buffer pool associated with that table should have enough space to reduce page cleaning synchronically.
Probably, your operation needs sort, make sure the sortheap is enough, or if this sort becomes overflow, make sure the temporary table space is over fast disks.
Check the access plan for bottlenecks, which is the process with more timerons? Also, try to chanche to optimization level, to see if other strategies are better, or you could force te optimizer to use a specific strategy at a certain step by creating an optimizer profile.
For a good analysis (for the optimizer) keep statistics updated
Does your calculation could be realized with the values of the index? if it does, make sure the access plan is doing an index sargable, instead of going to the table.
Is the data reorged?

These are some ideas to improve your query.
Kent OlsenDBACommented:
Hi Enyinnaya,

This just doesn't look right.  Can you post an explain plan of this query?


SELECT MAX(ROW_NUM) FROM "SIEBEL"."RTL_TRAN_HIST"

Open in new window



Thanks,
Kent
Tomas Helgi JohannssonCommented:
Hi!

For a table as big as this 460+ million rows and having indexes where row_num is in ASC order then issuing MAX(row_num) is not a surprise that the time is more than 1 hour when DB2 reads the index from top to bottom.
If you create the index where row_num is in DESC order then you would (should) see dramatic decrease in the query time.

Regards,
    Tomas Helgi
Tomas Helgi JohannssonCommented:
Either change the row_num from ASC to DESC
or
allow reverse scan on the index.

Regards,
    Tomas Helgi
Kent OlsenDBACommented:
Hi Tomas,

I think that he's got more issues than just this.  Selecting the highest value in an indexed column should be the (relatively) simple matter of the DBMS following the "right son" pointer of each leaf node.  If the indexes are anywhere near balanced, the number of pages examined will be around log(f(n)).  Since the query is taking so long, either a full table scan is being used or database activity has resulted in the index being horrible out of balance.  I'm hoping that the explain plan shows index usage.  If so, recreating or reorganizing the index should then solve the performance issue, and improve virtually every query that uses the column.

But you're right.  Sorting descending will get the highest value from a single I/O (the root index page).  As long as that's compatible with his other usage of the table.  :)


Kent
MurpheyApplication ConsultantCommented:
Hi Enyinnaya,

A Select within an Insert is working a little diferent then an "standalone" Select, don't aske me why, I saw that several times. Normaly SQL is taking the optimal index, within an insert, this isn't allways the case.

You can try these options:
- Remove the coalesce() part, I don't think you need it with 460+ mln rows
- Use an Order by to force SQL to look to that index first.
- Try to use the "FETCH FIRST 1 ROW ONLY"
- Create a View on that record. accessing the table will use combination of indexes and views to get the optimal result.

(btw, did you also try to run the Select without the insert part? what was the result?)

Regards,
Murph

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
Tomas Helgi JohannssonCommented:
Hi!

Doing a query without a where clause or FETCH FIRST statement
ALWAYS results in a TABLESCAN no matter what the size of the table is.

Ergo large table results in long query time.

Regards,
     Tomas Helgi
kemot1000Commented:
Hi,

To save storage (and gain performance in reorg and runstats) loose this index as it's surplus:

CREATE INDEX "SIEBEL  "."RTL_HIST_J1" ON "SIEBEL  "."RTL_TRAN_HIST"
                ("ROW_NUM" ASC)
                COMPRESS NO DISALLOW REVERSE SCANS;

every query will use this index if you remove above as ROW_NUM is the first index column.

CREATE INDEX "SIEBEL  "."RTL_HIST_IDX" ON "SIEBEL  "."RTL_TRAN_HIST"
                ("ROW_NUM" ASC,
                 "TR_REC_TYPE" ASC,
                 "TR_TR" ASC,
                 "TR_FIELD_NO" ASC,
                 "TR_KEY" ASC)
                COMPRESS NO DISALLOW REVERSE SCANS;

And for sure recreate your indexes to allow reverse scan.


T.
kemot1000Commented:
I would change your insert to:

insert into maxrownum (trgt_tbl, run_dt, start_row) select 'BIG_TABLE', current date, row_num from BIG_TABLE order by row_num desc fetch first row only
EnyinnayaAuthor Commented:
Excellent responses all
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
DB2

From novice to tech pro — start learning today.