Still celebrating National IT Professionals Day with 3 months of free Premium Membership. Use Code ITDAY17

x
?
Solved

DB2 Performance - SQL select taking 1 hr 10minutes to get a single valu(row). Can I get some helP?

Posted on 2012-03-23
16
Medium Priority
?
1,352 Views
Last Modified: 2012-04-17
Hi all,

I have a table with 460+ million plus rows. This table is read "once" a day to retrieve a single maxrownum used for downstream processing. The problem I am having is that this select SQL is taking about 1 hr 10-20 minutes to retrieve this one value! How can that be? I have only three indexes one of which is keyed off of the maxrownum.

Can anyone suggest/recommend a performance tip/strategy that I can use to improve my retrieval process? This may not be the best plan, but I was thinking perhaps I could create a tiny whinny table to store the maxrownum value and read from the small table for theis value. But, then I have to introduce another table, additional I/O, extra storage and maintenance, etc, etc in to the equation...So, what does anyone think about that?

Please do send in your comments, suggestion and recommendations. Articles and other pointers are very much welcomed.
0
Comment
Question by:Enyinnaya
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 4
  • 3
  • 3
  • +4
16 Comments
 
LVL 46

Assisted Solution

by:Kent Olsen
Kent Olsen earned 800 total points
ID: 37758158
Hi Enyinnaya,

Several questions.  :)

Is there a primary key?  
Is there an identity column?  Is there an index on the identity column?
Are the statistics on the table current?
What column(s) are indexed?


Kent
0
 
LVL 18

Expert Comment

by:Dave Ford
ID: 37758351
Greetings, Enyinnaya

Let me see if I understand. You have a table with over 460 million rows, and you’re doing a SELECT max(someColumn) on that table, and it takes over an hour for the result to come back, right?

If you have an index on that table over the specified column, then the performance you’re seeing doesn’t make sense.

Perhaps the query doesn’t see the index. I know on DB2 for i, the sort-sequence of the index has to match the sort-sequence of the SQL session. Is there a similar concept in DB2 for Unix?

-- DaveSlash
0
 
LVL 46

Assisted Solution

by:Kent Olsen
Kent Olsen earned 800 total points
ID: 37758570
Hi Dave,

There's no equivalent on LUW.

My guess is that there's an identity column that's not indexed.  Most people don't realize that an identity column in DB2 isn't automatically indexed.  (Nor is there a constraint against duplicate values unless there is a unique index on the column, but that's another discussion.)


Kent
0
What does it mean to be "Always On"?

Is your cloud always on? With an Always On cloud you won't have to worry about downtime for maintenance or software application code updates, ensuring that your bottom line isn't affected.

 

Author Comment

by:Enyinnaya
ID: 37759333
Hello kdo, all

Some info about my table definition:

"ROW_NUM" DECIMAL(15,0) NOT NULL GENERATED BY DEFAULT AS IDENTITY (
                    START WITH +1
                    INCREMENT BY +1
                    MINVALUE +1
                    MAXVALUE +999999999999999
                    NO CYCLE
                    CACHE 20
                    NO ORDER ) ,

Here is what I have for indices:

CREATE INDEX "SIEBEL  "."RTL_HIST_IDX" ON "SIEBEL  "."RTL_TRAN_HIST"
                ("ROW_NUM" ASC,
                 "TR_REC_TYPE" ASC,
                 "TR_TR" ASC,
                 "TR_FIELD_NO" ASC,
                 "TR_KEY" ASC)
                COMPRESS NO DISALLOW REVERSE SCANS;


CREATE INDEX "SIEBEL  "."RTL_HIST_J1" ON "SIEBEL  "."RTL_TRAN_HIST"
                ("ROW_NUM" ASC)
                COMPRESS NO DISALLOW REVERSE SCANS;


CREATE INDEX "SIEBEL  "."RTL_TRAN_WK" ON "SIEBEL  "."RTL_TRAN_HIST"
                ("TR_KEY" ASC)
                COMPRESS NO DISALLOW REVERSE SCANS;

ALTER TABLE "SIEBEL  "."RTL_TRAN_HIST" ALTER COLUMN "ROW_NUM" RESTART WITH 461899799;

Observe none is unique or primary key.
0
 
LVL 18

Expert Comment

by:Dave Ford
ID: 37759374
Just for kicks, try adding a primary key on the identity column and see if it makes your query faster.

e.g.
alter table MyTable
add constraint MyTable0
primary key ("ROW_NUM")

Open in new window

0
 

Author Comment

by:Enyinnaya
ID: 37760070
...and here is the insert, select statement that references the big table with row_num as identity.

insert into maxrownum (trgt_tbl, run_dt, start_row) select 'BIG_TABLE', current date, coalesce(max(row_num),0) from BIG_TABLE"
0
 
LVL 1

Assisted Solution

by:AngocA
AngocA earned 400 total points
ID: 37760882
Hi,

There are several options you can try to improve your performance:
You can change your isolation level, probably you are using a high level, and there are many locks in the table, making the process slower: Select .. From ... WITH ur
Put a lock at table level, instead of putting locks in several rows.
Create a Materialized Query Table to compute that value asynchronically.
Check the disks where the table is stored, more spindles is better to retrieve the data.
If you have to read the whole table make you are retrieving the values at a good rate. You should check the Extent size, the prefetch size (according to the strip), and even the page size.
Make sure the agent that is attending your application is not performing a lot of IO directly, and all rows are retrieved asynchronically with IO servers (Prefetchers)
The buffer pool associated with that table should have enough space to reduce page cleaning synchronically.
Probably, your operation needs sort, make sure the sortheap is enough, or if this sort becomes overflow, make sure the temporary table space is over fast disks.
Check the access plan for bottlenecks, which is the process with more timerons? Also, try to chanche to optimization level, to see if other strategies are better, or you could force te optimizer to use a specific strategy at a certain step by creating an optimizer profile.
For a good analysis (for the optimizer) keep statistics updated
Does your calculation could be realized with the values of the index? if it does, make sure the access plan is doing an index sargable, instead of going to the table.
Is the data reorged?

These are some ideas to improve your query.
0
 
LVL 46

Expert Comment

by:Kent Olsen
ID: 37761228
Hi Enyinnaya,

This just doesn't look right.  Can you post an explain plan of this query?


SELECT MAX(ROW_NUM) FROM "SIEBEL"."RTL_TRAN_HIST"

Open in new window



Thanks,
Kent
0
 
LVL 25

Assisted Solution

by:Tomas Helgi Johannsson
Tomas Helgi Johannsson earned 400 total points
ID: 37771840
Hi!

For a table as big as this 460+ million rows and having indexes where row_num is in ASC order then issuing MAX(row_num) is not a surprise that the time is more than 1 hour when DB2 reads the index from top to bottom.
If you create the index where row_num is in DESC order then you would (should) see dramatic decrease in the query time.

Regards,
    Tomas Helgi
0
 
LVL 25

Expert Comment

by:Tomas Helgi Johannsson
ID: 37771857
Either change the row_num from ASC to DESC
or
allow reverse scan on the index.

Regards,
    Tomas Helgi
0
 
LVL 46

Expert Comment

by:Kent Olsen
ID: 37771937
Hi Tomas,

I think that he's got more issues than just this.  Selecting the highest value in an indexed column should be the (relatively) simple matter of the DBMS following the "right son" pointer of each leaf node.  If the indexes are anywhere near balanced, the number of pages examined will be around log(f(n)).  Since the query is taking so long, either a full table scan is being used or database activity has resulted in the index being horrible out of balance.  I'm hoping that the explain plan shows index usage.  If so, recreating or reorganizing the index should then solve the performance issue, and improve virtually every query that uses the column.

But you're right.  Sorting descending will get the highest value from a single I/O (the root index page).  As long as that's compatible with his other usage of the table.  :)


Kent
0
 
LVL 16

Accepted Solution

by:
theo kouwenhoven earned 400 total points
ID: 37774224
Hi Enyinnaya,

A Select within an Insert is working a little diferent then an "standalone" Select, don't aske me why, I saw that several times. Normaly SQL is taking the optimal index, within an insert, this isn't allways the case.

You can try these options:
- Remove the coalesce() part, I don't think you need it with 460+ mln rows
- Use an Order by to force SQL to look to that index first.
- Try to use the "FETCH FIRST 1 ROW ONLY"
- Create a View on that record. accessing the table will use combination of indexes and views to get the optimal result.

(btw, did you also try to run the Select without the insert part? what was the result?)

Regards,
Murph
0
 
LVL 25

Expert Comment

by:Tomas Helgi Johannsson
ID: 37776017
Hi!

Doing a query without a where clause or FETCH FIRST statement
ALWAYS results in a TABLESCAN no matter what the size of the table is.

Ergo large table results in long query time.

Regards,
     Tomas Helgi
0
 
LVL 4

Expert Comment

by:kemot1000
ID: 37834900
Hi,

To save storage (and gain performance in reorg and runstats) loose this index as it's surplus:

CREATE INDEX "SIEBEL  "."RTL_HIST_J1" ON "SIEBEL  "."RTL_TRAN_HIST"
                ("ROW_NUM" ASC)
                COMPRESS NO DISALLOW REVERSE SCANS;

every query will use this index if you remove above as ROW_NUM is the first index column.

CREATE INDEX "SIEBEL  "."RTL_HIST_IDX" ON "SIEBEL  "."RTL_TRAN_HIST"
                ("ROW_NUM" ASC,
                 "TR_REC_TYPE" ASC,
                 "TR_TR" ASC,
                 "TR_FIELD_NO" ASC,
                 "TR_KEY" ASC)
                COMPRESS NO DISALLOW REVERSE SCANS;

And for sure recreate your indexes to allow reverse scan.


T.
0
 
LVL 4

Expert Comment

by:kemot1000
ID: 37834919
I would change your insert to:

insert into maxrownum (trgt_tbl, run_dt, start_row) select 'BIG_TABLE', current date, row_num from BIG_TABLE order by row_num desc fetch first row only
0
 

Author Closing Comment

by:Enyinnaya
ID: 37859001
Excellent responses all
0

Featured Post

Free Tool: IP Lookup

Get more info about an IP address or domain name, such as organization, abuse contacts and geolocation.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Hello fellow BSD lovers, I've created a patch process for patching openjdk6 for BSD (FreeBSD specifically), although I tried to keep all BSD versions in mind when creating my patch. Welcome to OpenJDK6 on BSD First let me start with a little …
Attention: This article will no longer be maintained. If you have any questions, please feel free to mail me. jgh@FreeBSD.org Please see http://www.freebsd.org/doc/en_US.ISO8859-1/articles/freebsd-update-server/ for the updated article. It is avail…
Learn several ways to interact with files and get file information from the bash shell. ls lists the contents of a directory: Using the -a flag displays hidden files: Using the -l flag formats the output in a long list: The file command gives us mor…
In a previous video, we went over how to export a DynamoDB table into Amazon S3.  In this video, we show how to load the export from S3 into a DynamoDB table.
Suggested Courses

722 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question