?
Solved

COUNT(1) Performance

Posted on 2009-04-28
14
Medium Priority
?
1,858 Views
Last Modified: 2012-06-21
1) We have a huge table with a 4 column composite PK and 2 more columns.
2) SELECT COUNT(1) FROM MyTable; is fast in iseries AS/400 V5R3 20 year old OS and DB i.e. 5 seconds
3) But it is extremely slow in Linux 10.2 DB2 UDB 9.1 both of which are modern i.e. 100 seconds.
4) Data is this table is about 80,00,000 rows. Can you explain me various reasons for this strange behaviour?
0
Comment
Question by:k_murli_krishna
  • 5
  • 4
  • 2
  • +3
14 Comments
 
LVL 1

Assisted Solution

by:MrPhenix
MrPhenix earned 200 total points
ID: 24254209
This could have several reasons:

DB2 is more advanced but not necessary better in speed on huge tables.

Is the new db an exact copy? (meaning, indexes, PK, constraints, is the same)
Is there any statistics generated on the old vs the new server?
0
 
LVL 46

Accepted Solution

by:
Kent Olsen earned 880 total points
ID: 24254253
Hi k_murl_krishna,

DB2 will actually count the items, either via an appropriate index or by doing a full table scan.  Since the table has a primary key, DB2 should do an index scan and count the number of items based on the number of items in the index.

As far as I know, the statistics are used only to allow DB2 to optimize your query.  DB2 does not rely on the statistics values as row counts.


Kent
0
 
LVL 17

Author Comment

by:k_murli_krishna
ID: 24257171
Both database schemas are exact copy of each other. Do REORG and RUNSTATS not done quickly enough create the 5 and 100 secs difference? We found that there is no exclusive lock which got released later giving rise to so much wait time. Could it be more queries are already fired on that table and there is I/O contention now that COUNT(1) is also fired. Can hardware configuration differences create so much difference?

Can there be any other special reasons for COUNT(1) on SUSE Linux 10.2 DB2 9.1 to be muich slower than iseries AS/400 DB2 V5R3? Could OS or its state/status have an impact? Could the linkage between a particular OS and a particular DB2 make the difference? Please think from all possible angles.
0
Get your Disaster Recovery as a Service basics

Disaster Recovery as a Service is one go-to solution that revolutionizes DR planning. Implementing DRaaS could be an efficient process, easily accessible to non-DR experts. Learn about monitoring, testing, executing failovers and failbacks to ensure a "healthy" DR environment.

 
LVL 1

Expert Comment

by:MrPhenix
ID: 24257715
Do you still have the ISeries? Could you in that case do a explain plan on both databases?
Or if you just have the new one, could you please post the explain plan?
0
 
LVL 46

Assisted Solution

by:Kent Olsen
Kent Olsen earned 880 total points
ID: 24259544
Hi k_murli_kris,

Depending on your linux server, you may have a huge difference in I/O performance.  IDE and SATA drives usually can't match the performance of SCSI or the native AS/400 hardware.

That said...

Why are you doing a count(1)?  There is a subtle difference between count(*) and count(1) in that count(*) counts the number of rows, count(1) should count the number of non-nulls.  I've not seen any IBM documentation that describes the difference, but Oracle warns against using count(1) as it forces a table scan.  Oracle and DB2 are different entities so the warning may not apply to DB2.  Still, it's worth checking out.

On UDB/LUW count(*) should use the index(es) and return a count quite quickly.  count(1) has a slightly different evaluation process, but should come up with the same number of rows.

Try using a count(*) instead of a count(1).


Good Luck,
Kent
0
 
LVL 37

Assisted Solution

by:momi_sabag
momi_sabag earned 220 total points
ID: 24262331
if you table only has 80,000 rows , and it performs an index scan, it shouldn't take 100 seconds, no matter how lame your hardware is
you have some other problem there,
are you sure it's not a lock problem?
did you try to add "with ur" at the end of the query?
0
 
LVL 46

Expert Comment

by:Kent Olsen
ID: 24262506
Hi momi,

I was confused by his row count.  I took it to be 80,000,000 -- a rather large table.

You're right though.  Unless he has large, poorly blocked rows, 80,000 should count quite quickly.


Kent
0
 
LVL 1

Assisted Solution

by:cmaslen
cmaslen earned 300 total points
ID: 24268556
In addition to the above advice particularly in getting the query plans, have a look at the following link on how DB2 handles parallelism and how you can see if you're taking advantage of it. Despite being 20 years old there is a good chance the AS/400 is optimally configured for DB2 and your Linux box is not.

http://publib.boulder.ibm.com/infocenter/db2luw/v9/index.jsp?topic=/com.ibm.db2.udb.admin.doc/doc/c0005287.htm

Christian.
0
 
LVL 17

Author Comment

by:k_murli_krishna
ID: 24273297
Thanks all of you. I have no admin access to either DB2 on Linux or AS/400. But in local database schema table with as many records, I did db2expln on following 3 queries and I got same cost and identical explain plans with index scan occuring over composite PK.

SEELCT COUNT(1) FROM <schema>.<table>;
SEELCT COUNT(*) FROM <schema>.<table>;
SELECT COUNT(first_col_of_pk) FROM <schema>.<table>;
0
 
LVL 17

Author Comment

by:k_murli_krishna
ID: 24273353
Also, the problem is resolved. When I did COUNT(1) it took 45 secs and after REORG on this local table it just took 3 secs. This means the table is fragmented or the index since index scan is what is happening?

Also, does REORG TABLE <Schema>.<Table>; reorganize the indexes of the table as well? Or we need to specify like for runstats, RUNSTATS ON TABLE <Schema>.<Table> WITH DISTRIBUTION AND DETAILED INDEXES ALL; How much & how does runstats influence queries in general and in particular this simple COUNT(1) one?
0
 
LVL 46

Assisted Solution

by:Kent Olsen
Kent Olsen earned 880 total points
ID: 24273386
Hi k_murli_kris,

In this case, a full index scan of the primary key may be almost as costly as a full table scan.  The table has only 6 columns, 4 of which are the primary key.  Depending on the makeup of the key and the two data columns, there may be as many index pages as there are data pages.  If so, the index scan is just as resource demanding as a table scan.

Can you verify the row count?  Is it 80,000 or 80,000,000?


Kent
0
 
LVL 46

Assisted Solution

by:Kent Olsen
Kent Olsen earned 880 total points
ID: 24273436
Wow.

You really did have a fragmented table.  I had thought that the LUW version was built from the AS/400 version so fragmentation wouldn't have been an issue.  (Obviously, I was wrong....)

The REORG process reorganizes the table and indexes.  One of the options is to reorganize just the indexes or a subset of them.

RUNSTATS collects statistics about the table and stores it in a "header" area.  DB2 later uses these statistics to determine how to best process a query.  The collected statistics could cause DB2 to prefer index1 over index2, or even to perform a full table scan if it determined that the use of an index is redundant.


Kent
0
 
LVL 27

Assisted Solution

by:tliotta
tliotta earned 400 total points
ID: 24275210
Minor note -- If V5R3 is the operating system release level on the AS/400, then "20 years old" is close to a couple decades off the mark. For V5R3, going back even five years puts it somewhere close to IBM development work.

Now, if the age refers to when the tables, views and indexes were generated, then even those have been somewhat updated automatically by upgrades as the system was brought forward to V5R3. Further, if those are indeed 20 years old, then there is no possibility of the two databases being identical. A 20-year old set of files on an AS/400, even one at V5R3, was not a SQL database.

It's rare to find even 5-year old SQL databases because the native (non-SQL) data management was easier for most developers and could perform much better in most circumstances. Nor did SQL have a number of capabilities that native data management could provide. (It still doesn't, but other considerations started to outweigh them.)

In any case, be aware that benchmark comparisons between the two might be far from what could be achieved. Either or both platforms can almost certainly do better than what's shown.

Tom
0
 
LVL 17

Author Closing Comment

by:k_murli_krishna
ID: 31575627
Thanks all.
0

Featured Post

Hire Technology Freelancers with Gigs

Work with freelancers specializing in everything from database administration to programming, who have proven themselves as experts in their field. Hire the best, collaborate easily, pay securely, and get projects done right.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

November 2009 Recently, a question came up in the DB2 forum regarding the date format in DB2 UDB for AS/400.  Apparently in UDB LUW (Linux/Unix/Windows), the date format is a system-wide setting, and is not controlled at the session level.  I'm n…
Recursive SQL in UDB/LUW (it really isn't that hard to do) Recursive SQL is most often used to convert columns to rows or rows to columns.  A previous article described the process of converting rows to columns.  This article will build off of th…
This Micro Tutorial will teach you how to add a cinematic look to any film or video out there. There are very few simple steps that you will follow to do so. This will be demonstrated using Adobe Premiere Pro CS6.
Whether it be Exchange Server Crash Issues, Dirty Shutdown Errors or Failed to mount error, Stellar Phoenix Mailbox Exchange Recovery has always got your back. With the help of its easy to understand user interface and 3 simple steps recovery proced…
Suggested Courses

840 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question