Link to home
Start Free TrialLog in
Avatar of maria_rossi
maria_rossi

asked on

SYBASE - A very different duration for 'select count(*)' for 2 tables that have similar rowcount

Hi,

I have  2 tables with similar number of rows, but 1 took 5 mins only and the other took about 1.5 hours.
The tables about 7M rows.
The first table,  took 5 mins and it has non-clustered index.   The second table, took 1.5 hours, and it has a clustered index.

Showplan for the first table:  
 STEP 1
        The type of query is SELECT.
        Evaluate Ungrouped COUNT AGGREGATE.

        FROM TABLE
            q2cprpsl
        Nested iteration.
        Index : ixordprpsl2
        Forward scan.
        Positioning at index start.
        Index contains all needed columns. Base table will not be read.
        Using I/O Size 16 Kbytes for index leaf pages.
        With MRU Buffer Replacement Strategy for index leaf pages.

Showplan for the second table:
 STEP 1
        The type of query is SELECT.
        Evaluate Ungrouped COUNT AGGREGATE.

        FROM TABLE
            q2cquote
        Nested iteration.
        Table Scan.
        Forward scan.
        Positioning at start of table.
        Using I/O Size 16 Kbytes for data pages.
        With MRU Buffer Replacement Strategy for data pages.

Why the big difference and what can be done to improve the query on the second table.
Thanks.
Avatar of maria_rossi
maria_rossi

ASKER

Also, why did the 2nd table (the one with the clustered index) do a table scan?  Thanks.
ASKER CERTIFIED SOLUTION
Avatar of Joe Woodhouse
Joe Woodhouse

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Joe,

Thanks for the response.   Now I understand why the big difference.  I will look into adding a nonclustered index.
1 more question, though.   I did an optdiag on the 2nd table and found that the data page cluster ratio is 0.37.  Did this have an effect on  the runtime?.   IE, even if it did a table scan, would it have run faster if the DPCR was higher?

Maria
With your second question - maybe.

The DPCR is a measure of how "unclustered" the clustered index is. This can be either fragmentation in terms of rows not filling pages fully, or (for datapages or datarows locked tables) how many rows aren't "where they should be" according to the clustered index.

With a COUNT(*) without any WHERE clause though, we have to look at every row in the table anyway. It won't matter if rows are "out of order" (only possible for data-only locked - or DOL - tables) since we have to visit them all anyway.

It *could* make some difference if rows are not packing the pages fully, since it will affect how many pages we have to read to read those rows. And hmm, 0.37 is a pretty low DPCR, so yes, in best case you might see 2-3 times faster if the table was perfectly defragmented. I'm getting that estimate by dividing a perfect DPCR of 1.0 by your DPCR of 0.37... but you won't see 1.0. 0.9 isn't unreasonable to aim for though.

That DPCR suggests some table maintenance is in order. If this is a DOL table then it's time for a "reorg rebuild". If not, then dropping & recreating the clustered index will perfectly clean it up. Warning - this needs an exclusive table lock and free space equal to ~120% of the table size.
You've answered my questions.  Thanks a lot for  the clear explanation.  This can be closed.
Glad it was helpful! Good luck.