Solved

Improving the efficiency of COUNT

Posted on 2006-11-16
Last Modified: 2012-08-14
I'm running a simple query to show forum posts.  I've expanded my basic query to show a count of all the replies for each forum post:

SELECT
      TOP 50 f.postDate AS datePosted,
      f.ID,
      f.postTitle AS title,
      u.[name],
      u.surname,
      (SELECT COUNT(*) FROM tbl_forums WHERE postParentID = f.ID) as replies
FROM
      tbl_Forums f INNER JOIN
      tbl_Users u ON f.userID = u.userID

This seems rather inefficient from a resource perspective because it reads as though the count is run once for every row returned.  Maybe I'm wrong, but is there a better way to construct this query?
Question by:Rouchie
10 Comments
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 17956703
If you have an index on the postParentID column, this should not be a big issue.
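
A minimal sketch of such an index, assuming the table and column names from the question; the index name IX_tbl_Forums_postParentID is made up for illustration:

```sql
-- Hypothetical index name; table and column come from the question.
-- With this in place, the per-row COUNT(*) subquery can seek on
-- postParentID instead of scanning tbl_Forums for every post.
CREATE NONCLUSTERED INDEX IX_tbl_Forums_postParentID
    ON tbl_Forums (postParentID);
```

Comparing the execution plan before and after should show whether the subquery changes from a scan to an index seek.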
 
LVL 3

Expert Comment

by:Fino7
ID: 17957082
You could try:

SELECT TOP 50 f.postDate AS datePosted,
     f.ID,
     f.postTitle AS title,
     u.[name],
     u.surname, FC.[count] AS replies
FROM tbl_Forums F INNER JOIN tbl_Users u ON f.userID = u.userID
                INNER JOIN (SELECT postParentID, COUNT(0) [count]
                        FROM tbl_forums
                        GROUP BY postParentID) FC ON F.ID = FC.ID

i.e. do all the counts first and then join to the rest of the data, but the TOP 50 probably means this will be slower because it does ALL the counts rather than just 50.
 
LVL 25

Author Comment

by:Rouchie
ID: 17957209
Thanks Fino7, I follow what you're saying.

AngelIII,
There are no indexes at present, other than a primary key index at f.ID.
postParentID is a foreign key of this PK.  Not sure if this is the way it should be done or not...
 
LVL 3

Expert Comment

by:Fino7
ID: 17957454
tbl_forums has no indexes or primary keys?  If postParentID is the primary key on tbl_forums then SQL Server will, by default, have made it the clustered index of the table.  If it isn't the primary key, then add a non-clustered index on just the single column postParentID.

(clustered = the order the rows in the table are physically stored in)

(BTW in the sql I posted earlier F.ID = FC.ID should of course be F.ID = FC.postParentID)
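
With that correction applied, the derived-table version might look like the sketch below. The LEFT JOIN and ISNULL are additions that were not in the query as posted; they are there on the assumption that posts with no replies should still be listed with a count of 0, as they are with the original subquery.

```sql
-- Sketch only: the earlier derived-table query with the join column corrected.
-- LEFT JOIN + ISNULL are assumptions added so that posts without replies
-- still appear (with 0) instead of being dropped by an INNER JOIN.
SELECT TOP 50
       f.postDate AS datePosted,
       f.ID,
       f.postTitle AS title,
       u.[name],
       u.surname,
       ISNULL(FC.[count], 0) AS replies
FROM tbl_Forums f
     INNER JOIN tbl_Users u ON f.userID = u.userID
     LEFT JOIN (SELECT postParentID, COUNT(*) AS [count]
                FROM tbl_Forums
                GROUP BY postParentID) FC ON f.ID = FC.postParentID;
```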
 
LVL 25

Author Comment

by:Rouchie
ID: 17957624
>> tbl_forums has no indexes or primary keys?

[ID] is the primary key in tbl_Forums
[postParentID] is the foreign key in tbl_Forums that uses the above primary key to reference replies to the parent post
 
LVL 3

Assisted Solution

by:Fino7
Fino7 earned 200 total points
ID: 17957813
Right, sorry, I see what you're doing now.

Then create a non-clustered index on postParentID.

As for whether keeping child and parent rows in the same table (which I think you asked about earlier) is a good design, it's hard to tell; it really depends on what you're using the table for and what sort of inserts/updates/queries are going to be run against it.  In principle the idea is good so long as the data stored for the parent is the same as that for a child, i.e. if they fill different columns then they should be in different tables.

You could also write the SQL like this:

SELECT TOP 50 f.postDate AS datePosted,
     f.ID,
     f.postTitle AS title,
     u.[name],
     u.surname,
     COUNT(*) as replies
FROM tbl_Forums F INNER JOIN tbl_forums PF ON F.ID = PF.postParentID
                      INNER JOIN tbl_Users u ON f.userID = u.userID
GROUP BY F.ID, f.postDate, f.postTitle, u.[name], u.surname
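
One thing to watch with the join version: an INNER JOIN to PF only returns posts that have at least one reply, whereas the original subquery returns 0 for posts without replies. A hedged variant that keeps those posts, assuming that is the behaviour wanted, could be:

```sql
-- Sketch: same GROUP BY approach, but LEFT JOIN keeps posts with no replies.
-- COUNT(PF.ID) counts only matched reply rows, so unmatched posts get 0
-- (COUNT(*) would report 1 for them because of the NULL-extended row).
SELECT TOP 50
       f.postDate AS datePosted,
       f.ID,
       f.postTitle AS title,
       u.[name],
       u.surname,
       COUNT(PF.ID) AS replies
FROM tbl_Forums f
     INNER JOIN tbl_Users u ON f.userID = u.userID
     LEFT JOIN tbl_Forums PF ON f.ID = PF.postParentID
GROUP BY f.ID, f.postDate, f.postTitle, u.[name], u.surname;
```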
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 17959968
>AngelIII,
>There are no indexes at present, other than a primary key index at f.ID.
>postParentID is a foreign key of this PK.  Not sure if this is the way it should be done or not...

Then, as Fino7 indicates, there should be an index on postParentID.
However, I would make the primary key of that table a non-clustered index, and postParentID the clustered index.
The reason is that a primary key is used to look up single rows, while a lookup on postParentID will usually return several rows. It is best to have those rows stored together, hence the clustered index should be on that field.
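
In T-SQL, that change could be sketched roughly as follows. The constraint and index names (PK_tbl_Forums, FK_tbl_Forums_postParentID, IX_tbl_Forums_postParentID) are assumptions, as is the existence of a declared self-referencing foreign key; the real names can be checked with sp_helpconstraint 'tbl_Forums'.

```sql
-- Sketch under assumed names; verify the real constraint names first.
-- A foreign key that references the primary key must be dropped before the PK.
ALTER TABLE tbl_Forums DROP CONSTRAINT FK_tbl_Forums_postParentID;
ALTER TABLE tbl_Forums DROP CONSTRAINT PK_tbl_Forums;

-- Cluster the table on postParentID so replies to the same post sit together.
CREATE CLUSTERED INDEX IX_tbl_Forums_postParentID
    ON tbl_Forums (postParentID);

-- Recreate the primary key as a non-clustered (still unique) index on ID.
ALTER TABLE tbl_Forums
    ADD CONSTRAINT PK_tbl_Forums PRIMARY KEY NONCLUSTERED (ID);

-- Restore the self-referencing foreign key.
ALTER TABLE tbl_Forums
    ADD CONSTRAINT FK_tbl_Forums_postParentID
    FOREIGN KEY (postParentID) REFERENCES tbl_Forums (ID);
```

Dropping and recreating a clustered index rewrites the table, so on a large table this is best done in a quiet period.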
 
LVL 4

Assisted Solution

by:satish_nagdev
satish_nagdev earned 100 total points
ID: 17962256
Hi,
In general it's advised not to use COUNT(*). I guess postParentID is not null in your case, so you can replace
(SELECT COUNT(*) FROM tbl_forums WHERE postParentID = f.ID)
with
(SELECT COUNT(postParentId) FROM tbl_forums WHERE postParentID = f.ID)

regards,
satish.
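
For reference, the two forms only differ when the counted column can be NULL: COUNT(*) counts rows, COUNT(column) counts non-NULL values. In the subquery above, the WHERE postParentID = f.ID filter already excludes NULLs, so both forms return the same number. A small illustration using a throwaway table variable (the sample rows are invented):

```sql
-- Invented sample data: one top-level post (NULL parent) and two replies to it.
DECLARE @t TABLE (ID int, postParentID int NULL);
INSERT INTO @t VALUES (1, NULL);
INSERT INTO @t VALUES (2, 1);
INSERT INTO @t VALUES (3, 1);

SELECT COUNT(*)            AS allRows,        -- 3: every row is counted
       COUNT(postParentID) AS nonNullParents  -- 2: the NULL parent is skipped
FROM @t;
```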
 
LVL 25

Author Comment

by:Rouchie
ID: 17963313
>> however, I would make the primary key of that table a non-clustered index, and the postparentid a clustered index.

Okay thanks.  I've done the following actions based on your suggestions.  I would really appreciate you telling me if it's been done correctly!

  1. Went into the Indexes/Keys dialog for this table and unchecked the Create UNIQUE box for the ID field
  2. Pressed the NEW button, then created an index on postParentID with the following settings:
         Selected Index: IX_tbl_Forums
         Type:  Index
         Column Name: postParentID
         Index Filegroup: PRIMARY
         Create UNIQUE: No
         Fill Factor: 0%
         Create as CLUSTERED: No
         Do not automatically recompute statistics: No
 
LVL 142

Accepted Solution

by:
Guy Hengel [angelIII / a3] earned 200 total points
ID: 17963401

  1. Went into the Indexes/Keys dialog for this table and unchecked the Create UNIQUE box for the ID field
    > I would say that is wrong. The UNIQUE setting should stay, but CLUSTERED (if set) should be unchecked here.

  2. Pressed the NEW button, then create an index on postParentID with the following settings:
         Selected Index: IX_tbl_Forums
         Type:  Index
         Column Name: postParentID
         Index Filegroup: PRIMARY
         Create UNIQUE: No  
         Fill Factor: 0%
         Create as CLUSTERED: No  >> this is "wrong", but it can only be set to Yes if there is no other clustered index on the table
         Do not automatically recompute statistics: No