Rouchie
asked on
Improving the efficiency of COUNT
I'm running a simple query to show forum posts. I've expanded my basic query to show a count of all the replies for each forum post:
SELECT
TOP 50 f.postDate AS datePosted,
f.ID,
f.postTitle AS title,
u.[name],
u.surname,
(SELECT COUNT(*) FROM tbl_forums WHERE postParentID = f.ID) as replies
FROM
tbl_Forums f INNER JOIN
tbl_Users u ON f.userID = u.userID
This seems rather inefficient from a resource perspective, because it reads as though the count is executed once for every row. Maybe I'm wrong, but is there a better way to construct this query?
If you have an index on the postParentID column, this should not be a big issue.
You could try:
SELECT TOP 50 f.postDate AS datePosted,
f.ID,
f.postTitle AS title,
u.[name],
u.surname, FC.[count] AS replies
FROM tbl_Forums F INNER JOIN tbl_Users u ON f.userID = u.userID
INNER JOIN (SELECT postParentID, COUNT(0) [count]
FROM tbl_forums
GROUP BY postParentID) FC ON F.ID = FC.ID
That is, compute all the counts first and then join them to the rest of the data. However, with TOP 50 this will probably be slower, because it computes the counts for every post rather than for just 50.
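For comparison, here is a hedged variant of the derived-table query. It joins on FC.postParentID, uses a LEFT JOIN so that posts with no replies still appear with a count of 0 (an INNER JOIN would drop them, unlike the original subquery), and adds an ORDER BY, since TOP without ORDER BY returns an arbitrary 50 rows. The sort column is an assumption; the question does not say which 50 posts are wanted.

```sql
-- Sketch only: table and column names come from the question;
-- the ORDER BY column is an assumption.
SELECT TOP 50
    f.postDate  AS datePosted,
    f.ID,
    f.postTitle AS title,
    u.[name],
    u.surname,
    ISNULL(fc.replies, 0) AS replies   -- posts without replies show 0
FROM tbl_Forums f
INNER JOIN tbl_Users u
    ON f.userID = u.userID
LEFT JOIN (
    -- one row per parent post, with its reply count
    SELECT postParentID, COUNT(*) AS replies
    FROM tbl_Forums
    GROUP BY postParentID
) fc
    ON f.ID = fc.postParentID
ORDER BY f.postDate DESC;
```

Whether this beats the correlated subquery depends on the data and the indexes, so it is worth comparing the execution plans of both forms.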
ASKER
Thanks Fino7, I follow what you're saying.
AngelIII,
There are no indexes at present, other than a primary key index at f.ID.
postParentID is a foreign key of this PK. Not sure if this is the way it should be done or not...
tbl_forums has no indexes or primary keys? If postParentID is the primary key on tbl_forums, then SQL Server will by default have made it the clustered index of the table. If it isn't the primary key, add a non-clustered index on just the single column postParentID.
(clustered = the order the rows in the table are physically stored in)
(BTW in the sql I posted earlier F.ID = FC.ID should of course be F.ID = FC.postParentID)
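A minimal sketch of the non-clustered index described above. The index name is hypothetical; the table and column names are the ones used in the question.

```sql
-- Index name is hypothetical; use whatever fits your naming convention.
CREATE NONCLUSTERED INDEX IX_tbl_Forums_postParentID
    ON tbl_Forums (postParentID);
```

With this in place, the per-post COUNT(*) can be answered from the index instead of scanning the whole table.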
ASKER
>> tbl_forums has no indexes or primary keys?
[ID] is the primary key in tbl_Forums
[postParentID] is the foreign key in tbl_Forums that uses the above primary key to reference replies to the parent post
>AngelIII,
>There are no indexes at present, other than a primary key index at f.ID.
>postParentID is a foreign key of this PK. Not sure if this is the way it should be done or not...
Then, as Fino7 indicates, there should be an index on postParentID.
However, I would make the primary key of that table a non-clustered index and put the clustered index on postParentID.
The reason is that a primary key is used to look up single rows, while a lookup on postParentID will usually return several rows. It is best to have those rows stored together, hence the clustered index should be on that column.
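The change described above could be scripted roughly as follows. This is a sketch under assumptions: the constraint names PK_tbl_Forums and FK_tbl_Forums_parent and the index name are hypothetical, and it assumes the primary key is currently the clustered index (the SQL Server default). Look up the real names first, for example with sp_help.

```sql
-- Sketch only: constraint and index names are hypothetical.

-- 1. The self-referencing foreign key depends on the primary key, so drop it first.
ALTER TABLE tbl_Forums DROP CONSTRAINT FK_tbl_Forums_parent;

-- 2. Drop the existing (clustered) primary key.
ALTER TABLE tbl_Forums DROP CONSTRAINT PK_tbl_Forums;

-- 3. Cluster the table on postParentID (not unique: many replies per parent).
CREATE CLUSTERED INDEX IX_tbl_Forums_postParentID
    ON tbl_Forums (postParentID);

-- 4. Recreate the primary key as a non-clustered index; it stays unique.
ALTER TABLE tbl_Forums
    ADD CONSTRAINT PK_tbl_Forums PRIMARY KEY NONCLUSTERED (ID);

-- 5. Restore the foreign key from replies to their parent post.
ALTER TABLE tbl_Forums
    ADD CONSTRAINT FK_tbl_Forums_parent
    FOREIGN KEY (postParentID) REFERENCES tbl_Forums (ID);
```

Creating the clustered index before re-adding the primary key avoids rebuilding the non-clustered index a second time. On a large table these steps rewrite the data, so they are best run in a quiet period.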
ASKER
>> however, I would make the primary key of that table a non-clustered index, and the postparentid a clustered index.
Okay thanks. I've done the following actions based on your suggestions. I would really appreciate you telling me if it's been done correctly!
1. Went into the Indexes/Keys dialog for this table and unchecked the Create UNIQUE box for the ID field
2. Pressed the NEW button, then created an index on postParentID with the following settings:
Selected Index: IX_tbl_Forums
Type: Index
Column Name: postParentID
Index Filegroup: PRIMARY
Create UNIQUE: No
Fill Factor: 0%
Create as CLUSTERED: No
Do not automatically recompute statistics: No
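One way to check the result, rather than relying on the dialog, is to list the table's indexes. Going by the suggestion quoted above, the expected outcome would be a non-clustered primary key on ID (still unique) and a clustered index on postParentID. The settings listed here show Create as CLUSTERED: No, so the output is worth comparing against that.

```sql
-- Lists each index on the table with its description (clustered or
-- nonclustered, unique, primary key) and its key columns.
EXEC sp_helpindex 'tbl_Forums';
```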