Link to home
Start Free TrialLog in
Avatar of RickDeet
RickDeet

asked on

Reducing enormous logical reads on SQL 2000

Hey there,

I have a stored procedure that is producing high scan counts (516700 & 1336097) and logical reads(1043003 & 2674816) on a couple my 'UserGroupUserMember' and 'UserGroupSet' tables respectively. I'm new to being a DBA and can't figure out how to improve performance on the queries.

I've tried creating a clustered index as opposed to the non-clustered one in place and updating the statistics, but to no avail. Any ideas on how to improve performance on the following stored procedure?

Thanks in advance...

Rick

/////////////////////////////////////////////////////////

CREATE PROCEDURE [dbo].ReportParamProgress_UPAR(
    @Parameter       VARCHAR(10),
    @UserGroupKey    INTEGER,
    @CourseGroupKey  INTEGER,
    @UserKey         INTEGER = 0 )
AS
BEGIN

        IF ( @Parameter = 'U')
        BEGIN
                  SELECT DISTINCT
                         u.LastName + ',  ' + u.FirstName + ' (' + u.UserName + ')' "User",
                         u.UserKey "UserKey",
                         LOWER(u.LastName) "Last Name",
                         LOWER(u.FirstName) "First Name",
                         LOWER(u.UserName) "Username"
                  FROM
                 ( SELECT ugum2.MemberKey UserKey , ugcs.CourseKey CourseKey
                     FROM UserGroupSet ugs1,
                          UserGroupCourseSet ugcs,
                          UserGroupUserMember ugum1,
                          UserGroupUserMember ugum2
                    WHERE ugs1.UserGroupKey  = @UserGroupKey
                      AND ugs1.MemberKey     = ugum1.UserGroupKey
                      AND ugum1.MemberKey    = ugum2.MemberKey
                      AND ugum2.UserGroupKey = ugcs.UserGroupKey
                    UNION ALL
                   SELECT ugum.MemberKey UserKey , uc.MemberKey CourseKey
                    FROM UserGroupSet ugs1,
                         UserGroupUserMember ugum,
                         UserCourse uc
                   WHERE ugs1.UserGroupKey = @UserGroupKey
                     AND ugs1.MemberKey    = ugum.UserGroupKey
                     AND uc.UserKey        = ugum.MemberKey
                   UNION ALL
                   SELECT ugum1.MemberKey UserKey , cgcm.MemberKey CourseKey
                     FROM UserGroupSet ugs1,
                          UserGroupUserMember ugum1,
                          UserCourseGroupSet ucgs,
                          CourseGroupSet cgs,
                          CourseGroupCourseMember cgcm
                    WHERE ugs1.UserGroupKey   = @UserGroupKey
                      AND ugs1.MemberKey      = ugum1.UserGroupKey
                      AND ugum1.MemberKey     = ucgs.UserKey
                      AND ucgs.CourseGroupKey = cgs.CourseGroupKey
                      AND cgs.MemberKey       = cgcm.CourseGroupKey ) m0,
                      UserGroupSet ugs,
                      UserGroupUserMember ugum0,
                      UserProfile u,
                      ( SELECT DISTINCT cgcm.MemberKey CourseKey
                          FROM CourseGroupSet cgs,
                               CourseGroupCourseMember cgcm
                         WHERE cgs.CourseGroupKey = @CourseGroupKey
                           AND cgs.MemberKey      = cgcm.CourseGroupKey ) m1
                WHERE ugs.UserGroupKey   = @UserGroupKey
                  AND m0.UserKey         = ugum0.MemberKey
                  AND ugs.MemberKey      = ugum0.UserGroupKey
                  AND m0.CourseKey       = m1.CourseKey
                  AND ugum0.MemberKey    = u.UserKey
                ORDER BY LOWER(u.LastName),LOWER(u.FirstName),LOWER(u.UserName);
        END
        ELSE IF ( @Parameter = 'C')
        BEGIN
                  SELECT DISTINCT c.Title "Resource",
                                  c.CourseKey "CourseKey",
                                  UPPER(c.Title) "Title"
                  FROM
                    UserCourseSet ucs,
                    Course c,
                   ( SELECT DISTINCT cgcm.MemberKey CourseKey
                       FROM CourseGroupSet cgs,
                            CourseGroupCourseMember cgcm
                      WHERE cgs.CourseGroupKey = @CourseGroupKey  
                        AND cgs.MemberKey      = cgcm.CourseGroupKey ) m1
                   WHERE ucs.UserKey           = @UserKey
                     AND m1.CourseKey          = ucs.CourseKey  
                     AND m1.CourseKey          = c.CourseKey
                  ORDER BY UPPER(c.Title), c.CourseKey ;  
        END
       

END

GO
Avatar of gdbjohnson
gdbjohnson

Well this is a complicated query, but I think I can see the root of your problem.

You want to reduce your result sets as much as possible within the body of the query and the DISTINCT part of your query is where most of your time is being eaten, I'd wager.

Since you want to get a DISTINCT set of names, and you are simply unioning this set, you can actually use the names like Keys!  This is a very useful fact.

Since the most important thing is speed, the easiest way to get what you want here is to use a temporary table.

Take your first query subsection:


-----------
SELECT ugum2.MemberKey UserKey , ugcs.CourseKey CourseKey
                    FROM UserGroupSet ugs1,
                         UserGroupCourseSet ugcs,
                         UserGroupUserMember ugum1,
                         UserGroupUserMember ugum2
                   WHERE ugs1.UserGroupKey  = @UserGroupKey
                     AND ugs1.MemberKey     = ugum1.UserGroupKey
                     AND ugum1.MemberKey    = ugum2.MemberKey
                     AND ugum2.UserGroupKey = ugcs.UserGroupKey
----------

and dump it to a temp table, call it #TEMP.  Then take the next subsection:

---------
SELECT ugum.MemberKey UserKey , uc.MemberKey CourseKey
                   FROM UserGroupSet ugs1,
                        UserGroupUserMember ugum,
                        UserCourse uc
                  WHERE ugs1.UserGroupKey = @UserGroupKey
                    AND ugs1.MemberKey    = ugum.UserGroupKey
                    AND uc.UserKey        = ugum.MemberKey
--------

but change it to the following:
-----
INSERT INTO #TEMP
SELECT ugum.MemberKey UserKey , uc.MemberKey CourseKey
FROM UserGroupSet ugs1
inner join
UserGroupUserMember ugum
on ugs1.MemberKey    = ugum.UserGroupKey
inner join
UserCourse uc
on uc.UserKey        = ugum.MemberKey
left join #TEMP B
on ugum.UserKey = B.UserKey and uc.MemberKey = B.UserKey
WHERE ugs1.UserGroupKey = @UserGroupKey
and B.UserKey is null
-------

Basically, you will be breaking up your single query into several queries, each one adding to this #TEMP table.  You will notice that the above query using the left join eliminates the need for the DISTINCT keyword, which is what slows your query down.

Also, I've used the inner join syntax above...  I'm not sure if that syntax results in faster queries, or if the SQL Parser treats them equivalently, but I usually write my queries using this syntax with excellent results.  It's also generally easier to read.

I hope this helps you out.  Basically, try to think on how to remove the DISTINCT from your query.  It's always slow.  Esp if your result set is large.



Observations:
1)The query
SELECT ugum2.MemberKey UserKey , ugcs.CourseKey CourseKey
                    FROM UserGroupSet ugs1,
                         UserGroupCourseSet ugcs,
                         UserGroupUserMember ugum1,
                         UserGroupUserMember ugum2
                   WHERE ugs1.UserGroupKey  = @UserGroupKey
                     AND ugs1.MemberKey     = ugum1.UserGroupKey
                     AND ugum1.MemberKey    = ugum2.MemberKey
                     AND ugum2.UserGroupKey = ugcs.UserGroupKey
is in fact equivalent with
SELECT ugum.MemberKey UserKey , ugcs.CourseKey CourseKey
                    FROM UserGroupSet ugs1,
                         UserGroupCourseSet ugcs,
                         UserGroupUserMember ugum
                   WHERE ugs1.UserGroupKey  = @UserGroupKey
                     AND ugs1.MemberKey     = ugum.UserGroupKey
                     AND ugum.UserGroupKey = ugcs.UserGroupKey
(Why appears the UserGroupUserMember  twice, and achieving nothing by it???)
*****************

2) In general you use selects in the from clause.
These are in fact tables with no indexes on them (these is how they are treated).
For m0 it should be a better approach to create a temporary table, with an index on it on the
CourseKey. This would speed up a lot.

********************
3) The same goes for the m1 (here the gain is less)
********

4) If the sp (stored procedure) is called many times, then it would be better to create
2 stored procedure, one treating the parameter ='U' and one for 'c'
something like this:
create my proc(..)
AS
IF ( @Parameter = 'U')
       BEGIN
         exec sp1
       END
       ELSE IF ( @Parameter = 'C')
       BEGIN
        exec sp2
       end;
This way the query optimezer will be used as its best
**************************

5)The distinct is killing you everywhere. Thing very well where you need it.
For example:
in the 'C' part, the m1 contains many duplicates? If not so many than you can omit the distinct
because it appears upwards in the same sql. So something like this:
.....
       ELSE IF ( @Parameter = 'C')
       BEGIN
                 SELECT DISTINCT c.Title "Resource",
                                 c.CourseKey "CourseKey",
                                 UPPER(c.Title) "Title"
                 FROM
                   UserCourseSet ucs,
                   Course c,
                  ( SELECT cgcm.MemberKey CourseKey
                      FROM CourseGroupSet cgs,
                           CourseGroupCourseMember cgcm
                     WHERE cgs.CourseGroupKey = @CourseGroupKey  
                       AND cgs.MemberKey      = cgcm.CourseGroupKey ) m1
                  WHERE ucs.UserKey           = @UserKey
                    AND m1.CourseKey          = ucs.CourseKey  
                    AND m1.CourseKey          = c.CourseKey
                 ORDER BY UPPER(c.Title), c.CourseKey ;  
       END
Plus here comes the same optimization as at 2). Create a temporary table, with an index on it
on the CourseKey field (better a clustered one).
The gain of using a temporary table + index is better if the table has many records.

GL
        if you needan complete example tell me
xenon
ASKER CERTIFIED SOLUTION
Avatar of xenon_je
xenon_je

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of Lowfatspread
I presume you recompiled the procedure when you made the index changes...

if your keeping the procedure this complex you may benefit from using the with recompile option on the
procedure definition so that the optimum access path is
chosen each time dependant on the available stats  

I agree that probably using temporary tables is the way
to go with it however ... the simpler you sql is the better the optimiser can work...

hth  
good luck
Avatar of RickDeet

ASKER

First of all, thanks for all the quick comments. I really appreciate it! I couldn't wait to get to work and try the suggestions out.

Gdbjohnson et al, the temporary table suggestion is golden. I haven't worked with them before, but I'm learning trying to implement the suggestions here. I'm not there yet, but working on it.

Xenon, the duplication you mentioned is necessary for this reason: when a user signs up for course (a course contains multiple classes), there is a CourseGroupKey associates with that course. When they sign up for a class inside that course, there is a CourseKey associated with that class. Here's where the duplication comes in... I want to report on not only which Courses the user is signed up for, but also which classes the user is signed up for.

If I remove the duplication, I only get that they are signed up for at least one class in a course, not all the classes. I wouldn't be able to report on their progress for each class without the duplication.

Your suggestion for the 2 sps is interesting. I will see if I can implement that into the page...

By the way, I didn't write this behemoth, I'm just charged with optimizing it!

Rick
LowFat...

I'm hoping to get the thing working with temp tables. If I end up having to keep it closer to the way it is, I'll definitely add the RECOMPILE statement.

Thanks!

Rick
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
So here it goes:
**********
create table UserCourseSet ( userKey int, CourseKey int)
create table Course (CourseKey int, Title varchar(20))
create table CourseGroupCourseMember (CourseGroupKey int, MemberKey int)
create table CourseGroupSet (CourseGroupKey int, MemberKey int)

insert into CourseGroupCourseMember Values(11, 100)
insert into CourseGroupCourseMember Values(12,101)
insert into CourseGroupCourseMember Values(11, 101)
insert into CourseGroupSet Values(33,11)
insert into CourseGroupSet Values(33,12)
SELECT cgcm.MemberKey CourseKey
                      FROM CourseGroupSet cgs,
                           CourseGroupCourseMember cgcm
                     WHERE
                       cgs.MemberKey      = cgcm.CourseGroupKey

insert into Course  Values(100, 'Title1')
insert into Course  Values(101, 'Title2')
insert into Course  Values(101, 'Title3')

insert into UserCourseSet Values(7, 100)
insert into UserCourseSet Values(8, 101)

SELECT DISTINCT c.Title "Resource",
                                 c.CourseKey "CourseKey",
                                 UPPER(c.Title) "Title"
                 FROM
                   UserCourseSet ucs,
                   Course c,
                  ( SELECT cgcm.MemberKey CourseKey
                      FROM CourseGroupSet cgs,
                           CourseGroupCourseMember cgcm
                     WHERE --cgs.CourseGroupKey = @CourseGroupKey   AND
                       cgs.MemberKey      = cgcm.CourseGroupKey ) m1
                  WHERE --ucs.UserKey           = @UserKey AND
                    m1.CourseKey          = ucs.CourseKey  
                    AND m1.CourseKey          = c.CourseKey
                 ORDER BY UPPER(c.Title), c.CourseKey

SELECT DISTINCT c.Title "Resource",
                                 c.CourseKey "CourseKey",
                                 UPPER(c.Title) "Title"
                 FROM
                   UserCourseSet ucs,
                   Course c,
                  ( SELECT DISTINCT cgcm.MemberKey CourseKey
                      FROM CourseGroupSet cgs,
                           CourseGroupCourseMember cgcm
                     WHERE --cgs.CourseGroupKey = @CourseGroupKey   AND
                       cgs.MemberKey      = cgcm.CourseGroupKey ) m1
                  WHERE --ucs.UserKey           = @UserKey AND
                    m1.CourseKey          = ucs.CourseKey  
                    AND m1.CourseKey          = c.CourseKey
                 ORDER BY UPPER(c.Title), c.CourseKey


drop table UserCourseSet
drop table Course
drop table CourseGroupCourseMember
drop table CourseGroupSet
*********************

As you can see above even if I use or not the distinct, the final result is the same. If you give me a sample code in which it will return different results than I'll eat my socks :)))))))))

I'll wait you example....
     xenon
whoa Xenon...
do you get paid for this?
gdbjohnson, I don't get paid for this.. but it's nice to try to understand what others want.... :)))
haha...

No, I know what you mean.
I just joined here - it's a good forum for helping others and improving your own skills in the process - it's just that I noticed your name in several places which must be quite an investment of time on your part.

...Which is great for those who need help.  :)
That was not DDL.  All of the tables had NULL-able columns, so it was IMPOSSIBLE for them to have keys.  If you don't have keys, they are not tables; they are imitations of a file that you might use to scrub data, but they are not tables.  

You have improper postfix "-key"  which I assume are trying to tell the reader that those columns are keys **somewhere** in the schema.  Name a data element for what it is, not for how you use it.  So, I have to make some guesses.  

1) Can I assume things like UserCourseSets are the *actual way * that people refer to this relationship?  Nobody ever says something like "Enrollment"?  Hard to believe, but possible.  Here is a guess at some of the DDL that would be rfequired to make this a usable table:

CREATE TABLE UserCourseSets
(user_id INTEGER NOT NULL  -- where are the users?
      REFERENCES Users(user_id)
       ON UPDATE CASCADE
       ON DELETE CASCADE,
 course_nbr INTEGER NOT NULL
       REFERENCES Courses(course_nbr)
       ON UPDATE CASCADE
       ON DELETE CASCADE,
 PRIMARY KEY(user_id, course_nbr));  

I made the assumption about the key;  I assumeed without specs that this is a relationship between users and the courses they take.  A course can have many users and a user can take many courses.  

2) I had to guess at the key on this one; is title also unique?

CREATE TABLE Courses
(course_nbr INTEGER NOT NULL PRIMARY KEY,
 title VARCHAR(20) NOT NULL);

3) The following two tables are IDENTICAL in structure!!  That means they are sets of the same kind of things; that means one is redundant and your schema is not right.  Look up Chris Date and orthogonal design.  

CREATE TABLE CourseGroupCourseMembers
(course_group_id INTEGER NOT NULL,
 member_nbr INTEGER NOT NULL);

CREATE TABLE CourseGroupSets
(course_group_id INTEGER NOT NULL,
 member_nbr INTEGER NOT NULL);

If they are different in some logical way, then you need to add a column for the attribute that makes them different.  

Doesn't the redundancy in a name like "-GroupSets" bother you when you read it?   Why not just a plain old CourseGroup?  Things should probably look more like this:

CREATE TABLE CourseGroups
(course_group_id INTEGER NOT NULL,
 member_nbr INTEGER NOT NULL UNIQUE
       REFERENCES Members (member_id)
       ON UPDATE CASCADE
       ON DELETE CASCADE,
 member_type CHAR(1) DEFAULT 'x' NOT NULL
       CHECK(member_type IN ('x', 'y', 'z',..),
 PRIMARY KEY (course_group_id, member_nbr));


I am guessing that a member belongs to one and only one course group.  Also, where is the Members table?  If it is a foreign key, then we need a REFERENCES() to its base table.

It is weird to things called both users and members in things that deal with courses.  Why are they different?  

Want to try again?
RickDeet:
This old question needs to be finalized -- accept an answer, split points, or get a refund.  For information on your options, please click here-> http:/help/closing.jsp#1 
EXPERTS:
Post your closing recommendations!  No comment means you don't care.
No comment has been added to this question in more than 268 days, so it is now classified as abandoned.

I will leave the following recommendation for this question in the Cleanup topic area:
   Accept: xenon_je http:#8165534, jcelko http:#8170283

Any objections should be posted here in the next 4 days. After that time, the question will be closed.

monosodiumg
EE Cleanup Volunteer