Link to home
Start Free TrialLog in
Avatar of lanterv
lanterv

asked on

SQL query speed improvement

I have the following query ( a view named QBounceUsers);

SELECT     b.id, b.sPopAccount, b.sEmailAddress, b.sBounceType, b.sDate, b.sProcessed, b.todelete, b.date_added, u.ID AS userid, M.Description, M.Email AS fromemail
FROM         dbo.Admin_Mass_Email_Addresses AS M RIGHT OUTER JOIN
                      dbo.bouncelog AS b ON M.Description = b.sPopAccount AND b.sEmailAddress > '' INNER JOIN
                      dbo.Admin_Users AS A ON M.adminID = A.adminID LEFT OUTER JOIN
                      dbo.Users AS u ON u.EMAIL_HOME = b.sEmailAddress OR u.EMAIL_WORK = b.sEmailAddress OR u.Email_Extra_1 = b.sEmailAddress OR 
                      u.Email_Extra_2 = b.sEmailAddress LEFT OUTER JOIN
                      dbo.Users_Info AS ui ON u.ID = ui.ID AND ui.Status_Flag = 'A'
WHERE     (u.EMAIL_HOME > '') AND (b.sBounceType = 'HB' OR
                      b.sBounceType = 'SB' OR
                      b.sBounceType = 'GB') OR
                      (u.EMAIL_WORK > '') AND (b.sBounceType = 'HB' OR
                      b.sBounceType = 'SB' OR
                      b.sBounceType = 'GB') OR
                      (u.Email_Extra_1 > '') AND (b.sBounceType = 'HB' OR
                      b.sBounceType = 'SB' OR
                      b.sBounceType = 'GB') OR
                      (u.Email_Extra_2 > '') AND (b.sBounceType = 'HB' OR
                      b.sBounceType = 'SB' OR
                      b.sBounceType = 'GB')

Open in new window


I was asked to improve the performance.  Sometimes it will timeout.

There 34 rows in Admin_Mass_Email_Addresses
There are 33,400 rows in bouncelog
There are 133,000 rows in Users and Users_Info
There are 60 rows in admin_users

That's not many rows at all.  What can I do to speed up the process that takes over 10 minutes to complete?

I'm not sure why the joins are setup the way they are, but the result I want is to query that query and get a count grouped by ;

SELECT     COUNT(id) AS emailcount, Description, sPopAccount
FROM         dbo.QBounceUsers AS b
WHERE     (todelete IS NULL) AND (sBounceType = 'HB' OR
                      sBounceType = 'SB' OR
                      sBounceType = 'GB') AND (date_added > '02/17/2014 12:00:00 AM') AND (sEmailAddress <> '') AND (Status_Flag = 'A')
GROUP BY Description, sPopAccount

Open in new window

SOLUTION
Avatar of John_Vidmar
John_Vidmar
Flag of Canada image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
This may not be much faster but should prove a tad more legible:
SELECT  b.id,
        b.sPopAccount,
        b.sEmailAddress,
        b.sBounceType,
        b.sDate,
        b.sProcessed,
        b.todelete,
        b.date_added,
        u.ID AS userid,
        M.Description,
        M.Email AS fromemail
FROM    dbo.bouncelog b
        LEFT JOIN dbo.Admin_Mass_Email_Addresses M ON M.Description = b.sPopAccount
        INNER JOIN dbo.Admin_Users AS A ON M.adminID = A.adminID
        LEFT OUTER JOIN dbo.Users AS u ON u.EMAIL_HOME = b.sEmailAddress
                                          OR u.EMAIL_WORK = b.sEmailAddress
                                          OR u.Email_Extra_1 = b.sEmailAddress
                                          OR u.Email_Extra_2 = b.sEmailAddress
        LEFT OUTER JOIN dbo.Users_Info AS ui ON u.ID = ui.ID
                                                AND ui.Status_Flag = 'A'
WHERE   b.sEmailAddress > ''
        AND b.sBounceType IN ('HB', 'SB', 'GB')
        AND (u.EMAIL_HOME > ''
             OR u.EMAIL_WORK > ''
             OR u.Email_Extra_1 > ''
             OR u.Email_Extra_2 > ''
            )

Open in new window

Avatar of lanterv
lanterv

ASKER

Well, I've tested with all those combinations but still it will timeout.

This is the code I ended up with;
SELECT     b.id
                 , b.sPopAccount
                 , b.sEmailAddress
                 , b.sBounceType
                 , b.sDate
                 , b.sProcessed
                 , b.todelete
                 , b.date_added
                 , u.ID AS userid
                 , M.Description
                 , M.Email AS fromemail
FROM         dbo.bouncelog AS b INNER JOIN
                      dbo.Admin_Mass_Email_Addresses AS M ON M.Description = b.sPopAccount INNER JOIN
                      dbo.Admin_Users AS A ON M.adminID = A.adminID INNER JOIN
                      dbo.Users AS u ON b.sEmailAddress IN (u.EMAIL_HOME
                                                                                        , u.EMAIL_WORK
                                                                                        , u.Email_Extra_1
                                                                                        , u.Email_Extra_2)
 INNER JOIN  dbo.Users_Info AS ui ON u.ID = ui.ID AND ui.Status_Flag = 'A'
WHERE     (b.sBounceType IN ('HB', 'SB', 'GB')) AND (b.sEmailAddress > '')

Open in new window


What schema questions can I answer?
Avatar of lanterv

ASKER

The users table has one primary key, and integer auto-increment id column.  And there is one clustered index on that key.  I was considering indexing on first and last name and possibly email addresses. Any thoughts?
Why not provide us with an execution plan? (.sqlplan file)

http://msdn.microsoft.com/en-us/library/ms190646.aspx

An execution plan it the first thing to look at for potential optimizations.
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of lanterv

ASKER

Here you go.
Avatar of lanterv

ASKER

ScottPletcher:  I agree, but I'm unable to do that just now.  You should see the rest of the DB.
Avatar of lanterv

ASKER

SELECT     b.id, b.sPopAccount, b.sEmailAddress, b.sBounceType, b.sDate, b.sProcessed, b.todelete, b.date_added, u.ID AS userid, M.Description, M.Email AS fromemail
FROM         dbo.bouncelog AS b INNER JOIN
                      dbo.Admin_Mass_Email_Addresses AS M ON M.Description = b.sPopAccount INNER JOIN
                      dbo.Admin_Users AS A ON M.adminID = A.adminID INNER JOIN
                          (SELECT     Users.ID, Users_Email.EmailAddress
                            FROM          dbo.Users CROSS APPLY(VALUES ((EMAIL_HOME), 
                                                                                                       (EMAIL_WORK), 
                                                                                                       (Email_Extra_1), 
                                                                                                       (Email_Extra_2))
                                                                                              AS Users_Email(EmailAddress)
                                                                                         WHERE     Users_Email.EmailAddress > '') AS u ON u.EmailAddress = b.sEmailAddress LEFT OUTER JOIN
dbo.Users_Info AS ui ON u.ID = ui.ID AND ui.Status_Flag = 'A'
WHERE     (b.sBounceType IN ('HB', 'SB', 'GB')) AND (b.sEmailAddress > '')

Open in new window


Error near VALUES

This is my first attempt at using CROSS APPLY
ASKER CERTIFIED SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of lanterv

ASKER

It's SQL Server 2012
Avatar of lanterv

ASKER

Compatibility level is 2005 (90)
Oh wait, you're missing a closing paren as well, but that should give you a different error than one near "VALUES":

   (Email_Extra_2)))
Avatar of lanterv

ASKER

Incorrect syntax near 'values'
>> Compatibility level is 2005 (90) <<
As I stated earlier:
"If your db is set to use SQL 2005 level, the VALUES clause won't work directly."

Try running it from a different db and qualifying the db name on the tables names in the FROM clause.

Why on earth are you still in 2005 mode on a 2012 instance??
Avatar of lanterv

ASKER

Good question.  And I will get an answer.
Avatar of lanterv

ASKER

I can't find a reason but suspect it was a matter of migration over the years.  I can't find a good reason to not change it to level 110.
Very likely that's all it is.  The db gets restored and no one even thinks about having to reset its compatibility level.
Avatar of lanterv

ASKER

Well, I feel like an idiot.  I'm using a local copy of SSMS that's 2012 but the remote SQL server is actually 2005.  When will I ever learn?

I added an index of the email columns and the query actually runs now (instead of timing out).
Scott has proposed the use of CROSS APPLY with VALUES; this is an efficient method to "unpivot" data, but is version dependent. In earlier versions you can produce the same effect by using UNION ALL (it's not as efficient though)

select
      id as user_id
    , ca1.EmailAddress
from users
cross apply (
  
              select EMAIL_HOME
              union all
              select EMAIL_WORK
              union all
              select Email_Extra_1
              union all
              select Email_Extra_2
  
              ) AS ca1 (EmailAddress)
;

| USER_ID | EMAILADDRESS |
|---------|--------------|
|       1 |         x1@2 |
|       1 |         x1@3 |
|       1 |         x1@4 |
|       1 |         x1@5 |
|       2 |         x2@1 |
|       2 |         x2@2 |
|       2 |         x2@3 |
|       2 |         x2@4 |

CREATE TABLE Users
	([ID] int, [EMAIL_HOME] varchar(4), [EMAIL_WORK] varchar(4), [Email_Extra_1] varchar(4), [Email_Extra_2] varchar(4))
;
	
INSERT INTO Users
	([ID], [EMAIL_HOME], [EMAIL_WORK], [Email_Extra_1], [Email_Extra_2])
VALUES
	(1, 'x1@2', 'x1@3', 'x1@4', 'x1@5'),
	(2, 'x2@1', 'x2@2', 'x2@3', 'x2@4')
;

http://sqlfiddle.com/#!3/4bc4e/1

Open in new window

{+ edit} It may be worth trying this in lieu of the cross apply/value in Scott's query.

If not useful for that query, you may find this useful when normalizing that data in the future.
Avatar of lanterv

ASKER

Indexing the users table on email addresses seemed to help performance the most.