SQL query speed improvement

I have the following query ( a view named QBounceUsers);

SELECT     b.id, b.sPopAccount, b.sEmailAddress, b.sBounceType, b.sDate, b.sProcessed, b.todelete, b.date_added, u.ID AS userid, M.Description, M.Email AS fromemail
FROM         dbo.Admin_Mass_Email_Addresses AS M RIGHT OUTER JOIN
                      dbo.bouncelog AS b ON M.Description = b.sPopAccount AND b.sEmailAddress > '' INNER JOIN
                      dbo.Admin_Users AS A ON M.adminID = A.adminID LEFT OUTER JOIN
                      dbo.Users AS u ON u.EMAIL_HOME = b.sEmailAddress OR u.EMAIL_WORK = b.sEmailAddress OR u.Email_Extra_1 = b.sEmailAddress OR 
                      u.Email_Extra_2 = b.sEmailAddress LEFT OUTER JOIN
                      dbo.Users_Info AS ui ON u.ID = ui.ID AND ui.Status_Flag = 'A'
WHERE     (u.EMAIL_HOME > '') AND (b.sBounceType = 'HB' OR
                      b.sBounceType = 'SB' OR
                      b.sBounceType = 'GB') OR
                      (u.EMAIL_WORK > '') AND (b.sBounceType = 'HB' OR
                      b.sBounceType = 'SB' OR
                      b.sBounceType = 'GB') OR
                      (u.Email_Extra_1 > '') AND (b.sBounceType = 'HB' OR
                      b.sBounceType = 'SB' OR
                      b.sBounceType = 'GB') OR
                      (u.Email_Extra_2 > '') AND (b.sBounceType = 'HB' OR
                      b.sBounceType = 'SB' OR
                      b.sBounceType = 'GB')

Open in new window


I was asked to improve the performance.  Sometimes it will timeout.

There 34 rows in Admin_Mass_Email_Addresses
There are 33,400 rows in bouncelog
There are 133,000 rows in Users and Users_Info
There are 60 rows in admin_users

That's not many rows at all.  What can I do to speed up the process that takes over 10 minutes to complete?

I'm not sure why the joins are setup the way they are, but the result I want is to query that query and get a count grouped by ;

SELECT     COUNT(id) AS emailcount, Description, sPopAccount
FROM         dbo.QBounceUsers AS b
WHERE     (todelete IS NULL) AND (sBounceType = 'HB' OR
                      sBounceType = 'SB' OR
                      sBounceType = 'GB') AND (date_added > '02/17/2014 12:00:00 AM') AND (sEmailAddress <> '') AND (Status_Flag = 'A')
GROUP BY Description, sPopAccount

Open in new window

lantervAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

John_VidmarCommented:
The left-join to Users is probably the slow-poke:
SELECT	b.id
,	b.sPopAccount
,	b.sEmailAddress
,	b.sBounceType
,	b.sDate
,	b.sProcessed
,	b.todelete
,	b.date_added
,	u.ID AS userid
,	M.Description
,	M.Email AS fromemail
FROM	bouncelog			b
LEFT
JOIN	Admin_Mass_Email_Addresses	M	ON	M.Description = b.sPopAccount
						AND	b.sEmailAddress > ''
JOIN	Admin_Users			A	ON	M.adminID = A.adminID
LEFT
JOIN	Users				u	ON	b.sEmailAddress IN	(	u.EMAIL_HOME
										,	u.EMAIL_WORK
										,	u.Email_Extra_1
										,	u.Email_Extra_2
										)
LEFT
JOIN	Users_Info			ui	ON	u.ID = ui.ID
						AND	ui.Status_Flag = 'A'
WHERE	b.sBounceType IN ('GB','HB','SB')
AND	(	u.EMAIL_HOME > ''
	OR	u.EMAIL_WORK > ''
	OR	u.Email_Extra_1 > ''
	OR	u.Email_Extra_2 > ''
	)

Open in new window

0
Anthony PerkinsCommented:
The left-join to Users is probably the slow-poke:
It is not really a LEFT JOIN despite the best intentions.  There is a condition in the WHERE clause u.EMAIL_HOME > '' that makes it an implied INNER JOIN.   It is not to say that it may not be the root cause, it is just not because it is an OUTER JOIN.

There is a LEFT JOIN ON Admin_Mass _Email_Addresses , but I suspect it is the LEFT JOIN dbo.Users_Info that is the problem (it is 133,000 after all).

But without seeing any schema on the tables involved all we can do is offer wild conjectures.
0
Anthony PerkinsCommented:
This may not be much faster but should prove a tad more legible:
SELECT  b.id,
        b.sPopAccount,
        b.sEmailAddress,
        b.sBounceType,
        b.sDate,
        b.sProcessed,
        b.todelete,
        b.date_added,
        u.ID AS userid,
        M.Description,
        M.Email AS fromemail
FROM    dbo.bouncelog b
        LEFT JOIN dbo.Admin_Mass_Email_Addresses M ON M.Description = b.sPopAccount
        INNER JOIN dbo.Admin_Users AS A ON M.adminID = A.adminID
        LEFT OUTER JOIN dbo.Users AS u ON u.EMAIL_HOME = b.sEmailAddress
                                          OR u.EMAIL_WORK = b.sEmailAddress
                                          OR u.Email_Extra_1 = b.sEmailAddress
                                          OR u.Email_Extra_2 = b.sEmailAddress
        LEFT OUTER JOIN dbo.Users_Info AS ui ON u.ID = ui.ID
                                                AND ui.Status_Flag = 'A'
WHERE   b.sEmailAddress > ''
        AND b.sBounceType IN ('HB', 'SB', 'GB')
        AND (u.EMAIL_HOME > ''
             OR u.EMAIL_WORK > ''
             OR u.Email_Extra_1 > ''
             OR u.Email_Extra_2 > ''
            )

Open in new window

0
Determine the Perfect Price for Your IT Services

Do you wonder if your IT business is truly profitable or if you should raise your prices? Learn how to calculate your overhead burden with our free interactive tool and use it to determine the right price for your IT services. Download your free eBook now!

lantervAuthor Commented:
Well, I've tested with all those combinations but still it will timeout.

This is the code I ended up with;
SELECT     b.id
                 , b.sPopAccount
                 , b.sEmailAddress
                 , b.sBounceType
                 , b.sDate
                 , b.sProcessed
                 , b.todelete
                 , b.date_added
                 , u.ID AS userid
                 , M.Description
                 , M.Email AS fromemail
FROM         dbo.bouncelog AS b INNER JOIN
                      dbo.Admin_Mass_Email_Addresses AS M ON M.Description = b.sPopAccount INNER JOIN
                      dbo.Admin_Users AS A ON M.adminID = A.adminID INNER JOIN
                      dbo.Users AS u ON b.sEmailAddress IN (u.EMAIL_HOME
                                                                                        , u.EMAIL_WORK
                                                                                        , u.Email_Extra_1
                                                                                        , u.Email_Extra_2)
 INNER JOIN  dbo.Users_Info AS ui ON u.ID = ui.ID AND ui.Status_Flag = 'A'
WHERE     (b.sBounceType IN ('HB', 'SB', 'GB')) AND (b.sEmailAddress > '')

Open in new window


What schema questions can I answer?
0
lantervAuthor Commented:
The users table has one primary key, and integer auto-increment id column.  And there is one clustered index on that key.  I was considering indexing on first and last name and possibly email addresses. Any thoughts?
0
PortletPaulfreelancerCommented:
Why not provide us with an execution plan? (.sqlplan file)

http://msdn.microsoft.com/en-us/library/ms190646.aspx

An execution plan it the first thing to look at for potential optimizations.
0
Scott PletcherSenior DBACommented:
I think the ORs in the Users join could be causing multiple scans of the table.

I suggest trying CROSS APPLY on the table before the join, to eliminate the OR.
...
                      dbo.Admin_Users AS A ON M.adminID = A.adminID INNER JOIN
                      (SELECT Users.ID, Users_Email.EmailAddress
                       FROM dbo.Users
                       CROSS APPLY (VALUES( (EMAIL_HOME), (EMAIL_WORK), (Email_Extra_1), (Email_Extra_2) ) ) AS Users_Email(EmailAddress)
                       WHERE
                           Users_Email.EmailAddress > '') AS u ON u.EmailAddress = b.sEmailAddress LEFT OUTER JOIN
                      dbo.Users_Info AS ui ON u.ID = ui.ID AND ui.Status_Flag = 'A'
WHERE     ...
...

[Btw, I think we all realize that the fundamental problem here is the un-normalized table row design that has effectively email1, email2, email3 and email4 in the same row, right? :).  If/when you get the chance, you should instead create a Users_Email table.]
0
lantervAuthor Commented:
Here you go.
0
lantervAuthor Commented:
ScottPletcher:  I agree, but I'm unable to do that just now.  You should see the rest of the DB.
0
lantervAuthor Commented:
SELECT     b.id, b.sPopAccount, b.sEmailAddress, b.sBounceType, b.sDate, b.sProcessed, b.todelete, b.date_added, u.ID AS userid, M.Description, M.Email AS fromemail
FROM         dbo.bouncelog AS b INNER JOIN
                      dbo.Admin_Mass_Email_Addresses AS M ON M.Description = b.sPopAccount INNER JOIN
                      dbo.Admin_Users AS A ON M.adminID = A.adminID INNER JOIN
                          (SELECT     Users.ID, Users_Email.EmailAddress
                            FROM          dbo.Users CROSS APPLY(VALUES ((EMAIL_HOME), 
                                                                                                       (EMAIL_WORK), 
                                                                                                       (Email_Extra_1), 
                                                                                                       (Email_Extra_2))
                                                                                              AS Users_Email(EmailAddress)
                                                                                         WHERE     Users_Email.EmailAddress > '') AS u ON u.EmailAddress = b.sEmailAddress LEFT OUTER JOIN
dbo.Users_Info AS ui ON u.ID = ui.ID AND ui.Status_Flag = 'A'
WHERE     (b.sBounceType IN ('HB', 'SB', 'GB')) AND (b.sEmailAddress > '')

Open in new window


Error near VALUES

This is my first attempt at using CROSS APPLY
0
Scott PletcherSenior DBACommented:
The database is at SQL 2008 level right?  If your db is set to use SQL 2005 level, the VALUES clause won't work directly.

It might work if you make sure the db from which the statement is running is at 2008 level, even if your user db itself is not, although then of course you must change your FROM clauses to include the use db name:

USE master

SELECT     b.id, b.sPopAccount, b.sEmailAddress, b.sBounceType, b.sDate, b.sProcessed, b.todelete, b.date_added, u.ID AS userid, M.Description, M.Email AS fromemail
FROM         your_db_name.dbo.bouncelog AS b INNER JOIN
                      your_db_name.dbo.Admin_Mass_Email_Addresses AS M ON M.Description = b.sPopAccount INNER JOIN
                      your_db_name.dbo.Admin_Users AS A ON M.adminID = A.adminID INNER JOIN
                          (SELECT     Users.ID, Users_Email.EmailAddress
                            FROM          your_db_name.dbo.Users CROSS APPLY(VALUES ((EMAIL_HOME),
                                                                                                       (EMAIL_WORK),
                                                                                                       (Email_Extra_1),
                                                                                                       (Email_Extra_2))
                                                                                              AS Users_Email(EmailAddress)
                                                                                         WHERE     Users_Email.EmailAddress > '') AS u ON u.EmailAddress = b.sEmailAddress LEFT OUTER JOIN
dbo.Users_Info AS ui ON u.ID = ui.ID AND ui.Status_Flag = 'A'
WHERE     (b.sBounceType IN ('HB', 'SB', 'GB')) AND (b.sEmailAddress > '')
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
lantervAuthor Commented:
It's SQL Server 2012
0
lantervAuthor Commented:
Compatibility level is 2005 (90)
0
Scott PletcherSenior DBACommented:
Oh wait, you're missing a closing paren as well, but that should give you a different error than one near "VALUES":

   (Email_Extra_2)))
0
lantervAuthor Commented:
Incorrect syntax near 'values'
0
Scott PletcherSenior DBACommented:
>> Compatibility level is 2005 (90) <<
As I stated earlier:
"If your db is set to use SQL 2005 level, the VALUES clause won't work directly."

Try running it from a different db and qualifying the db name on the tables names in the FROM clause.

Why on earth are you still in 2005 mode on a 2012 instance??
0
lantervAuthor Commented:
Good question.  And I will get an answer.
0
lantervAuthor Commented:
I can't find a reason but suspect it was a matter of migration over the years.  I can't find a good reason to not change it to level 110.
0
Scott PletcherSenior DBACommented:
Very likely that's all it is.  The db gets restored and no one even thinks about having to reset its compatibility level.
0
lantervAuthor Commented:
Well, I feel like an idiot.  I'm using a local copy of SSMS that's 2012 but the remote SQL server is actually 2005.  When will I ever learn?

I added an index of the email columns and the query actually runs now (instead of timing out).
0
PortletPaulfreelancerCommented:
Scott has proposed the use of CROSS APPLY with VALUES; this is an efficient method to "unpivot" data, but is version dependent. In earlier versions you can produce the same effect by using UNION ALL (it's not as efficient though)

select
      id as user_id
    , ca1.EmailAddress
from users
cross apply (
  
              select EMAIL_HOME
              union all
              select EMAIL_WORK
              union all
              select Email_Extra_1
              union all
              select Email_Extra_2
  
              ) AS ca1 (EmailAddress)
;

| USER_ID | EMAILADDRESS |
|---------|--------------|
|       1 |         x1@2 |
|       1 |         x1@3 |
|       1 |         x1@4 |
|       1 |         x1@5 |
|       2 |         x2@1 |
|       2 |         x2@2 |
|       2 |         x2@3 |
|       2 |         x2@4 |

CREATE TABLE Users
	([ID] int, [EMAIL_HOME] varchar(4), [EMAIL_WORK] varchar(4), [Email_Extra_1] varchar(4), [Email_Extra_2] varchar(4))
;
	
INSERT INTO Users
	([ID], [EMAIL_HOME], [EMAIL_WORK], [Email_Extra_1], [Email_Extra_2])
VALUES
	(1, 'x1@2', 'x1@3', 'x1@4', 'x1@5'),
	(2, 'x2@1', 'x2@2', 'x2@3', 'x2@4')
;

http://sqlfiddle.com/#!3/4bc4e/1

Open in new window

{+ edit} It may be worth trying this in lieu of the cross apply/value in Scott's query.

If not useful for that query, you may find this useful when normalizing that data in the future.
0
lantervAuthor Commented:
Indexing the users table on email addresses seemed to help performance the most.
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft SQL Server 2008

From novice to tech pro — start learning today.