lanterv
asked on
SQL query speed improvement
I have the following query ( a view named QBounceUsers);
I was asked to improve the performance. Sometimes it will timeout.
There 34 rows in Admin_Mass_Email_Addresses
There are 33,400 rows in bouncelog
There are 133,000 rows in Users and Users_Info
There are 60 rows in admin_users
That's not many rows at all. What can I do to speed up the process that takes over 10 minutes to complete?
I'm not sure why the joins are setup the way they are, but the result I want is to query that query and get a count grouped by ;
SELECT b.id, b.sPopAccount, b.sEmailAddress, b.sBounceType, b.sDate, b.sProcessed, b.todelete, b.date_added, u.ID AS userid, M.Description, M.Email AS fromemail
FROM dbo.Admin_Mass_Email_Addresses AS M RIGHT OUTER JOIN
dbo.bouncelog AS b ON M.Description = b.sPopAccount AND b.sEmailAddress > '' INNER JOIN
dbo.Admin_Users AS A ON M.adminID = A.adminID LEFT OUTER JOIN
dbo.Users AS u ON u.EMAIL_HOME = b.sEmailAddress OR u.EMAIL_WORK = b.sEmailAddress OR u.Email_Extra_1 = b.sEmailAddress OR
u.Email_Extra_2 = b.sEmailAddress LEFT OUTER JOIN
dbo.Users_Info AS ui ON u.ID = ui.ID AND ui.Status_Flag = 'A'
WHERE (u.EMAIL_HOME > '') AND (b.sBounceType = 'HB' OR
b.sBounceType = 'SB' OR
b.sBounceType = 'GB') OR
(u.EMAIL_WORK > '') AND (b.sBounceType = 'HB' OR
b.sBounceType = 'SB' OR
b.sBounceType = 'GB') OR
(u.Email_Extra_1 > '') AND (b.sBounceType = 'HB' OR
b.sBounceType = 'SB' OR
b.sBounceType = 'GB') OR
(u.Email_Extra_2 > '') AND (b.sBounceType = 'HB' OR
b.sBounceType = 'SB' OR
b.sBounceType = 'GB')
I was asked to improve the performance. Sometimes it will timeout.
There 34 rows in Admin_Mass_Email_Addresses
There are 33,400 rows in bouncelog
There are 133,000 rows in Users and Users_Info
There are 60 rows in admin_users
That's not many rows at all. What can I do to speed up the process that takes over 10 minutes to complete?
I'm not sure why the joins are setup the way they are, but the result I want is to query that query and get a count grouped by ;
SELECT COUNT(id) AS emailcount, Description, sPopAccount
FROM dbo.QBounceUsers AS b
WHERE (todelete IS NULL) AND (sBounceType = 'HB' OR
sBounceType = 'SB' OR
sBounceType = 'GB') AND (date_added > '02/17/2014 12:00:00 AM') AND (sEmailAddress <> '') AND (Status_Flag = 'A')
GROUP BY Description, sPopAccount
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Well, I've tested with all those combinations but still it will timeout.
This is the code I ended up with;
What schema questions can I answer?
This is the code I ended up with;
SELECT b.id
, b.sPopAccount
, b.sEmailAddress
, b.sBounceType
, b.sDate
, b.sProcessed
, b.todelete
, b.date_added
, u.ID AS userid
, M.Description
, M.Email AS fromemail
FROM dbo.bouncelog AS b INNER JOIN
dbo.Admin_Mass_Email_Addresses AS M ON M.Description = b.sPopAccount INNER JOIN
dbo.Admin_Users AS A ON M.adminID = A.adminID INNER JOIN
dbo.Users AS u ON b.sEmailAddress IN (u.EMAIL_HOME
, u.EMAIL_WORK
, u.Email_Extra_1
, u.Email_Extra_2)
INNER JOIN dbo.Users_Info AS ui ON u.ID = ui.ID AND ui.Status_Flag = 'A'
WHERE (b.sBounceType IN ('HB', 'SB', 'GB')) AND (b.sEmailAddress > '')
What schema questions can I answer?
ASKER
The users table has one primary key, and integer auto-increment id column. And there is one clustered index on that key. I was considering indexing on first and last name and possibly email addresses. Any thoughts?
Why not provide us with an execution plan? (.sqlplan file)
http://msdn.microsoft.com/en-us/library/ms190646.aspx
An execution plan it the first thing to look at for potential optimizations.
http://msdn.microsoft.com/en-us/library/ms190646.aspx
An execution plan it the first thing to look at for potential optimizations.
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Here you go.
ASKER
ScottPletcher: I agree, but I'm unable to do that just now. You should see the rest of the DB.
ASKER
SELECT b.id, b.sPopAccount, b.sEmailAddress, b.sBounceType, b.sDate, b.sProcessed, b.todelete, b.date_added, u.ID AS userid, M.Description, M.Email AS fromemail
FROM dbo.bouncelog AS b INNER JOIN
dbo.Admin_Mass_Email_Addresses AS M ON M.Description = b.sPopAccount INNER JOIN
dbo.Admin_Users AS A ON M.adminID = A.adminID INNER JOIN
(SELECT Users.ID, Users_Email.EmailAddress
FROM dbo.Users CROSS APPLY(VALUES ((EMAIL_HOME),
(EMAIL_WORK),
(Email_Extra_1),
(Email_Extra_2))
AS Users_Email(EmailAddress)
WHERE Users_Email.EmailAddress > '') AS u ON u.EmailAddress = b.sEmailAddress LEFT OUTER JOIN
dbo.Users_Info AS ui ON u.ID = ui.ID AND ui.Status_Flag = 'A'
WHERE (b.sBounceType IN ('HB', 'SB', 'GB')) AND (b.sEmailAddress > '')
Error near VALUES
This is my first attempt at using CROSS APPLY
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
It's SQL Server 2012
ASKER
Compatibility level is 2005 (90)
Oh wait, you're missing a closing paren as well, but that should give you a different error than one near "VALUES":
(Email_Extra_2)))
(Email_Extra_2)))
ASKER
Incorrect syntax near 'values'
>> Compatibility level is 2005 (90) <<
As I stated earlier:
"If your db is set to use SQL 2005 level, the VALUES clause won't work directly."
Try running it from a different db and qualifying the db name on the tables names in the FROM clause.
Why on earth are you still in 2005 mode on a 2012 instance??
As I stated earlier:
"If your db is set to use SQL 2005 level, the VALUES clause won't work directly."
Try running it from a different db and qualifying the db name on the tables names in the FROM clause.
Why on earth are you still in 2005 mode on a 2012 instance??
ASKER
Good question. And I will get an answer.
ASKER
I can't find a reason but suspect it was a matter of migration over the years. I can't find a good reason to not change it to level 110.
Very likely that's all it is. The db gets restored and no one even thinks about having to reset its compatibility level.
ASKER
Well, I feel like an idiot. I'm using a local copy of SSMS that's 2012 but the remote SQL server is actually 2005. When will I ever learn?
I added an index of the email columns and the query actually runs now (instead of timing out).
I added an index of the email columns and the query actually runs now (instead of timing out).
Scott has proposed the use of CROSS APPLY with VALUES; this is an efficient method to "unpivot" data, but is version dependent. In earlier versions you can produce the same effect by using UNION ALL (it's not as efficient though)
If not useful for that query, you may find this useful when normalizing that data in the future.
select
id as user_id
, ca1.EmailAddress
from users
cross apply (
select EMAIL_HOME
union all
select EMAIL_WORK
union all
select Email_Extra_1
union all
select Email_Extra_2
) AS ca1 (EmailAddress)
;
| USER_ID | EMAILADDRESS |
|---------|--------------|
| 1 | x1@2 |
| 1 | x1@3 |
| 1 | x1@4 |
| 1 | x1@5 |
| 2 | x2@1 |
| 2 | x2@2 |
| 2 | x2@3 |
| 2 | x2@4 |
CREATE TABLE Users
([ID] int, [EMAIL_HOME] varchar(4), [EMAIL_WORK] varchar(4), [Email_Extra_1] varchar(4), [Email_Extra_2] varchar(4))
;
INSERT INTO Users
([ID], [EMAIL_HOME], [EMAIL_WORK], [Email_Extra_1], [Email_Extra_2])
VALUES
(1, 'x1@2', 'x1@3', 'x1@4', 'x1@5'),
(2, 'x2@1', 'x2@2', 'x2@3', 'x2@4')
;
http://sqlfiddle.com/#!3/4bc4e/1
{+ edit} It may be worth trying this in lieu of the cross apply/value in Scott's query.If not useful for that query, you may find this useful when normalizing that data in the future.
ASKER
Indexing the users table on email addresses seemed to help performance the most.
Open in new window