SQL Server converting rows to columns T-SQL improvment required

Hi

I have 2 tables, Accounts, Emails. Each account row can have 0 to 3 emails. Is there a better way to rewrite the example below and can you provide some examples?

Many thanks
Andrew
SELECT     t0.FirstName, t0.LastName, e1.Email1, e2.Email2, e3.Email3
FROM         SCS_Accounts AS t0 LEFT OUTER JOIN
                          (SELECT     AccountId, Email AS Email1
                            FROM          SCS_Emails
                            WHERE      ([Order] = 1)) AS e1 ON t0.AccountID = e1.AccountId LEFT OUTER JOIN
                          (SELECT     AccountId, Email AS Email2
                            FROM          SCS_Emails
                            WHERE      ([Order] = 2)) AS e2 ON t0.AccountID = e2.AccountId LEFT OUTER JOIN
                          (SELECT     AccountId, Email AS Email3
                            FROM          SCS_Emails
                            WHERE      ([Order] = 3)) AS e3 ON t0.AccountID = e3.AccountId

Open in new window

j055Asked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Paul_Harris_FusionCommented:
The following will produce the same results.   Whether you think it is better or not is a matter of taste for sql styling.  Personally, I think it is a bit neater and clearer.

SELECT t0.FirstName, t0.LastName
 , Email1 = (Select Email from SCS_Emails where AccountId = SCS_Accounts.AccountId and [Order]=1)
 , Email2 = (Select Email from SCS_Emails where AccountId = SCS_Accounts.AccountId and [Order]=2)
 , Email3 = (Select Email from SCS_Emails where AccountId = SCS_Accounts.AccountId and [Order]=3)
FROM SCS_Accounts

In theory, you could use PIVOT functionality but this is really designed for aggregations and personally I find the syntax very unintuitive.
0
j055Author Commented:
Hi there

Your SQL is easier to read. Are there any performance implications? I want to add a few more columns  from other tables for my actual requirement. The idea is to create a standard view of the flattened data that other queries can use. The view could contain 1000's of rows and I wonder if performance is going to suffer?

Any ideas?
Andrew
0
Paul_Harris_FusionCommented:
I don't believe there is a performance hit since the joins are identical in both cases.   I think the optimiser would generate identical queries under the bonnet.   However,  I am never 100% sure with performance until I  run some comparitive tests.

There is a difference in behaviour in that my sql will error if there is more than 1 email for a specific user with the same order value.    Your sql would generate an extra row in the results.  


0
Cloud Class® Course: Microsoft Azure 2017

Azure has a changed a lot since it was originally introduce by adding new services and features. Do you know everything you need to about Azure? This course will teach you about the Azure App Service, monitoring and application insights, DevOps, and Team Services.

Anthony PerkinsCommented:
I suspect you may find using LEFT OUTER JOINs to be more performant as in:
SELECT  t0.FirstName,
        t0.LastName,
        e1.Email Email1,
        e2.Email Email2,
        e3.Email Email3
FROM    SCS_Accounts t0
        LEFT OUTER JOIN SCS_Emails e1 WHERE t0.AccountID = e1.AccountId AND e1.Order = 1
	LEFT OUTER JOIN SCS_Emails e2 WHERE t0.AccountID = e2.AccountId AND e2.Order = 2
	LEFT OUTER JOIN SCS_Emails e3 WHERE t0.AccountID = e3.AccountId AND e3.Order = 3

Open in new window

0
Anthony PerkinsCommented:
Let's try that again:
SELECT  t0.FirstName,
        t0.LastName,
        e1.Email Email1,
        e2.Email Email2,
        e3.Email Email3
FROM    SCS_Accounts t0
        LEFT OUTER JOIN SCS_Emails e1 ON t0.AccountID = e1.AccountId AND e1.[Order] = 1
        LEFT OUTER JOIN SCS_Emails e2 ON t0.AccountID = e2.AccountId AND e2.[Order] = 2
        LEFT OUTER JOIN SCS_Emails e3 ON t0.AccountID = e3.AccountId AND e3.[Order] = 3

Open in new window

0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
Paul_Harris_FusionCommented:
acperkins - do you have any thoughts as to why the outer join might be quicker?   Functionally, the two approaches are identical and I would have thought that the generated query plan would be identical in both cases.   I would be interested to hear your reasoning and experiences of this.   Hope this doesn't sound argumentative - I am simply interested in learning.

My earlier comments about there being a slight behaviour difference also apply to the outer join approach. The subselect approach defends against duplicate data in the email table whereas any join approach would generate a duplicate person in the result set.

So performance is one issue,  the behaviour is another.
0
Paul_Harris_FusionCommented:
I think if we were retrieving multiple values from the joined tables then I could understand a performance advantage in the join approach.  Since we are retrieving a single value,  I can't see a reason why the sub-select would be any less efficient.
0
j055Author Commented:
Hi

I do need get multiple values from the email tables. I should have provided a better example in my original question, sorry. acperkins example looks good for purposes but I'm not sure how to avoid getting duplicate rows from each of the emails tables. Paul's example would throw errors if more than one email row is returned for each account. I guess I should be making sure I have suitable contraints on the email tables?

I've attached the execution plans for each query if that's any use? They are different. The execution time is about the save with my sample data.

Cheers
Andrew




StmtText
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------


SELECT     t0.FirstName, t0.LastName, e1.Email AS Email1, e2.Email AS Email2, e3.Email AS Email3
FROM       SCS_Accounts AS t0 
			LEFT OUTER JOIN SCS_Emails AS e1 ON t0.AccountID = e1.AccountId AND e1.[Order] = 1 AND e1.AllowLogin = 1 
			LEFT OUTE

(1 row(s) affected)

StmtText
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  |--Hash Match(Right Outer Join, HASH:([e3].[AccountId])=([t0].[AccountID]))
       |--Nested Loops(Inner Join, OUTER REFERENCES:([e3].[EmailId]))
       |    |--Index Seek(OBJECT:([dbo].[SCS_Emails].[<Name of Missing Index, sysname,>] AS [e3]), SEEK:([e3].[Order]=(3)) ORDERED FORWARD)
       |    |--Clustered Index Seek(OBJECT:([dbo].[SCS_Emails].[PK_SCS_Emails] AS [e3]), SEEK:([e3].[EmailId]=[dbo].[SCS_Emails].[EmailId] as [e3].[EmailId]),  WHERE:(CONVERT_IMPLICIT(tinyint,[dbo].[SCS_Emails].[AllowLog
       |--Hash Match(Right Outer Join, HASH:([e2].[AccountId])=([t0].[AccountID]))
            |--Clustered Index Scan(OBJECT:([dbo].[SCS_Emails].[PK_SCS_Emails] AS [e2]), WHERE:([dbo].[SCS_Emails].[Order] as [e2].[Order]=(2) AND CONVERT_IMPLICIT(tinyint,[dbo].[SCS_Emails].[AllowLogin] as [e2].[AllowLo
            |--Hash Match(Right Outer Join, HASH:([e1].[AccountId])=([t0].[AccountID]))
                 |--Clustered Index Scan(OBJECT:([dbo].[SCS_Emails].[PK_SCS_Emails] AS [e1]), WHERE:([dbo].[SCS_Emails].[Order] as [e1].[Order]=(1) AND CONVERT_IMPLICIT(tinyint,[dbo].[SCS_Emails].[AllowLogin] as [e1].[Al
                 |--Clustered Index Scan(OBJECT:([dbo].[SCS_Accounts].[PK_SCS_Accounts] AS [t0]))

(9 row(s) affected)

StmtText
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
	SELECT FirstName, LastName
 , Email1 = (Select Email from SCS_Emails where AccountId = SCS_Accounts.AccountId and [Order]=1) 
 , Email2 = (Select Email from SCS_Emails where AccountId = SCS_Accounts.AccountId and [Order]=2) 
 , Email3 = (Select Email f

(1 row(s) affected)

StmtText
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  |--Compute Scalar(DEFINE:([Expr1017]=[dbo].[SCS_Emails].[Email]))
       |--Merge Join(Left Outer Join, MERGE:([dbo].[SCS_Accounts].[AccountID])=([dbo].[SCS_Emails].[AccountId]), RESIDUAL:([dbo].[SCS_Emails].[AccountId]=[dbo].[SCS_Accounts].[AccountID]))
            |--Compute Scalar(DEFINE:([Expr1012]=[dbo].[SCS_Emails].[Email]))
            |    |--Merge Join(Left Outer Join, MERGE:([dbo].[SCS_Accounts].[AccountID])=([dbo].[SCS_Emails].[AccountId]), RESIDUAL:([dbo].[SCS_Emails].[AccountId]=[dbo].[SCS_Accounts].[AccountID]))
            |         |--Compute Scalar(DEFINE:([Expr1007]=[dbo].[SCS_Emails].[Email]))
            |         |    |--Merge Join(Right Outer Join, MERGE:([dbo].[SCS_Emails].[AccountId])=([dbo].[SCS_Accounts].[AccountID]), RESIDUAL:([dbo].[SCS_Emails].[AccountId]=[dbo].[SCS_Accounts].[AccountID])
            |         |         |--Sort(ORDER BY:([dbo].[SCS_Emails].[AccountId] ASC))
            |         |         |    |--Index Seek(OBJECT:([dbo].[SCS_Emails].[<Name of Missing Index, sysname,>]), SEEK:([dbo].[SCS_Emails].[Order]=(1)) ORDERED FORWARD)
            |         |         |--Clustered Index Scan(OBJECT:([dbo].[SCS_Accounts].[PK_SCS_Accounts]), ORDERED FORWARD)
            |         |--Sort(ORDER BY:([dbo].[SCS_Emails].[AccountId] ASC))
            |              |--Index Seek(OBJECT:([dbo].[SCS_Emails].[<Name of Missing Index, sysname,>]), SEEK:([dbo].[SCS_Emails].[Order]=(2)) ORDERED FORWARD)
            |--Nested Loops(Inner Join, OUTER REFERENCES:([dbo].[SCS_Emails].[EmailId]))
                 |--Index Scan(OBJECT:([dbo].[SCS_Emails].[IX_SCS_Emails_AccountID_Order]),  WHERE:([dbo].[SCS_Emails].[Order]=(3)) ORDERED FORWARD)
                 |--Clustered Index Seek(OBJECT:([dbo].[SCS_Emails].[PK_SCS_Emails]), SEEK:([dbo].[SCS_Emails].[EmailId]=[dbo].[SCS_Emails].[EmailId]) LOOKUP ORDERED FORWARD)

(14 row(s) affected)

Open in new window

0
Anthony PerkinsCommented:
>>Functionally, the two approaches are identical and I would have thought that the generated query plan would be identical in both cases. <<
I very much doubt that the queries are the same.  In general subqueries should be avoided where possible as you will have to execute a separate SQL statement for each row found in the SCS_Accounts and then multiply that by three.  If on the other hand you can build it with an OUTER join and assuming appropriately indexed you will end up with a single resultset and therefore it should be more performant.  But I suspect that you will only see a significant difference if the main table is larger than say 50K rows.

>> I would be interested to hear your reasoning and experiences of this.   Hope this doesn't sound argumentative - I am simply interested in learning.<<
Not at all.  We are all here to learn and I could well be wrong.
0
j055Author Commented:
This is the solution I'll use. I appreciate the discussion. Thanks to both of you.

Andrew
0
Paul_Harris_FusionCommented:
I am still unconvinced about any real difference between 3 outer joins and 3 sub-selects (containing the same constraints as the joined approach) when a single column is being retrieved through each join or select.  I think things would change if more than one column was being reteived through each join.

My impression of the requirement is that exactly one row per person shoud be generated with 3 email fields i.e. one field from each join or sub select.  

The 'exactly one row' requirement makes the sub-select approach a better option in my opinion since it will enforce this.

However, the join approach will work equally as well - just make sure that there is a unique composite index on columns (AccountId, Order) in SCS_Accounts.   This will avoid the duplicates.   If you can't do this for some reason then you definitely need to plan for handling the duplicate issue.   Maybe by using the sub-select approach and selecting min (EMail) rather then EMail in the sub-selects.   This would be a bit of a hack though - better to sort out the data model and put the constraints in place.




0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft SQL Server 2008

From novice to tech pro — start learning today.