SQL Query: Store calculation to be used later in query

I'm building a report for a client which has millions of records so performance is key.

Column1 (AvgDailyBalance - line 83): The requirements requested include a column which is a calculation of amounts ($$$).
Column2 (InterestExpense): The next column asks to multiply the first column by a constant number. For the sake of performance, I don't want to run my calculations for 1st column again. I was thinking of somehow storing the results of the orginal calculations to get the amount for column1, and then using the stored results for Column2.

Not sure how to accomplish this.

DECLARE @startDate datetime = '4/1/2012'        
DECLARE @endDate datetime = '12/31/2012' 
DECLARE @reportType VARCHAR(50) = 'Agent'
DECLARE @monthlyCostPerAccount MONEY = 25.00
DECLARE @costOfFunds MONEY = 5.00

SELECT DISTINCT 
	act.GlAccountTypeId, act.Name AS AccountType, at.GlAbstractTypeId, at.Name AS AbstractType, amt.GlAccountingMapTypeId, 
	amt.Name AS AccountingMapType, a.GlAccountId, a.AccountName, ammt.GlAccountingMapModifierTypeId, ammt.Name AS GlAccountingMapModifierType
INTO 
	#tmpMap
FROM  
	(
		SELECT DISTINCT 
			GlAccountingMapTypeId, GlAbstractTypeId, GlAccountId, GlAccountingMapModifierTypeId
		FROM 
			GlAccountingMap
	) am
	JOIN GlAbstractType at ON am.GlAbstractTypeId = at.GlAbstractTypeId
	JOIN GlAccount a ON am.GlAccountId = a.GlAccountId
	JOIN GlAccountType act ON a.GlAccountTypeId = act.GlAccountTypeId
	JOIN GlAccountingMapType amt ON am.GlAccountingMapTypeId = amt.GlAccountingMapTypeId
	JOIN GlAccountingMapModifierType ammt ON am.GlAccountingMapModifierTypeId = ammt.GlAccountingMapModifierTypeId

ORDER BY amt.GlAccountingMapTypeId, at.GlAbstractTypeId, act.GlAccountTypeId, a.GlAccountId, ammt.GlAccountingMapModifierTypeId



;WITH
cteTally10 AS (
		SELECT 0 AS tally UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL
		SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
),
cteTally100 AS (
		SELECT ROW_NUMBER() OVER(ORDER BY c1.tally) - 1 AS tally
		FROM cteTally10 c1
		CROSS JOIN cteTally10 c2
),
cteEntities AS (
		SELECT DISTINCT PartyId
		FROM Entity
)

SELECT 
	RT.*, 
	CONVERT(VARCHAR,DATEPART(MONTH,RT.PeriodDate))+'/'+CONVERT(VARCHAR,DATEPART(YEAR,RT.PeriodDate)) as Period,
	DT.*,
	RF.ReferralAmount,
	dtGL.InterestIncome, dtGL.FeeIncome, dtGLWO.NETWriteOffs ,
--	dtAL.NumberAccounts, dtAl.acctnum
	(
	 (
		(SELECT
			SUM(ent.Amount) as SumBeforeStartPeriod
		 FROM Entity en
		 left join CustomerAccount CuAc on en.PartyId = (CASE WHEN @reportType = 'Agent' THEN CuAc.AgentId
															  WHEN @reportType = 'Submitted By' THEN CuAc.SubmittedById
														 END)
		 left join GlTransaction t on t.PartyId = CuAc.PartyId
		 left join GlEntry ent on ent.GlTransactionId = t.GlTransactionId
		 left join GlAccount a on a.GlAccountId = ent.GlAccountId
		 WHERE 
			en.PartyId = RT.PartyId
			AND t.WhenBooked < RT.PeriodDate 
			AND a.AccountName = 'Accounts Receivable - Premiums'
		) +
		
		(SELECT
			SUM(ent.Amount) as SumBeforeEndPeriod
		 FROM Entity en
		 left join CustomerAccount CuAc on en.PartyId = (CASE WHEN @reportType = 'Agent' THEN CuAc.AgentId
															  WHEN @reportType = 'Submitted By' THEN CuAc.SubmittedById
														 END)
		 left join GlTransaction t on t.PartyId = CuAc.PartyId
		 left join GlEntry ent on ent.GlTransactionId = t.GlTransactionId
		 left join GlAccount a on a.GlAccountId = ent.GlAccountId
		 WHERE 
			en.PartyId = RT.PartyId
			AND t.WhenBooked < DATEADD(MONTH, DATEDIFF(MONTH, 0, RT.PeriodDate) + 1, 0)
			AND a.AccountName = 'Accounts Receivable - Premiums'
		)
	 ) / 2 
	) * -1 as AvgDailyBalance,
	
	@costOfFunds / 100 as CostofFunds,
	
	
	
	
	
	
FROM
(
	SELECT 
		c.PartyId, DATEADD(MONTH, DATEDIFF(MONTH, 0, @startDate) + t.tally, 0) as PeriodDate
	FROM 
		cteTally100 t, cteEntities c
	WHERE 
		t.tally BETWEEN 0 AND DATEDIFF(MONTH, @startDate, @endDate)
) RT

--LEFT JOIN CustomerAccount CustAccount ON RT.PartyId = CustAccount.PartyId

LEFT JOIN 
( ..
...
) DT ON DT.PartyId = RT.PartyId AND 
  DT.createMonth >=  RT.PeriodDate AND
  DT.createMonth < DATEADD(MONTH, DATEDIFF(MONTH, 0, RT.PeriodDate) + 1, 0)
...
...
DROP TABLE #tmpMap 

Open in new window

LVL 8
pzozulkaAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

nemws1Database AdministratorCommented:
In SQL Server you have separate out queries that assign values to variables from queries that request values from variables.  If you try to do so, you'll get:
A SELECT statement that assigns a value to a variable must not be combined with data-retrieval operations.

Open in new window

However, MSSQL is pretty smart.  I would think it does *NOT* do the calculation twice, even though you have that in your query.  It'll only calculate it once and recognize that it's used later on in the same row and use the cached value.  I know this is true when you have something like:
SELECT CONVERT(somefield INTO VARCHAR(40))
FROM sometable
ORDER BY CONVERT(somefield INTO VARCHAR(40))

Open in new window

It only does the convert *once*, even though you're using it in the ORDER BY clause.
0
PortletPaulfreelancerCommented:
One method for reuse of a calculation is to use APPLY (either CROSS APPLY or OUTER APPLY depending on need) e.g.
select


      -- we can reference by alias calculations in the apply below

      substring(astring, pos1, len1)


from (
              select 'http://twitter.com/' as astring
      ) as data

-- this cross apply gives us 2 values in: pos1 & len1

cross apply (
              select charindex(':',astring), length(astring)
             ) as ca1 (pos1, len1)

Open in new window

BUT:
You are using 2 correlated queries in the select clause for your calculations, this is likely to be quite (very?) inefficient particularly as there are many joins involved in each one.

You are also using left joins in both of those correlated queries, but these are useless because of your where clause. For example if a.AccountName = 'Accounts Receivable - Premiums' is necessary then a.AccountName cannot be NULL, and if a.AccountName cannot be NULL you have the same condition as an INNER JOIN. This is true of t.WhenBooked also, it cannot be NULL either. Don't use less efficient LEFT JOINS if you don't have to.
SELECT
			SUM(ent.Amount) as SumBeforeEndPeriod
		 FROM Entity en
		 left join CustomerAccount CuAc on en.PartyId = (CASE WHEN @reportType = 'Agent' THEN CuAc.AgentId
															  WHEN @reportType = 'Submitted By' THEN CuAc.SubmittedById
														 END)
		 left join GlTransaction t on t.PartyId = CuAc.PartyId                 --<< change join
		 left join GlEntry ent on ent.GlTransactionId = t.GlTransactionId      --<< change join
		 left join GlAccount a on a.GlAccountId = ent.GlAccountId              --<< change join
		 WHERE 
			en.PartyId = RT.PartyId

			AND t.WhenBooked < DATEADD(MONTH, DATEDIFF(MONTH, 0, RT.PeriodDate) + 1, 0)
-- cannot be NULL, so inner join t

			AND a.AccountName = 'Accounts Receivable - Premiums'
-- cannot be NULL, so inner join a

Open in new window

But, I would be avoiding these correlated subqueries
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
pzozulkaAuthor Commented:
nemws1: I see what you're saying, but don't know if this applies to our case. Either way, I need to write an optimized query.

PortletPaul:
Thanks. I changed them to inner joins. I really don't see a way of avoiding the correlated subqueries. Perhaps I can combine them into a single subquery instead of adding two of them, but don't see a way of avoiding using them altogether.

Not seeing how the cross apply would work in my situation. Are you suggesting to basically put the subquery into the cross apply, and then select things inside it from the outside? If so, will RT.PartyId be reachable from within the Cross Apply? I ask because if you look at the query posted in my post above, line 104 includes a LEFT JOIN. RT.PartyId is not visible from inside that LEFT JOIN. I'm not sure why, but for the same reason have a feeling that it won't be visible from inside the CROSS APPLY.
0
Ultimate Tool Kit for Technology Solution Provider

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy now.

PortletPaulfreelancerCommented:
This question is about how to reuse a calculation, there are several ways of course, Here an APPLY would seem the most likely (simple) candidate. You may need an OUTER APPLY which will allow NULLs (like an outer join does).

But, I cannot be certain it will work for you, as I cannot test it.

However, I'm afraid I just cannot agree that there is no alternative to the correlated subqueries. Here is a suggestion:
LEFT JOIN (
            SELECT
                  en.PartyId
                , SUM(CASE
                        WHEN t.WhenBooked < RT.PeriodDate THEN ent.Amount END) AS SumBeforeStartPeriod
                , SUM(ent.Amount)                                              AS SumBeforeEndPeriod
            FROM Entity en
                  LEFT JOIN CustomerAccount CuAc
                              ON en.PartyId = (CASE
                                          WHEN @reportType = 'Agent' THEN CuAc.AgentId
                                          WHEN @reportType = 'Submitted By' THEN CuAc.SubmittedById END)
                  INNER JOIN GlTransaction t
                              ON t.PartyId = CuAc.PartyId
                  INNER JOIN GlEntry ent
                              ON ent.GlTransactionId = t.GlTransactionId
                  INNER JOIN GlAccount a
                              ON a.GlAccountId = ent.GlAccountId
            
            WHERE a.AccountName = 'Accounts Receivable - Premiums'
                  AND t.WhenBooked < DATEADD(MONTH, DATEDIFF(MONTH, 0, RT.PeriodDate) + 1, 0)
            
            GROUP BY
                  en.PartyId

         ) as data ON RT.PartyId = data.PartyId

Open in new window

This approach is going to be much more efficient and will probably also solve the reuse of calculation issue as well.

{+edit} I see I have overlooked the RT.PeriodDate issue in the above - sorry. However I woud still be looking for ways to avoid those correlated subqueries.
0
pzozulkaAuthor Commented:
Can you tell me or provide links to explain why correlated subqueries are not good, and why a LEFT JOIN would be better?

I just ran my query with the correlated subquery included -- took 6:22 seconds.
Commented out the correlated subquery -- took 2 seconds.

I'm eager to learn why correlated subqueries are so bad.
0
pzozulkaAuthor Commented:
Should I open a new question about replacing the correlated subqueries. This query crashed my server. :)

I need another way.
0
pzozulkaAuthor Commented:
I have already tried using your code suggestion in post ID: 40275122 and it will not work because it does not see RT.___ fields within the LEFT JOIN.
0
PortletPaulfreelancerCommented:
yes I did mention that in an edit, but moving from correlated subqueries can be achieved I'm sure
0
PortletPaulfreelancerCommented:
>>"should I open a new question about replacing the correlated subqueries. "
yes, I think that would be a good idea
0
PortletPaulfreelancerCommented:
>>"I'm eager to learn why correlated subqueries are so bad."

I don't know of a good reference, or if I can explain it well. I am particularly concerned by correlated subqueries IN THE SELECT CLAUSE

The Select clause is one of the last to be evaluated, that order is:
1: FROM
2: the JOINS
3: WHERE clause
4: GROUP BY clause
5: HAVING clause
6: SELECT clause
7: ORDER BY clause

So correlations in step 6 forces evaluation of factors that should have been considered in steps 1, 2 & 3
- it goes backwards!

Not only that, it goes backwards, PER RECORD! Those queries are evaluated literally, row by row in the resultset, it's sort of similar to looping through a cursor - and it just isn't efficient.

So, try to move the gathering of information into steps 1&2

-------
There is one exception to make for correlated subqueries:
If using EXISTS or NOT EXISTS in the where clause, these require correlated subqueries, but as a general rule these correlated subqueries work  well.

i.e. While most correlated subqueries are things to avoid its different for EXISTS/NOT EXISTS
0
pzozulkaAuthor Commented:
Very well explained. Thank you.
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft SQL Server

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.