Improve company productivity with a Business Account.Sign Up

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 356
  • Last Modified:

SQL query optimization

We have an existing query, and I need to add a new field to it called EffectiveDate.

SELECT CA.AccountNumber, CA.Name, Agent.Name, CA.InstallmentsMadeCount,
 ( SELECT TOP 1 ip.PolicyNumber
                FROM [InsurancePolicy] ip
                JOIN [PremiumFinanceLoan] ON ip.[PremiumFinanceLoanId] = [PremiumFinanceLoan].[PremiumFinanceLoanId]
                where [PremiumFinanceLoan].[CustomerAccountId] =ca.[PartyId]
                order by ip.InsurancePolicyId 
        ) AS PolicyNumber
FROM CustomerAccount as CA
INNER JOIN Entity AS Agent ON Agent.PartyId = CA.AgentId
LEFT OUTER JOIN [Entity] s ON CA.[SubmittedById] = s.[PartyId]
LEFT JOIN [Address] ON ([Address].PartyId = CA.PartyId AND [Address].IsPrimary = 1) 
LEFT JOIN Region ON ([Address].RegionId = Region.RegionId ) 

Open in new window

The code to add my new field is:
( SELECT TOP 1 ip.EffectiveDate
                    FROM [InsurancePolicy] ip
                    JOIN [PremiumFinanceLoan] ON ip.[PremiumFinanceLoanId] = [PremiumFinanceLoan].[PremiumFinanceLoanId]
                    where [PremiumFinanceLoan].[CustomerAccountId] =ca.[PartyId]
                    order by ip.InsurancePolicyId 
        ) AS EffectiveDate

Open in new window

However, it would not be efficient to run these two subqueries separately, especially since PolicyNumber and EffectiveDate are pulled from the same exact tables. Also, using TOP 1 is not efficient either.

How would you re-write this query for maximum efficiency/performance?

I'm thinking of JOINing to a subquery that selects both fields, and instead of using TOP 1, use MIN to find the smallest InsurancePolicyId, and use that in WHERE clause of subquery.
0
pzozulka
Asked:
pzozulka
  • 8
  • 5
  • 2
  • +1
3 Solutions
 
DultonCommented:
use row_number to order them in a cte, then join to the cte on the same customer account id field and where the order = 1
;With PolicyOrder AS
(
SELECT ip.PolicyNumber
, ip.EffectiveDate
, pfl.CustomerAccountId
, ROW_NUMBER() OVER(PARTITION BY pfl.CustomerAccountId ORDER BY ip.InsurancePolicyId) AS [PolicyRowOrder]
FROM InsurancePolicy AS ip
INNER JOIN PremiumFinanceLoad AS pfl
ON ip.PremiumFinanceLoanId = pfl.PremiumFinanceLoanId

)
SELECT CA.AccountNumber, CA.Name, Agent.Name, CA.InstallmentsMadeCount
,po.PolicyNumber
,po.EffectiveDate
FROM CustomerAccount as CA
INNER JOIN Entity AS Agent ON Agent.PartyId = CA.AgentId
LEFT OUTER JOIN [Entity] s ON CA.[SubmittedById] = s.[PartyId]
LEFT JOIN [Address] ON ([Address].PartyId = CA.PartyId AND [Address].IsPrimary = 1) 
LEFT JOIN Region ON ([Address].RegionId = Region.RegionId )
LEFT JOIN PolicyOrder AS po 
ON  ca.[PartyId] = po.[CustomerAccountId]
AND po.PolicytRowOrder = 1

Open in new window

0
 
PaulCommented:
no points pl.
I too would suggest row_number()

is there any particular reason for choosing the first policy based on the lowest InsurancePolicyId?

with row_number() you have the ability to order by - for example - that date field  
(see line 6 above, there is an order by in that line)

also note it isn't necessary to use a CTE, a derived table could be used instead
LEFT JOIN (
        SELECT ip.PolicyNumber
        , ip.EffectiveDate
        , pfl.CustomerAccountId
        , ROW_NUMBER() OVER(PARTITION BY pfl.CustomerAccountId ORDER BY ip.InsurancePolicyId) AS [PolicyRowOrder]
        FROM InsurancePolicy AS ip
        INNER JOIN PremiumFinanceLoad AS pfl
        ON ip.PremiumFinanceLoanId = pfl.PremiumFinanceLoanId
) AS po 
ON  ca.[PartyId] = po.[CustomerAccountId]
AND po.PolicytRowOrder = 1

Open in new window

0
 
pzozulkaAuthor Commented:
is there any particular reason for choosing the first policy based on the lowest InsurancePolicyId?

Because InsurancePolicyId is a primary key, it would mean that the Policy Effective Date would also be the smallest if we used MIN(InsurancePolicyId).

The specs for this report say the new field should display the Policy Effective Date of the first Policy on the account. Since there can be multiple policies, which start on different dates, we would need to get the first one.
0
What Kind of Coding Program is Right for You?

There are many ways to learn to code these days. From coding bootcamps like Flatiron School to online courses to totally free beginner resources. The best way to learn to code depends on many factors, but the most important one is you. See what course is best for you.

 
PaulCommented:
>>"The specs for this report say the new field should display the Policy Effective Date of the first Policy on the account. "

Dulton's query should do exactly that
0
 
pzozulkaAuthor Commented:
I tested both suggestions against my original approach to simply add a field by using the below subquery in the SELECT clause:
( SELECT TOP 1 ip.EffectiveDate
                    FROM [InsurancePolicy] ip
                    JOIN [PremiumFinanceLoan] ON ip.[PremiumFinanceLoanId] = [PremiumFinanceLoan].[PremiumFinanceLoanId]
                    where [PremiumFinanceLoan].[CustomerAccountId] =ca.[PartyId]
                    order by ip.InsurancePolicyId 
        ) AS EffectiveDate

Open in new window

I tested by running both queries at the same time and enabling the output of "Actual Execution Plan". It shows that my query cost took 35% of the batch, while your suggestion cost 65%.

I'm a bit surprised at these results as it wouldn't make sense to run a subquery twice.
0
 
Scott PletcherSenior DBACommented:
1) I suggest using OUTER APPLY, as below.
2) You can't rely on the % estimates.  I've seen wildly wrong estimates there.
3) TOP (1) is not inherently inefficient.


SELECT CA.AccountNumber, CA.Name, Agent.Name, CA.InstallmentsMadeCount,
    ip_data.PolicyNumber, ip_data.EffectiveDate
FROM CustomerAccount as CA
INNER JOIN Entity AS Agent ON Agent.PartyId = CA.AgentId
LEFT OUTER JOIN [Entity] s ON CA.[SubmittedById] = s.[PartyId]
LEFT JOIN [Address] ON ([Address].PartyId = CA.PartyId AND [Address].IsPrimary = 1)
LEFT JOIN Region ON ([Address].RegionId = Region.RegionId )  
OUTER APPLY
 ( SELECT TOP (1) ip.PolicyNumber, ip.EffectiveDate
                FROM [InsurancePolicy] ip
                JOIN [PremiumFinanceLoan] ON ip.[PremiumFinanceLoanId] = [PremiumFinanceLoan].[PremiumFinanceLoanId]
                where [PremiumFinanceLoan].[CustomerAccountId] =ca.[PartyId]
                order by ip.InsurancePolicyId
        ) AS ip_data
0
 
pzozulkaAuthor Commented:
ScottPletcher: Using your query the result set went from 132 rows to 637 rows.
0
 
Scott PletcherSenior DBACommented:
Try CROSS APPLY instead of OUTER APPLY.  (I guess.  I don't see why another OUTER join would add rows; are the WHERE conditions the same on both queries?)


SELECT CA.AccountNumber, CA.Name, Agent.Name, CA.InstallmentsMadeCount,
...
CROSS APPLY
 ( SELECT TOP (1) ip.PolicyNumber, ip.EffectiveDate
    ...
0
 
pzozulkaAuthor Commented:
Same effect.
0
 
DultonCommented:
-Hold the press.

you have a bunch of left/outer joins.... but none of the outer tables are being returned in the result set.....

that makes all of those joins un-necessary.

see if this performs any better, and gives the right results.

;With  PolicyOrder AS
(
SELECT ip.PolicyNumber
, ip.EffectiveDate
, pfl.CustomerAccountId
, ROW_NUMBER() OVER(PARTITION BY pfl.CustomerAccountId ORDER BY ip.InsurancePolicyId) AS [PolicyRowOrder]
FROM InsurancePolicy AS ip
INNER JOIN PremiumFinanceLoad AS pfl
ON ip.PremiumFinanceLoanId = pfl.PremiumFinanceLoanId
)
SELECT CA.AccountNumber
, CA.Name
, Agent.Name
, CA.InstallmentsMadeCount
,po.PolicyNumber
,po.EffectiveDate
FROM CustomerAccount as CA
INNER JOIN Entity AS Agent ON Agent.PartyId = CA.AgentId
LEFT JOIN PolicyOrder AS po
ON  ca.[PartyId] = po.[CustomerAccountId]
AND po.PolicytRowOrder = 1

Open in new window

0
 
pzozulkaAuthor Commented:
The outer tables are being returned in the result set, but I did not paste the entire query, just the essentials needed to answer my question.
0
 
pzozulkaAuthor Commented:
Can someone also explain why ca.[PartyId] in the query below returns the error below, while the same thing in the CROSS APPLY subquery seems to be just fine?
The multi-part identifier "ca.PartyId" could not be bound.

LEFT JOIN (
      SELECT MIN(ip.InsurancePolicyId) as InsurancePolicyId, [PremiumFinanceLoan].CustomerAccountId
                  FROM [InsurancePolicy] ip
                JOIN [PremiumFinanceLoan] ON ip.[PremiumFinanceLoanId] = [PremiumFinanceLoan].[PremiumFinanceLoanId]
            WHERE [PremiumFinanceLoan].[CustomerAccountId] =ca.[PartyId]
                  GROUP BY [PremiumFinanceLoan].CustomerAccountId
              ) AS InsPolTb2 ON whatever = whatever

CROSS APPLY
 ( SELECT TOP (1) ip.PolicyNumber, ip.EffectiveDate
                FROM [InsurancePolicy] ip
                JOIN [PremiumFinanceLoan] ON ip.[PremiumFinanceLoanId] = [PremiumFinanceLoan].[PremiumFinanceLoanId]
                where [PremiumFinanceLoan].[CustomerAccountId] =ca.[PartyId]
                order by ip.InsurancePolicyId
        ) AS InsPolTb
0
 
Scott PletcherSenior DBACommented:
>> ScottPletcher: Using your query the result set went from 132 rows to 637 rows. <<

Still seems very odd.  Just adding an OUTER APPLY should not increase the number of rows for this query.
0
 
pzozulkaAuthor Commented:
Here's the entire FROM clause below -- not leaving anything out:
   ...     , ip_data.EffectiveDate
        , ip_data.PolicyNumber
        
FROM CustomerAccount as CA
INNER JOIN Entity AS Agent ON Agent.PartyId = CA.AgentId
LEFT OUTER JOIN [Entity] s ON CA.[SubmittedById] = s.[PartyId]
LEFT JOIN [Address] ON ([Address].PartyId = CA.PartyId AND [Address].IsPrimary = 1) 
LEFT JOIN Region ON ([Address].RegionId = Region.RegionId )
CROSS APPLY 
 ( SELECT TOP (1) ip.PolicyNumber, ip.EffectiveDate
                FROM [InsurancePolicy] ip
                JOIN [PremiumFinanceLoan] ON ip.[PremiumFinanceLoanId] = [PremiumFinanceLoan].[PremiumFinanceLoanId]
                where [PremiumFinanceLoan].[CustomerAccountId] =ca.[PartyId]
                order by ip.InsurancePolicyId 
        ) AS ip_data

Open in new window

0
 
Scott PletcherSenior DBACommented:
So running the query w/o the CROSS APPLY only returns 132 rows, but with the CA it returns 637 rows?  

So can the same PartyID appear multiple times in the outer query?
0
 
pzozulkaAuthor Commented:
I'm sorry, I accidentally switched to a different database. CROSS APPLY works fine, and according to execution plan costs 49% while original query costs 51%. You said not to trust this.

In that case, what do you recommend I use to check performance metrics? The database I'm using in dev environment does not allow me to check how long the execution time is as both execute in 0 seconds.

The production database is HUGE.
0
 
Scott PletcherSenior DBACommented:
Look at the Estimated Query Plan.  SQL can generate that in a second or so, no matter how large the tables are.

Presumably the CROSS APPLY plan should have a single scan (at most) of the underlying table, with a lazy spool.

The one with two different TOP (1) subqueries should show multiple lookups/scans of the underlying table: SQL could implement that logic different ways, would have to see it to be sure.
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

Join & Write a Comment

Featured Post

What Kind of Coding Program is Right for You?

There are many ways to learn to code these days. From coding bootcamps like Flatiron School to online courses to totally free beginner resources. The best way to learn to code depends on many factors, but the most important one is you. See what course is best for you.

  • 8
  • 5
  • 2
  • +1
Tackle projects and never again get stuck behind a technical roadblock.
Join Now