We help IT Professionals succeed at work.

We've partnered with Certified Experts, Carl Webster and Richard Faulkner, to bring you two Citrix podcasts. Learn about 2020 trends and get answers to your biggest Citrix questions!Listen Now

x

Slow running query

plq
plq asked
on
Medium Priority
501 Views
Last Modified: 2010-07-27
Please don't comment on this question unless you have excellent knowledge of sql server query plans and query analysis. This seems to be a query optimiser problem.


This select into a temp table, and then use the temp table in a join...


Select a.WidgetID, Sum(a.Amount * (b.Sign)) As WidgetValue Into #temp From  WidgetTransaction As a inner JOIN TransactionType As b On a.TransactionTypeID = b.TransactionTypeID inner JOIN Widget As aass On a.WidgetID = aass.WidgetID inner JOIN CurrentPeriod As cp On aass.CompanyID = cp.CompanyID  Where a.BookID = 1 And b.TransactionTypeClass In ('ADDITION') And cp.UserID = 'ADMIN' And (a.ApplyDate <= cp.EndDate  OR  (a.TransactionTypeID = 1 And aass.DatePurchased <= cp.EndDate)) And a.BudgetID > 0 Group By a.WidgetID

Select  top 61 a.WidgetID,a.WidgetRef,a.Barcode,a.WidgetDesc,a.DatePurchased
,atc3.WidgetValue As CurrencyOriginalValue
From  Widget As a
left  JOIN #temp As atc3 On a.WidgetID = atc3.WidgetID  
Order By a.DatePurchased

... runs very fast and the query plan shows correct data volumes at each branch. Each join is a hash join .



When I try to combine the temp table into the main query via a subselect ....


Select  top 61 a.WidgetID,a.WidgetRef,a.Barcode,a.WidgetDesc,a.DatePurchased
,atc3.WidgetValue As CurrencyOriginalValue
From  Widget As a
left  JOIN (Select a.WidgetID, Sum(a.Amount * (b.Sign)) As WidgetValue From  WidgetTransaction As a inner JOIN TransactionType As b On a.TransactionTypeID = b.TransactionTypeID inner JOIN Widget As aass On a.WidgetID = aass.WidgetID inner JOIN CurrentPeriod As cp On aass.CompanyID = cp.CompanyID  Where a.BookID = 1 And b.TransactionTypeClass In ('ADDITION') And cp.UserID = 'ADMIN' And (a.ApplyDate <= cp.EndDate  OR  (a.TransactionTypeID = 1 And aass.DatePurchased <= cp.EndDate)) And a.BudgetID > 0 Group By a.WidgetID) As atc3 On a.WidgetID = atc3.WidgetID  
Order By a.DatePurchased


... this runs relatively slowly. And the final join is a showing a "Table Spool/Lazy Spool" into a nested loops join, and the intermediate table spool itself has 139,000 rows in it. Even though when I run the inner select statement on its own it correctly produces 2300 records.

The main Widget table has 2300 rows and the WidgetTransaction table has 21000 rows.

The problem doesn't occur when I cut down the inner query to a much simpler select statement (even though it selects more records)

When I force it to be a HASH join the problem goes away.

OK it sounds like I've answered my own question with forcing a hash join but I want to understand WHY sql server is choosing a table spool/lazy sppol into a holding table of 139000 rows and is there some logic problem in my sql thats kicked this off ?
Comment
Watch Question

Hi,

just my 2 cents:

Instead of Select .. Into you should use
Insert INTO ... SELECT
and before that create the temp table manually. this way you can set Indexes and primary key.
if your temp table is indexed correctly, the subselect should be much faster.

And one more thing about this statement: I can't see that you join on a temp table here. the tables you are using are normal tables

Select  top 61 a.WidgetID,a.WidgetRef,a.Barcode,a.WidgetDesc,a.DatePurchased
,atc3.WidgetValue As CurrencyOriginalValue
From  Widget As a
left  JOIN (Select a.WidgetID, Sum(a.Amount * (b.Sign)) As WidgetValue From  WidgetTransaction As a inner JOIN TransactionType As b On a.TransactionTypeID = b.TransactionTypeID inner JOIN Widget As aass On a.WidgetID = aass.WidgetID inner JOIN CurrentPeriod As cp On aass.CompanyID = cp.CompanyID  Where a.BookID = 1 And b.TransactionTypeClass In ('ADDITION') And cp.UserID = 'ADMIN' And (a.ApplyDate <= cp.EndDate  OR  (a.TransactionTypeID = 1 And aass.DatePurchased <= cp.EndDate)) And a.BudgetID > 0 Group By a.WidgetID) As atc3 On a.WidgetID = atc3.WidgetID  
Order By a.DatePurchased


HTH

Andy
plq

Author

Commented:
insert into... would be logged and therefore cause excessive transaction log usage on what are queries not updates - that wouldnt scale well. "Select into" is not logged in tempdb (or any db with selectinto/bulkcopy ticked). The temp table is narrow (two columns) so an index isn't going to be any better than a table scan.

The select statement you pasted has the sql source used to create the temp table joined in "in situ"

   ......left join   #temp        as atc3 on .....      vs
   ......left join (select .......) as atc3 on .....

The problem is that the query optimiser is choosing a very inefficient query plan when the temp table select statement is placed inside [......left join (select .......) as atc3 on .....] and I want to understand if there's anything in the sql thats triggering that.

Looking at merge join rules I wondered if I could convince the optimiser to choose a merge join by sorting the inner subquery. So I added an order by (and a top predicate to allow the order by to compile) but still its doing a loop join and a ridiculously large intermediary table in the query plan

The original query that brought this problem to my attention had four subqueries and off the same tables was spooling 5 million rows in each table loop intermediary.

thanks

Senior DBA
CERTIFIED EXPERT
Most Valuable Expert 2018
Distinguished Expert 2019
Commented:
My best guess is that SQL is determing the whole query plan prior to creating the derived table.  Furthermore, it is mis-estimating the number of rows that the inner query will produce (either because full stats aren't available or, more likely, because it uses some default % of the total possible rows).  

SQL thus perhaps estimates that the inner query will be a large result set and the outer one a small result set (relatively).  From BOL:
"A nested loops join is particularly effective if the outer input is quite small and the inner input is preindexed and quite large."

Once a full result set is in a temp table, though, SQL knows the number of rows and so doesn't think one table to be joined will be much large than the other.

As for resolving it, not sure, since, AFAIK, SQL doesn't gen stats for a derived table.

Not the solution you were looking for? Getting a personalized solution is easy.

Ask the Experts
Access more of Experts Exchange with a free account
Thanks for using Experts Exchange.

Create a free account to continue.

Limited access with a free account allows you to:

  • View three pieces of content (articles, solutions, posts, and videos)
  • Ask the experts questions (counted toward content limit)
  • Customize your dashboard and profile

*This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

OR

Please enter a first name

Please enter a last name

8+ characters (letters, numbers, and a symbol)

By clicking, you agree to the Terms of Use and Privacy Policy.