Slow running query

Please don't comment on this question unless you have excellent knowledge of sql server query plans and query analysis. This seems to be a query optimiser problem.


This select into a temp table, and then use the temp table in a join...


Select a.WidgetID, Sum(a.Amount * (b.Sign)) As WidgetValue Into #temp From  WidgetTransaction As a inner JOIN TransactionType As b On a.TransactionTypeID = b.TransactionTypeID inner JOIN Widget As aass On a.WidgetID = aass.WidgetID inner JOIN CurrentPeriod As cp On aass.CompanyID = cp.CompanyID  Where a.BookID = 1 And b.TransactionTypeClass In ('ADDITION') And cp.UserID = 'ADMIN' And (a.ApplyDate <= cp.EndDate  OR  (a.TransactionTypeID = 1 And aass.DatePurchased <= cp.EndDate)) And a.BudgetID > 0 Group By a.WidgetID

Select  top 61 a.WidgetID,a.WidgetRef,a.Barcode,a.WidgetDesc,a.DatePurchased
,atc3.WidgetValue As CurrencyOriginalValue
From  Widget As a
left  JOIN #temp As atc3 On a.WidgetID = atc3.WidgetID  
Order By a.DatePurchased

... runs very fast and the query plan shows correct data volumes at each branch. Each join is a hash join .



When I try to combine the temp table into the main query via a subselect ....


Select  top 61 a.WidgetID,a.WidgetRef,a.Barcode,a.WidgetDesc,a.DatePurchased
,atc3.WidgetValue As CurrencyOriginalValue
From  Widget As a
left  JOIN (Select a.WidgetID, Sum(a.Amount * (b.Sign)) As WidgetValue From  WidgetTransaction As a inner JOIN TransactionType As b On a.TransactionTypeID = b.TransactionTypeID inner JOIN Widget As aass On a.WidgetID = aass.WidgetID inner JOIN CurrentPeriod As cp On aass.CompanyID = cp.CompanyID  Where a.BookID = 1 And b.TransactionTypeClass In ('ADDITION') And cp.UserID = 'ADMIN' And (a.ApplyDate <= cp.EndDate  OR  (a.TransactionTypeID = 1 And aass.DatePurchased <= cp.EndDate)) And a.BudgetID > 0 Group By a.WidgetID) As atc3 On a.WidgetID = atc3.WidgetID  
Order By a.DatePurchased


... this runs relatively slowly. And the final join is a showing a "Table Spool/Lazy Spool" into a nested loops join, and the intermediate table spool itself has 139,000 rows in it. Even though when I run the inner select statement on its own it correctly produces 2300 records.

The main Widget table has 2300 rows and the WidgetTransaction table has 21000 rows.

The problem doesn't occur when I cut down the inner query to a much simpler select statement (even though it selects more records)

When I force it to be a HASH join the problem goes away.

OK it sounds like I've answered my own question with forcing a hash join but I want to understand WHY sql server is choosing a table spool/lazy sppol into a holding table of 139000 rows and is there some logic problem in my sql thats kicked this off ?
LVL 8
plqAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

IT-SchubertzCommented:
Hi,

just my 2 cents:

Instead of Select .. Into you should use
Insert INTO ... SELECT
and before that create the temp table manually. this way you can set Indexes and primary key.
if your temp table is indexed correctly, the subselect should be much faster.

And one more thing about this statement: I can't see that you join on a temp table here. the tables you are using are normal tables

Select  top 61 a.WidgetID,a.WidgetRef,a.Barcode,a.WidgetDesc,a.DatePurchased
,atc3.WidgetValue As CurrencyOriginalValue
From  Widget As a
left  JOIN (Select a.WidgetID, Sum(a.Amount * (b.Sign)) As WidgetValue From  WidgetTransaction As a inner JOIN TransactionType As b On a.TransactionTypeID = b.TransactionTypeID inner JOIN Widget As aass On a.WidgetID = aass.WidgetID inner JOIN CurrentPeriod As cp On aass.CompanyID = cp.CompanyID  Where a.BookID = 1 And b.TransactionTypeClass In ('ADDITION') And cp.UserID = 'ADMIN' And (a.ApplyDate <= cp.EndDate  OR  (a.TransactionTypeID = 1 And aass.DatePurchased <= cp.EndDate)) And a.BudgetID > 0 Group By a.WidgetID) As atc3 On a.WidgetID = atc3.WidgetID  
Order By a.DatePurchased


HTH

Andy
0
plqAuthor Commented:
insert into... would be logged and therefore cause excessive transaction log usage on what are queries not updates - that wouldnt scale well. "Select into" is not logged in tempdb (or any db with selectinto/bulkcopy ticked). The temp table is narrow (two columns) so an index isn't going to be any better than a table scan.

The select statement you pasted has the sql source used to create the temp table joined in "in situ"

   ......left join   #temp        as atc3 on .....      vs
   ......left join (select .......) as atc3 on .....

The problem is that the query optimiser is choosing a very inefficient query plan when the temp table select statement is placed inside [......left join (select .......) as atc3 on .....] and I want to understand if there's anything in the sql thats triggering that.

Looking at merge join rules I wondered if I could convince the optimiser to choose a merge join by sorting the inner subquery. So I added an order by (and a top predicate to allow the order by to compile) but still its doing a loop join and a ridiculously large intermediary table in the query plan

The original query that brought this problem to my attention had four subqueries and off the same tables was spooling 5 million rows in each table loop intermediary.

thanks

0
Scott PletcherSenior DBACommented:
My best guess is that SQL is determing the whole query plan prior to creating the derived table.  Furthermore, it is mis-estimating the number of rows that the inner query will produce (either because full stats aren't available or, more likely, because it uses some default % of the total possible rows).  

SQL thus perhaps estimates that the inner query will be a large result set and the outer one a small result set (relatively).  From BOL:
"A nested loops join is particularly effective if the outer input is quite small and the inner input is preindexed and quite large."

Once a full result set is in a temp table, though, SQL knows the number of rows and so doesn't think one table to be joined will be much large than the other.

As for resolving it, not sure, since, AFAIK, SQL doesn't gen stats for a derived table.
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft SQL Server

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.