?
Solved

Slow running query

Posted on 2006-03-20
3
Medium Priority
?
489 Views
Last Modified: 2010-07-27
Please don't comment on this question unless you have excellent knowledge of sql server query plans and query analysis. This seems to be a query optimiser problem.


This select into a temp table, and then use the temp table in a join...


Select a.WidgetID, Sum(a.Amount * (b.Sign)) As WidgetValue Into #temp From  WidgetTransaction As a inner JOIN TransactionType As b On a.TransactionTypeID = b.TransactionTypeID inner JOIN Widget As aass On a.WidgetID = aass.WidgetID inner JOIN CurrentPeriod As cp On aass.CompanyID = cp.CompanyID  Where a.BookID = 1 And b.TransactionTypeClass In ('ADDITION') And cp.UserID = 'ADMIN' And (a.ApplyDate <= cp.EndDate  OR  (a.TransactionTypeID = 1 And aass.DatePurchased <= cp.EndDate)) And a.BudgetID > 0 Group By a.WidgetID

Select  top 61 a.WidgetID,a.WidgetRef,a.Barcode,a.WidgetDesc,a.DatePurchased
,atc3.WidgetValue As CurrencyOriginalValue
From  Widget As a
left  JOIN #temp As atc3 On a.WidgetID = atc3.WidgetID  
Order By a.DatePurchased

... runs very fast and the query plan shows correct data volumes at each branch. Each join is a hash join .



When I try to combine the temp table into the main query via a subselect ....


Select  top 61 a.WidgetID,a.WidgetRef,a.Barcode,a.WidgetDesc,a.DatePurchased
,atc3.WidgetValue As CurrencyOriginalValue
From  Widget As a
left  JOIN (Select a.WidgetID, Sum(a.Amount * (b.Sign)) As WidgetValue From  WidgetTransaction As a inner JOIN TransactionType As b On a.TransactionTypeID = b.TransactionTypeID inner JOIN Widget As aass On a.WidgetID = aass.WidgetID inner JOIN CurrentPeriod As cp On aass.CompanyID = cp.CompanyID  Where a.BookID = 1 And b.TransactionTypeClass In ('ADDITION') And cp.UserID = 'ADMIN' And (a.ApplyDate <= cp.EndDate  OR  (a.TransactionTypeID = 1 And aass.DatePurchased <= cp.EndDate)) And a.BudgetID > 0 Group By a.WidgetID) As atc3 On a.WidgetID = atc3.WidgetID  
Order By a.DatePurchased


... this runs relatively slowly. And the final join is a showing a "Table Spool/Lazy Spool" into a nested loops join, and the intermediate table spool itself has 139,000 rows in it. Even though when I run the inner select statement on its own it correctly produces 2300 records.

The main Widget table has 2300 rows and the WidgetTransaction table has 21000 rows.

The problem doesn't occur when I cut down the inner query to a much simpler select statement (even though it selects more records)

When I force it to be a HASH join the problem goes away.

OK it sounds like I've answered my own question with forcing a hash join but I want to understand WHY sql server is choosing a table spool/lazy sppol into a holding table of 139000 rows and is there some logic problem in my sql thats kicked this off ?
0
Comment
Question by:plq
3 Comments
 
LVL 7

Expert Comment

by:IT-Schubertz
ID: 16235053
Hi,

just my 2 cents:

Instead of Select .. Into you should use
Insert INTO ... SELECT
and before that create the temp table manually. this way you can set Indexes and primary key.
if your temp table is indexed correctly, the subselect should be much faster.

And one more thing about this statement: I can't see that you join on a temp table here. the tables you are using are normal tables

Select  top 61 a.WidgetID,a.WidgetRef,a.Barcode,a.WidgetDesc,a.DatePurchased
,atc3.WidgetValue As CurrencyOriginalValue
From  Widget As a
left  JOIN (Select a.WidgetID, Sum(a.Amount * (b.Sign)) As WidgetValue From  WidgetTransaction As a inner JOIN TransactionType As b On a.TransactionTypeID = b.TransactionTypeID inner JOIN Widget As aass On a.WidgetID = aass.WidgetID inner JOIN CurrentPeriod As cp On aass.CompanyID = cp.CompanyID  Where a.BookID = 1 And b.TransactionTypeClass In ('ADDITION') And cp.UserID = 'ADMIN' And (a.ApplyDate <= cp.EndDate  OR  (a.TransactionTypeID = 1 And aass.DatePurchased <= cp.EndDate)) And a.BudgetID > 0 Group By a.WidgetID) As atc3 On a.WidgetID = atc3.WidgetID  
Order By a.DatePurchased


HTH

Andy
0
 
LVL 8

Author Comment

by:plq
ID: 16235364
insert into... would be logged and therefore cause excessive transaction log usage on what are queries not updates - that wouldnt scale well. "Select into" is not logged in tempdb (or any db with selectinto/bulkcopy ticked). The temp table is narrow (two columns) so an index isn't going to be any better than a table scan.

The select statement you pasted has the sql source used to create the temp table joined in "in situ"

   ......left join   #temp        as atc3 on .....      vs
   ......left join (select .......) as atc3 on .....

The problem is that the query optimiser is choosing a very inefficient query plan when the temp table select statement is placed inside [......left join (select .......) as atc3 on .....] and I want to understand if there's anything in the sql thats triggering that.

Looking at merge join rules I wondered if I could convince the optimiser to choose a merge join by sorting the inner subquery. So I added an order by (and a top predicate to allow the order by to compile) but still its doing a loop join and a ridiculously large intermediary table in the query plan

The original query that brought this problem to my attention had four subqueries and off the same tables was spooling 5 million rows in each table loop intermediary.

thanks

0
 
LVL 70

Accepted Solution

by:
Scott Pletcher earned 2000 total points
ID: 16241375
My best guess is that SQL is determing the whole query plan prior to creating the derived table.  Furthermore, it is mis-estimating the number of rows that the inner query will produce (either because full stats aren't available or, more likely, because it uses some default % of the total possible rows).  

SQL thus perhaps estimates that the inner query will be a large result set and the outer one a small result set (relatively).  From BOL:
"A nested loops join is particularly effective if the outer input is quite small and the inner input is preindexed and quite large."

Once a full result set is in a temp table, though, SQL knows the number of rows and so doesn't think one table to be joined will be much large than the other.

As for resolving it, not sure, since, AFAIK, SQL doesn't gen stats for a derived table.
0

Featured Post

Free Tool: Port Scanner

Check which ports are open to the outside world. Helps make sure that your firewall rules are working as intended.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

I have a large data set and a SSIS package. How can I load this file in multi threading?
This article shows gives you an overview on SQL Server 2016 row level security. You will also get to know the usages of row-level-security and how it works
Familiarize people with the process of utilizing SQL Server functions from within Microsoft Access. Microsoft Access is a very powerful client/server development tool. One of the SQL Server objects that you can interact with from within Microsoft Ac…
Viewers will learn how the fundamental information of how to create a table.
Suggested Courses

839 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question