Solved

T-SQL Optimizer injected sort

Posted on 2014-08-15
22
187 Views
Last Modified: 2014-08-21
I have an execution plan which shows a high cost sort, then a merge join and then a high cost filter. The query runs very slowly by the way. I hear "this is an optimizer injected sort to support a merge join" and the advice is to "create an index to support the join" implying then the sort will be replaced by this index. I only can create a non-clustered index. When I look at the order by on the sort in the plan I see a static value from a support table and a value from a temp table. Its hard for me to believe that these items are "high cost" but none the less.

The question is where is the actual problem is. Is it the order by in the sort? The other half of the merge join which is from a clustered index scan? Or is it before the sort in that half of the tree. This is a really simple query. I have tried to get rid of the scan but I am 3 indexes in and its not working.

Where to start?

Thanks
0
Comment
Question by:mike1142
  • 12
  • 6
  • 4
22 Comments
 

Author Comment

by:mike1142
Comment Utility
Here is the anonymized plan. I would like other options instead of a rewrite.

Thanks again.
0
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
"Here is the anonymized plan. "

did you intend to add the .sqlplan file?  there is no attachment.

It may be necessary to see the query too (unless its contained in the .sqlplan file)
0
 

Author Comment

by:mike1142
Comment Utility
Hmm, thought I did. Well the query. I just noticed that there is an left outer join and I hadn't expected to see that.

I could do a fiddle but I always find it difficult to know what to include.

Thanks
tmp7F58-Anonymized.sqlplan
0
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
Thanks got it, but unable to make much in the way of meaningful comment. There's a table scan (table1 which involves 1.1m rows, but there is an explosion of rows (to 232.8m) when merged with tables 2 and 5 - Then it is sorted prior to a right outer join.

I definitely would need to see the query to make any suggestions - others may be able to see things I can't.
0
 

Author Comment

by:mike1142
Comment Utility
What do you need from the query aspect? Just SELECT This FROM that. Or do you need the schema and data? I am not sure how much would you need without getting deeply into these nested tables that go on an on. It's like pulling on a thread.
0
 

Author Comment

by:mike1142
Comment Utility
This is an estimated plan by the way. Can't get it to run in a decent amount of time to grab an actual.
0
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
somehow I'm not surprised by that :)
can we see your current query?
0
 

Author Comment

by:mike1142
Comment Utility
OK I will give it a try

SELECT
d.col1
FROM #Temp AS d
INNER JOIN configtable AS c
ON parameterName = 'Name'
INNER JOIN
AuditTable AS at
ON at.databaseID = c.ParameterValue
AND at.col1 = d.col1
LEFT OUTER JOIN
multiValueTable as mvt
on mvt.databaseID = c.parameterValue
AND mvt.col1 = d.col1
INNER JOIN
StatusTable AS st
on st.databaseID = d.databaseID
AND st.col1 = COALESCE(at.col2, mvt.col2)
AND st.col2 = 'XXX'
WHERE  COALESCE(at.col2, mvt.col2) IS NOT NULL

Open in new window


Hopefully I did not make an idiot out of myself.
0
 
LVL 32

Expert Comment

by:Stefan Hoffmann
Comment Utility
I would not JOIN the configuration table, when possible. And I would take a look a the indices. E.g.

DECLARE @ParameterValue INT;

SELECT	@ParameterValue = ParameterValue 
FROM	configtable c 
WHERE	parameterName = 'Name';

/*
CREATE INDEX IX_AuditTable ON AuditTable ( databaseID, col1, col2 );
CREATE INDEX IX_multiValueTable ON multiValueTable ( databaseID, col1, col2 );
CREATE INDEX IX_StatusTable ON StatusTable ( col2, databaseID, col1 );
CREATE INDEX IX_Temp ON #Temp ( col1 );
*/

SELECT	d.col1
FROM	#Temp AS d	
	INNER JOIN	AuditTable at ON at.databaseID = @ParameterValue AND at.col1 = d.col1
	LEFT JOIN multiValueTable mvt ON mvt.databaseID = @ParameterValue AND mvt.col1 = d.col1
	INNER JOIN StatusTable st ON st.databaseID = d.databaseID AND st.col1 = COALESCE(at.col2, mvt.col2) AND st.col2 = 'XXX'
WHERE  COALESCE(at.col2, mvt.col2) IS NOT NULL;

Open in new window

0
 

Author Comment

by:mike1142
Comment Utility
ste5an,

I assume I have to put the index for the temp table in the sp? Otherwise how will it know it exists?

Thanks for the input.
0
 
LVL 32

Expert Comment

by:Stefan Hoffmann
Comment Utility
Correct. But you need to measure the performance carefully. Depending on the insert strategy the over-all performance may depend on when the index is created:

A1 Create temp table
A2 Create index on temp table
A3 Populate temp table
A4 Run your query

or

B1 Create temp table
B2 Populate temp table
B3 Create index on temp table
B4 Run your query

There are use case when strategy B is better, e.g. when you do more inserts in separate chunks. But you really need to measure and compare the values.
0
Highfive + Dolby Voice = No More Audio Complaints!

Poor audio quality is one of the top reasons people don’t use video conferencing. Get the crispest, clearest audio powered by Dolby Voice in every meeting. Highfive and Dolby Voice deliver the best video conferencing and audio experience for every meeting and every room.

 

Author Comment

by:mike1142
Comment Utility
ste5an,

Thanks for the temp index. Nice to know and would not have thought indexing something in memory would amount to much.

Having said that. Even with indexes suggested this query plan looks no better. I have a transcription error in my code example. Your code corrected to resemble the real query:

DECLARE @ParameterValue INT;

SELECT	@ParameterValue = ParameterValue 
FROM	configtable c 
WHERE	parameterName = 'Name';

/*
CREATE INDEX IX_AuditTable ON AuditTable ( databaseID, col1, col2 );
CREATE INDEX IX_multiValueTable ON multiValueTable ( databaseID, col1, col2 );
CREATE INDEX IX_StatusTable ON StatusTable ( col2, databaseID, col1 );
CREATE INDEX IX_Temp ON #Temp ( col1 );
*/

SELECT	d.col1
FROM	#Temp AS d	
	INNER JOIN	AuditTable at ON at.databaseID = @ParameterValue AND at.col1 = d.col1
	LEFT JOIN multiValueTable mvt ON mvt.databaseID = @ParameterValue AND mvt.col1 = d.col1
	INNER JOIN StatusTable st ON st.databaseID = d.databaseID AND st.col1 = COALESCE(at.col2, mvt.col2) AND st.col2 = 'XXX'
WHERE  COALESCE(at.col3, mvt.col3) IS NOT NULL;

Open in new window


I left our that the where close is actually using 2 different columns which are related to col2. I have corrected the code. Can you please let me know how this changes things?

Thanks
0
 
LVL 32

Expert Comment

by:Stefan Hoffmann
Comment Utility
Another thing, you should test:
WHERE  NOT at.col3 IS NOT NULL
    OR NOT mvt.col3 IS NOT NULL; 

Open in new window


Here are some possible solutions:

SELECT	d.col1
FROM	#Temp AS d	
	INNER JOIN AuditTable at ON at.databaseID = @ParameterValue AND at.col1 = d.col1
	LEFT JOIN multiValueTable mvt ON mvt.databaseID = @ParameterValue AND mvt.col1 = d.col1
	LEFT JOIN StatusTable st1 ON st1.databaseID = d.databaseID AND st1.col1 = at.col2 AND st1.col2 = 'XXX'
	LEFT JOIN StatusTable st2 ON st2.databaseID = d.databaseID AND st2.col1 = mvt.col2 AND st2.col2 = 'XXX' AND at.col2 IS NULL
WHERE  (NOT at.col3 IS NOT NULL
	OR NOT mvt.col3 IS NOT NULL);

	

SELECT	d.col1
FROM	#Temp AS d	
	INNER JOIN AuditTable at ON at.databaseID = @ParameterValue AND at.col1 = d.col1	
	INNER JOIN StatusTable st ON st.databaseID = d.databaseID AND st.col1 = at.col2 AND st.col2 = 'XXX'
WHERE  at.col3 IS NOT NULL;
UNION
SELECT	d.col1
FROM	#Temp AS d	
	LEFT JOIN multiValueTable mvt ON mvt.databaseID = @ParameterValue AND mvt.col1 = d.col1
	INNER JOIN StatusTable st ON st.databaseID = d.databaseID AND st.col1 = mvt.col2 AND st.col2 = 'XXX'
WHERE  mvt.col3 IS NOT NULL;

Open in new window


It really depends on what your result is needed for.
0
 

Author Comment

by:mike1142
Comment Utility
I cannot seem to get rid of the table scan on table 3. I am just not coming up with the right combinations of keys/includes to turn it into a seek. Any hints?
0
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
frustrating.

Could you update us on what indexes you now have, and identify which table is Table3?
If there has been any change to the query please supply that also.
(from this side of the browser it's hard to know exactly what you have, or have not, done)
0
 

Author Comment

by:mike1142
Comment Utility
I have essential "rolled back" all of the indexes I have created and trying to start fresh. Here is the original query hopefully clarified.

SELECT 
			D.col8
		FROM 
			#TempTable AS D
				INNER JOIN 
                                ConfigurationTable AS CT
				ON  CT.ParameterName = 'NAME'
				INNER JOIN 				
                                Audit AS AT
				ON  AT.col1 = CT.ParameterValue   
				AND AT.col2 = D.col8                         
				LEFT OUTER JOIN 
                                AuditMult AS AM
				ON  AM.col1 = CT.ParameterValue    
				AND AM.col2 = D.col8                           
				INNER JOIN
                                Status AS ST		
				ON  ST.col1 = D.col3
				AND ST.col3 = COALESCE(AT.col4, AM.col4)
				AND ST.col4 = 'XXX'									
		WHERE 
			COALESCE(AT.col20, AM.7)IS NOT NULL 

Open in new window


So this should be more representative. Don't know if it matters. AT has a non-changeable (not my choice) clustered index on the first 6 columns which are also the primary keys. AM is a subordinate table of AT, same keys + 1. Same clustered index situation.  CT has a clustered index on name and value.

My thoughts based on the estimated execution plan was to first try and change the clustered index scan on AM to a non-clustered seek. No luck, can't even remember what I tried. Ultimately going after the sort on the lower branch by indexing "properly".

Maybe some trivial stats but AT has 3.6 million rows and .5GB disk space. AM has 330K rows and 30MB CT and ST are trivial.

I am stuck right here.
0
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
I don't see any reason to alter advice given by st5an on the indexes

Non-Index suggestions

1. Cartesian product?

This looks odd to me:

SELECT
      D.col8
FROM #TempTable AS D
      INNER JOIN ConfigurationTable AS CT
                  ON CT.ParameterName = 'NAME'

normally one would expect to see both tables referenced in the join conditions D.[col-x] and CT.[col-y]

You are getting the effect of a CROSS JOIN (Cartesian product)

If you run this:
SELECT
      count(*)
FROM ConfigurationTable AS CT
WHERE CT.ParameterName = 'NAME'

Is that count more than 1?

2. alternative for COALESCE(AT.col4, AM.col4)


Perhaps try alternates for that join condition.

I would suggest:
      INNER JOIN StatusTable AS st
                  ON st.databaseID = d.databaseID
                        AND st.col2 = 'XXX'
                        AND (
                              st.col1 = at.col2
                            OR
                             (st.col1 = mvt.col2 AND st.col1 IS NULL)
                            )
{+ edit}
a previous "step 3" was removed- sorry
0
 
LVL 32

Assisted Solution

by:Stefan Hoffmann
Stefan Hoffmann earned 250 total points
Comment Utility
What is your D.col8 used for? Can it be a distinct list of values or are mulitple possible occurences of values necessary?

A simple optimization could be to UNION the results or even to use to separate queries and materialize the result in a further temporary table.
0
 

Author Comment

by:mike1142
Comment Utility
Portletpaul. 1 is correct. It is there for a reason, to make a variable column that changes from install to install with a set once and forget it . While it's true you can achieve the same thing with a variable in an sp. this is an attempt to dumb it down a bit more. Once set it can b reference in multiple sp. count = 1.

I'll try the other.

The entire temp is a denominator of a fraction hence the d. The query in question is part of a larger union that creates the numerator as a subset of the denominator.  This section has been narrowed as the culprit of 17 hour run times. So temp table and col8 are unique and col8 provides a link to the other tables used for the numerator.
0
 
LVL 48

Accepted Solution

by:
PortletPaul earned 250 total points
Comment Utility
Then I would call it what it is, a CROSS JOIN

This way everyone will know it is deliberate in the future too.
SELECT
      *
FROM #TempTable AS D
      CROSS JOIN (
                  SELECT -- TOP 1 -- option, guarantee there is only one row
                        *
                  FROM ConfigurationTable
                  WHERE ParameterName = 'NAME'
            ) AS CT

Open in new window

{+ edit} select * isn't meant literally, specify the fields
0
 

Author Comment

by:mike1142
Comment Utility
OK, so I do not think I can interpret ste5an index suggestions based on what I initially gave as the query. Given the new query and especially the index on Status (st) what should the indexes be?

Original
/*
CREATE INDEX IX_AuditTable ON AuditTable ( databaseID, col1, col2 );
CREATE INDEX IX_multiValueTable ON multiValueTable ( databaseID, col1, col2 );
CREATE INDEX IX_StatusTable ON StatusTable ( col2, databaseID, col1 );
CREATE INDEX IX_Temp ON #Temp ( col1 );
*/

Open in new window


Thanks
0
 

Author Closing Comment

by:mike1142
Comment Utility
The vendor came in and "fixed" it by basically moving config table into temp and then rearranged the joins to get rid of the coalesce in the last join. I think you all were on the right track with this.

Thanks guys for your help
0

Featured Post

How your wiki can always stay up-to-date

Quip doubles as a “living” wiki and a project management tool that evolves with your organization. As you finish projects in Quip, the work remains, easily accessible to all team members, new and old.
- Increase transparency
- Onboard new hires faster
- Access from mobile/offline

Join & Write a Comment

PL/SQL can be a very powerful tool for working directly with database tables. Being able to loop will allow you to perform more complex operations, but can be a little tricky to write correctly. This article will provide examples of basic loops alon…
JSON is being used more and more, besides XML, and you surely wanted to parse the data out into SQL instead of doing it in some Javascript. The below function in SQL Server can do the job for you, returning a quick table with the parsed data.
Via a live example, show how to shrink a transaction log file down to a reasonable size.
Viewers will learn how to use the INSERT statement to insert data into their tables. It will also introduce the NULL statement, to show them what happens when no value is giving for any given column.

771 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now