T-SQL Optimizer injected sort

I have an execution plan which shows a high cost sort, then a merge join and then a high cost filter. The query runs very slowly by the way. I hear "this is an optimizer injected sort to support a merge join" and the advice is to "create an index to support the join" implying then the sort will be replaced by this index. I only can create a non-clustered index. When I look at the order by on the sort in the plan I see a static value from a support table and a value from a temp table. Its hard for me to believe that these items are "high cost" but none the less.

The question is where is the actual problem is. Is it the order by in the sort? The other half of the merge join which is from a clustered index scan? Or is it before the sort in that half of the tree. This is a really simple query. I have tried to get rid of the scan but I am 3 indexes in and its not working.

Where to start?

Thanks
mike1142Asked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

mike1142Author Commented:
Here is the anonymized plan. I would like other options instead of a rewrite.

Thanks again.
0
PortletPaulfreelancerCommented:
"Here is the anonymized plan. "

did you intend to add the .sqlplan file?  there is no attachment.

It may be necessary to see the query too (unless its contained in the .sqlplan file)
0
mike1142Author Commented:
Hmm, thought I did. Well the query. I just noticed that there is an left outer join and I hadn't expected to see that.

I could do a fiddle but I always find it difficult to know what to include.

Thanks
tmp7F58-Anonymized.sqlplan
0
Ultimate Tool Kit for Technology Solution Provider

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy now.

PortletPaulfreelancerCommented:
Thanks got it, but unable to make much in the way of meaningful comment. There's a table scan (table1 which involves 1.1m rows, but there is an explosion of rows (to 232.8m) when merged with tables 2 and 5 - Then it is sorted prior to a right outer join.

I definitely would need to see the query to make any suggestions - others may be able to see things I can't.
0
mike1142Author Commented:
What do you need from the query aspect? Just SELECT This FROM that. Or do you need the schema and data? I am not sure how much would you need without getting deeply into these nested tables that go on an on. It's like pulling on a thread.
0
mike1142Author Commented:
This is an estimated plan by the way. Can't get it to run in a decent amount of time to grab an actual.
0
PortletPaulfreelancerCommented:
somehow I'm not surprised by that :)
can we see your current query?
0
mike1142Author Commented:
OK I will give it a try

SELECT
d.col1
FROM #Temp AS d
INNER JOIN configtable AS c
ON parameterName = 'Name'
INNER JOIN
AuditTable AS at
ON at.databaseID = c.ParameterValue
AND at.col1 = d.col1
LEFT OUTER JOIN
multiValueTable as mvt
on mvt.databaseID = c.parameterValue
AND mvt.col1 = d.col1
INNER JOIN
StatusTable AS st
on st.databaseID = d.databaseID
AND st.col1 = COALESCE(at.col2, mvt.col2)
AND st.col2 = 'XXX'
WHERE  COALESCE(at.col2, mvt.col2) IS NOT NULL

Open in new window


Hopefully I did not make an idiot out of myself.
0
ste5anSenior DeveloperCommented:
I would not JOIN the configuration table, when possible. And I would take a look a the indices. E.g.

DECLARE @ParameterValue INT;

SELECT	@ParameterValue = ParameterValue 
FROM	configtable c 
WHERE	parameterName = 'Name';

/*
CREATE INDEX IX_AuditTable ON AuditTable ( databaseID, col1, col2 );
CREATE INDEX IX_multiValueTable ON multiValueTable ( databaseID, col1, col2 );
CREATE INDEX IX_StatusTable ON StatusTable ( col2, databaseID, col1 );
CREATE INDEX IX_Temp ON #Temp ( col1 );
*/

SELECT	d.col1
FROM	#Temp AS d	
	INNER JOIN	AuditTable at ON at.databaseID = @ParameterValue AND at.col1 = d.col1
	LEFT JOIN multiValueTable mvt ON mvt.databaseID = @ParameterValue AND mvt.col1 = d.col1
	INNER JOIN StatusTable st ON st.databaseID = d.databaseID AND st.col1 = COALESCE(at.col2, mvt.col2) AND st.col2 = 'XXX'
WHERE  COALESCE(at.col2, mvt.col2) IS NOT NULL;

Open in new window

0
mike1142Author Commented:
ste5an,

I assume I have to put the index for the temp table in the sp? Otherwise how will it know it exists?

Thanks for the input.
0
ste5anSenior DeveloperCommented:
Correct. But you need to measure the performance carefully. Depending on the insert strategy the over-all performance may depend on when the index is created:

A1 Create temp table
A2 Create index on temp table
A3 Populate temp table
A4 Run your query

or

B1 Create temp table
B2 Populate temp table
B3 Create index on temp table
B4 Run your query

There are use case when strategy B is better, e.g. when you do more inserts in separate chunks. But you really need to measure and compare the values.
0
mike1142Author Commented:
ste5an,

Thanks for the temp index. Nice to know and would not have thought indexing something in memory would amount to much.

Having said that. Even with indexes suggested this query plan looks no better. I have a transcription error in my code example. Your code corrected to resemble the real query:

DECLARE @ParameterValue INT;

SELECT	@ParameterValue = ParameterValue 
FROM	configtable c 
WHERE	parameterName = 'Name';

/*
CREATE INDEX IX_AuditTable ON AuditTable ( databaseID, col1, col2 );
CREATE INDEX IX_multiValueTable ON multiValueTable ( databaseID, col1, col2 );
CREATE INDEX IX_StatusTable ON StatusTable ( col2, databaseID, col1 );
CREATE INDEX IX_Temp ON #Temp ( col1 );
*/

SELECT	d.col1
FROM	#Temp AS d	
	INNER JOIN	AuditTable at ON at.databaseID = @ParameterValue AND at.col1 = d.col1
	LEFT JOIN multiValueTable mvt ON mvt.databaseID = @ParameterValue AND mvt.col1 = d.col1
	INNER JOIN StatusTable st ON st.databaseID = d.databaseID AND st.col1 = COALESCE(at.col2, mvt.col2) AND st.col2 = 'XXX'
WHERE  COALESCE(at.col3, mvt.col3) IS NOT NULL;

Open in new window


I left our that the where close is actually using 2 different columns which are related to col2. I have corrected the code. Can you please let me know how this changes things?

Thanks
0
ste5anSenior DeveloperCommented:
Another thing, you should test:
WHERE  NOT at.col3 IS NOT NULL
    OR NOT mvt.col3 IS NOT NULL; 

Open in new window


Here are some possible solutions:

SELECT	d.col1
FROM	#Temp AS d	
	INNER JOIN AuditTable at ON at.databaseID = @ParameterValue AND at.col1 = d.col1
	LEFT JOIN multiValueTable mvt ON mvt.databaseID = @ParameterValue AND mvt.col1 = d.col1
	LEFT JOIN StatusTable st1 ON st1.databaseID = d.databaseID AND st1.col1 = at.col2 AND st1.col2 = 'XXX'
	LEFT JOIN StatusTable st2 ON st2.databaseID = d.databaseID AND st2.col1 = mvt.col2 AND st2.col2 = 'XXX' AND at.col2 IS NULL
WHERE  (NOT at.col3 IS NOT NULL
	OR NOT mvt.col3 IS NOT NULL);

	

SELECT	d.col1
FROM	#Temp AS d	
	INNER JOIN AuditTable at ON at.databaseID = @ParameterValue AND at.col1 = d.col1	
	INNER JOIN StatusTable st ON st.databaseID = d.databaseID AND st.col1 = at.col2 AND st.col2 = 'XXX'
WHERE  at.col3 IS NOT NULL;
UNION
SELECT	d.col1
FROM	#Temp AS d	
	LEFT JOIN multiValueTable mvt ON mvt.databaseID = @ParameterValue AND mvt.col1 = d.col1
	INNER JOIN StatusTable st ON st.databaseID = d.databaseID AND st.col1 = mvt.col2 AND st.col2 = 'XXX'
WHERE  mvt.col3 IS NOT NULL;

Open in new window


It really depends on what your result is needed for.
0
mike1142Author Commented:
I cannot seem to get rid of the table scan on table 3. I am just not coming up with the right combinations of keys/includes to turn it into a seek. Any hints?
0
PortletPaulfreelancerCommented:
frustrating.

Could you update us on what indexes you now have, and identify which table is Table3?
If there has been any change to the query please supply that also.
(from this side of the browser it's hard to know exactly what you have, or have not, done)
0
mike1142Author Commented:
I have essential "rolled back" all of the indexes I have created and trying to start fresh. Here is the original query hopefully clarified.

SELECT 
			D.col8
		FROM 
			#TempTable AS D
				INNER JOIN 
                                ConfigurationTable AS CT
				ON  CT.ParameterName = 'NAME'
				INNER JOIN 				
                                Audit AS AT
				ON  AT.col1 = CT.ParameterValue   
				AND AT.col2 = D.col8                         
				LEFT OUTER JOIN 
                                AuditMult AS AM
				ON  AM.col1 = CT.ParameterValue    
				AND AM.col2 = D.col8                           
				INNER JOIN
                                Status AS ST		
				ON  ST.col1 = D.col3
				AND ST.col3 = COALESCE(AT.col4, AM.col4)
				AND ST.col4 = 'XXX'									
		WHERE 
			COALESCE(AT.col20, AM.7)IS NOT NULL 

Open in new window


So this should be more representative. Don't know if it matters. AT has a non-changeable (not my choice) clustered index on the first 6 columns which are also the primary keys. AM is a subordinate table of AT, same keys + 1. Same clustered index situation.  CT has a clustered index on name and value.

My thoughts based on the estimated execution plan was to first try and change the clustered index scan on AM to a non-clustered seek. No luck, can't even remember what I tried. Ultimately going after the sort on the lower branch by indexing "properly".

Maybe some trivial stats but AT has 3.6 million rows and .5GB disk space. AM has 330K rows and 30MB CT and ST are trivial.

I am stuck right here.
0
PortletPaulfreelancerCommented:
I don't see any reason to alter advice given by st5an on the indexes

Non-Index suggestions

1. Cartesian product?

This looks odd to me:

SELECT
      D.col8
FROM #TempTable AS D
      INNER JOIN ConfigurationTable AS CT
                  ON CT.ParameterName = 'NAME'

normally one would expect to see both tables referenced in the join conditions D.[col-x] and CT.[col-y]

You are getting the effect of a CROSS JOIN (Cartesian product)

If you run this:
SELECT
      count(*)
FROM ConfigurationTable AS CT
WHERE CT.ParameterName = 'NAME'

Is that count more than 1?

2. alternative for COALESCE(AT.col4, AM.col4)


Perhaps try alternates for that join condition.

I would suggest:
      INNER JOIN StatusTable AS st
                  ON st.databaseID = d.databaseID
                        AND st.col2 = 'XXX'
                        AND (
                              st.col1 = at.col2
                            OR
                             (st.col1 = mvt.col2 AND st.col1 IS NULL)
                            )
{+ edit}
a previous "step 3" was removed- sorry
0
ste5anSenior DeveloperCommented:
What is your D.col8 used for? Can it be a distinct list of values or are mulitple possible occurences of values necessary?

A simple optimization could be to UNION the results or even to use to separate queries and materialize the result in a further temporary table.
0
mike1142Author Commented:
Portletpaul. 1 is correct. It is there for a reason, to make a variable column that changes from install to install with a set once and forget it . While it's true you can achieve the same thing with a variable in an sp. this is an attempt to dumb it down a bit more. Once set it can b reference in multiple sp. count = 1.

I'll try the other.

The entire temp is a denominator of a fraction hence the d. The query in question is part of a larger union that creates the numerator as a subset of the denominator.  This section has been narrowed as the culprit of 17 hour run times. So temp table and col8 are unique and col8 provides a link to the other tables used for the numerator.
0
PortletPaulfreelancerCommented:
Then I would call it what it is, a CROSS JOIN

This way everyone will know it is deliberate in the future too.
SELECT
      *
FROM #TempTable AS D
      CROSS JOIN (
                  SELECT -- TOP 1 -- option, guarantee there is only one row
                        *
                  FROM ConfigurationTable
                  WHERE ParameterName = 'NAME'
            ) AS CT

Open in new window

{+ edit} select * isn't meant literally, specify the fields
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
mike1142Author Commented:
OK, so I do not think I can interpret ste5an index suggestions based on what I initially gave as the query. Given the new query and especially the index on Status (st) what should the indexes be?

Original
/*
CREATE INDEX IX_AuditTable ON AuditTable ( databaseID, col1, col2 );
CREATE INDEX IX_multiValueTable ON multiValueTable ( databaseID, col1, col2 );
CREATE INDEX IX_StatusTable ON StatusTable ( col2, databaseID, col1 );
CREATE INDEX IX_Temp ON #Temp ( col1 );
*/

Open in new window


Thanks
0
mike1142Author Commented:
The vendor came in and "fixed" it by basically moving config table into temp and then rearranged the joins to get rid of the coalesce in the last join. I think you all were on the right track with this.

Thanks guys for your help
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft SQL Server

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.