analyze query plan with profiler

Dear all,

I am reading this: http://www.sql-server-performance.com/2013/how-to-optimize-sql-server-query-plan/

it said many thing  I don't understand:
1) analyze query plan with profiler! how can profiler analyse query plan ? profiler can use to check what has been ran. but analyzes query plan is .... ?
2) "so we can drop IDENTITY column in favor of a composite primary key:", what is the relationship ?
3) "To avoid the table spool reading, let’s try to rewrite this query by using ROW_NUMBER:", what is the use of row_number there that can take out the spool ?
LVL 1
marrowyungSenior Technical architecture (Data)Asked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

ste5anSenior DeveloperCommented:
1) it's referring to the dbForge tool.

2a) Foreign key relationships don't need primary keys. A unique index is sufficient.
2b) Changing the PK by dropping the IDENTIY column is indeed nonsense. It would be sufficient to make that PK non-clustered and add an unique clustered index for ([user_id], login_date).

3a) The ROW_NUMBER() approach means that you create an order criteria which can be later used in a predicate. It is a replacement for correlated subqueries with a partition based on the first and the rest. It is simpler better compiled to an execution plan than a subquery.

3b) There is subtle flaw in the example. The original DELETE based on login_id is wrong or at least error prone. Even when a user cannot login at the same point in time, he can login twice with 1ms difference. Then - depending on the INSERT procedure - the later row may be inserted first.
Thus: The DELETE statements are not logically equal.
1

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
marrowyungSenior Technical architecture (Data)Author Commented:
"1) it's referring to the dbForge tool"

oh, it should be print as "Query Profiler, presented IN indbForge Studio for SQLServer", right?

"2a) Foreign key relationships don't need primary keys. A unique index is sufficient."

yes I agree! but it is not talking about foreign key, right?  basically comparing with primary keys, what is the good of unique index ?

 2b) Changing the PK by dropping the IDENTIY column is indeed nonsense. It would be sufficient to make that PK non-clustered and add an unique clustered index for ([user_id], login_date).
"

why make PK non-clustered and replace it with unique is good and why your unique clustered index do not have DESC ?

What is the diff of PK and unique PK?  I think PK already uniquely identify the each row.


"3a) The ROW_NUMBER() approach means that you create an order criteria which can be later used in a predicate"

the predicate means the condition in the where clause ?

"It is a replacement for correlated subqueries with a partition based on the first and the rest.  It is simpler better compiled to an execution plan than a subquery.
"

just because the OVER statement but not the ROW_NUMBER() function, right?

so the row_number() do not do anything with taking out the table spooling ?

"Thus: The DELETE statements are not logically equal. "

so you mean business logic will show something wrong later as it delete wrong data?
0
ste5anSenior DeveloperCommented:
1) I guess so.

2) They are talking about modifying the table by dropping a column. This is a model change. Without the knowledge of the complete requirements, the reader cannot tell whether this is necessary or not.
Thus it is a poor sample. As it is about optimizing, the writer should optimize, not change the model. (*)

2a) The primary key is only an emphasized candidate key. Every candidate key in a relation (table) must be unique, thus the table should have an UNIQUE CONSTRAINT for it. A UNIQUE INDEX is the implementation detail of the logical UNIQUE CONSTRAINT. Thus this is just about the unknown model.

2b) Cause dropping the column a model change, see 2).
The original table creation statement uses the column inline PK directive. Thus the primary key is created as clustered index. This is a SQL Server default.  Creating a unique clustered index, note the difference index, not constraint, cause constraint would be a model question, is equivalent to the new PK. I have not use a DESC, cause I have just quick-read that article..

3a) Yup. Nope. The entire ROW_NUMBER() OVER () is necessary to get the same ordered values as the correlated subquery does.

3b) The original logic was flawed. The new is not.

(*) Being a developer and DBA over the years, I have developed my own view of point:

There are different roles in the entire software development, deployment and operations life cycle . This is an implication of the separation of concerns principle, when applied to general processes.

In the development role you should only develop. Thus create correct models - fully normalized  - and the necessary access methods like views, procedures, functions. Here I recommend to use functions when they match the development process and concrete problem, even when as a DBA I know that they may introduce severe performance problems.
In the DBA role I only manage the server and also optimize the performance. But I never change the model. When I get to the point that optimizing means changing the model, then a new iteration in the entire process is necessary.
This means: Why do I want to change it? Is the change covered by the requirements? When this is the fact, what is the error in the model (because I believe that there can be only one correct data model for on set of requirements, when these are accurate, complete and consistent)? Or do we need to change the requirements first. Or in short, we have the third role we fulfill: the architect role.

It's important to fulfill only one role at a point in time. Thus I don't think that the article is a good one.
1
Ultimate Tool Kit for Technology Solution Provider

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy now.

marrowyungSenior Technical architecture (Data)Author Commented:
" 2b) Changing the PK by dropping the IDENTIY column is indeed nonsense. It would be sufficient to make that PK non-clustered and add an unique clustered index for ([user_id], login_date)."

wait, by this: unique clustered index for ([user_id], login_date), is that mean userid and login_date has the same sort order physically ?

what if one day developer sort userid and login_date in diff order and it will be very slow ?
0
ste5anSenior DeveloperCommented:
A table can have only one clustered index, because what ever index is chosen, primary or not, the clustered property defines the physical order of that table.

When your question refers to the sample and the fact that I missed that DESC part: as I already wrote I haven't read the sample before my first answer in the depth. And it is detail that is for the general answer irrelevant.

In the concrete case of that sample: What ever index is made the clustered one, it must be  ([user_id], login_date DESC) to work for that sample.

what if one day developer sort userid and login_date in diff order and it will be very slow ?
No. Because "very slow" is not a term that applies here.
Will it get the same performance as when using the original order? Normally not. Will it make a difference? This only depends on the amount of data and the cardinality of the rows per user. When it is about performance: You need to test always different alternatives. In many situations there is no guarantee that a certain logical approach will work. Because the optimizer depends strongly on statistics and cardinality estimates. These are implementation details, which are hard to foresee.
0
marrowyungSenior Technical architecture (Data)Author Commented:
"thus the table should have an UNIQUE CONSTRAINT for it. "

good, so this means PK with unique constraint and it is perfect/good enough already ?

"A UNIQUE INDEX is the implementation detail of the logical UNIQUE CONSTRAINT"

I don't understand this , please explain in another way .

"In the DBA role I only manage the server and also optimize the performance. But I never change the model. When I get to the point that optimizing means changing the model, then a new iteration in the entire process is necessary. "

I think the new iteration in the entire process, can mean system house keeping, ie.g. arrange process run time , review process dependence,  review system daily/weekly/monthly performance with reason. right?

" Or do we need to change the requirements first. Or in short, we have the third role we fulfill: the architect role"

yeah.. if business requirement change the data model need to change.  

in your side, what the architect do ?

"It's important to fulfill only one role at a point in time. "

in our side, it is the same but from time to time, developer here ask DBA to program T-SQL too, which they can't be involve intensively.

"3a) Yup. Nope. The entire ROW_NUMBER() OVER () is necessary to get the same ordered values as the correlated subquery does."

please give some URL/source on how ROW_NUMBER() OVER () is necessary to get the same ordered values , tks. I don't get this bit.
0
ste5anSenior DeveloperCommented:
CONSTRAINTS are part of the logical model. They define properties of the model. Something must be unique or check whether the input is valid.

INDICES are part of the physical implementation. They are the instrument to enforce the logic and its constraints.

The "entire process" includes also reviewing and when necessary modifying the catalog of requirements.
It's not about changes of business requirements. It's about the fact, that the catalog of (the existing) requirements was either not complete or not precise (error free) or not consistent. But none had spotted those weaknesses. But they are now leading to those "errors" or needs to optimize something which needs a model change.

"DBA to program T-SQL too" Well, as long as the DBA can switch to a developer role, then there is nothing wrong about it.

please give some URL/source on how ROW_NUMBER() OVER () is necessary to get the same ordered values , tks. I don't get this bit.
That is the basic problem: The linked article is about that, but is a bad one..
0
marrowyungSenior Technical architecture (Data)Author Commented:
"That is the basic problem: The linked article is about that, but is a bad one.. "

yeah, got it. some time read the wrong article and need you all helps.

"The "entire process" includes also reviewing and when necessary modifying the catalog of requirements.
 It's not about changes of business requirements. It's about the fact, that the catalog of (the existing) requirements was either not complete or not precise (error free) or not consistent."

developer's work !

""DBA to program T-SQL too" Well, as long as the DBA can switch to a developer role, then there is nothing wrong about it."

this is what I am doing ! infrastructure DBA is done for me. I want to do more or I can't help developer here to optimized the query, optimize the query based on infrastructure option can't do much !
e.g. optimized the tempDB structure. just one time and can't do much.


after chatting with you I have this in mind, is that right?

A) Develop a good data modelling, it is the root of everything.
B) Then define good clustered index, by default: primary KEY.
C) Make sure Unique constraint is there, which will create the respective unique index to ensure that.
D) The order of column in the index does matter, the order of index and join does not matter
E) Then query performance not depends on the order of the join for performance tuning except:
  1) FORCE ORDER query hint: https://msdn.microsoft.com/en-us/library/ms181714.aspx
  2) Forcing Query Plans:  https://technet.microsoft.com/en-us/library/cc917694.aspx
0
ste5anSenior DeveloperCommented:
Yup :)
0
marrowyungSenior Technical architecture (Data)Author Commented:
sorry, on which part ?
0
ste5anSenior DeveloperCommented:
*
0
marrowyungSenior Technical architecture (Data)Author Commented:
select *, right ? ::):)
0
ste5anSenior DeveloperCommented:
yup ;)
0
marrowyungSenior Technical architecture (Data)Author Commented:
I am sorry that, as I said, joined a new job and they give me a lot of thing everyday and I don't have time to reply it.
0
marrowyungSenior Technical architecture (Data)Author Commented:
but one little question, is the row_number () over ... is also used to solve triangular joins too ?
0
ste5anSenior DeveloperCommented:
Hae?

What is a triangular JOIN?

ROW_NUMBER() is a efficient way to number your date over a window.
0
marrowyungSenior Technical architecture (Data)Author Commented:
"What is a triangular JOIN?"

a hiden RBAR operation, http://www.sqlservercentral.com/articles/T-SQL/61539/

the problem as it seems it is not a RBAR but actually it is.  the performance is even worse than while/cursor.

" over a window."

over a data set by a query with where clause you mean ?

then remember the row_number() is good anyway.
0
marrowyungSenior Technical architecture (Data)Author Commented:
tks you are great !@
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft SQL Server 2008

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.