right candidates of clustered key/index and unique key/index.

Dear all,

from SQL server performance tuning point of view, the primary key, clustered key and unique key is very important as they define rules on data so that data can be easily access/quickly access.

any suggestion/real world best practice for developer/application designer to design/assign the right clustered key/index and unique key/index?
LVL 1
marrowyungSenior Technical architecture (Data)Asked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

ste5anSenior DeveloperCommented:
I like to separate the development tasks or role from  the DBA role.

In the sense of Don Knuth: Make it run, make it right, make it fast.
Applies as:

Create a correct data model. Normalize it (1NF-4NF, BCNF, DKNF). Especially: Don't denormalize tables. Performance optimization is prime DBA task.

Unique constraints are part of the model and must be used as required. Every candidate key should be  marked by creating a unique constraint for it.
In a normalized model you will not have many tables with more than one candidate key. Thus picking the primary key is obvious.
Whether you create a clustered index or not is a DBA task. Use the defaults. Optimize it later.

So from the development/application designer view point: Just create a correct, complete model.

From Performance tuning point of view:
It does not matter what the primary key is or whether you have unique constraints. It only matters whether the table is heap or not. Then you need a clustered index which supports the main usage pattern. The rest is query tuning by rewriting them and using proper indexing.
Denormalizing may be used in some cases. But: In most of this cases it is a modelling problem: Relational modelling (OLTP) vs. dimensional modelling (OLAP).

The last sentence is important for developers: Often the requirements are not clear. And you'll end up with a plain relational model, but the correct solution would be having an additional dimensional model.
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
marrowyungSenior Technical architecture (Data)Author Commented:
"I like to separate the development tasks or role from  the DBA role. "

sure this is our world in here, infrastructure DBA don't do and know how to tune query as they usually don't know how to add/design an index.

"Create a correct data model. Normalize it (1NF-4NF, BCNF, DKNF). Especially: Don't denormalize tables. Performance optimization is prime DBA task."

why don't denormalize table ? if we see too many join then we need to denormalize it, right?
data warehouse solution can have 2nd normal form, right ?

"Unique constraints are part of the model and must be used as required. Every candidate key should be  marked by creating a unique constraint for it.
 In a normalized model you will not have many tables with more than one candidate key. Thus picking the primary key is obvious."

so primary key is one of the unique key, right ? that's what I am wondering in my other post: http://www.experts-exchange.com/questions/28714594/analyze-query-plan-with-profiler.html#a40981333

so do you mean creating Unique constraints is better then unique index to make each row unique.

"Use the defaults. Optimize it later.
"

defaults means ?

"So from the development/application designer view point: Just create a correct, complete model.
"

yeah, it make the query later on easier.

"Then you need a clustered index which supports the main usage pattern"

so it is not a heap any more and it can be faster as it has an ordered index anyway ?

"The rest is query tuning by rewriting them and using proper indexing. "

yeah, non-clustered later on, right? but I heard that no matter what non-clustered index is created, it is no used BEFORE clustered index create, right?

as the non-clustered index will make SQL server look back to the cluster index anyway as it is the data page itself?

so if the clustered index not helping much we can ADD (not dropping the primary key) unique index to make it faster, right? so clustered index and unique can be create on the same table ?

"But: In most of this cases it is a modelling problem: Relational modelling (OLTP) vs. dimensional modelling (OLAP). "

yeah, I heard about that, it is the root of everything, right?
0
ste5anSenior DeveloperCommented:
why don't denormalize table ? if we see too many join then we need to denormalize it, right?
data warehouse solution can have 2nd normal form, right ?

There is no place for denormalization in the relational model. As you mentioned a data warehouse (DW): A DW uses a different model. In most cases - I hope so - the dimensional model. Here we have a different structure.
The task optimizing by denormalization is often an indicator for the fact, that the developer has chosen the wrong model in this place.
When it's still necessary to make it run, but the model is right, then it is often an architectural flaw: someone chose the wrong stack.

so do you mean creating Unique constraints is better then unique index to make each row unique.
A UNIQUE CONSTRAINT is a logical construct and part of the model, a UNIQUE INDEX is the concrete (physical) implementation used for a UNIQUE CONSTRAINT. It's model vs. implementation detail. As a developer I don't care (not that much) about implementation detail, when I create the model.

Use the defaults. Optimize it later.
As part of the modelling you define your primary keys. Whether they are clustered or not is irrelevant. Thus the first scripts used to create your tables can use the defaults.

clustered vs. heap
Whether you need a clustered or heap table depends on the number of row, on the insert,delete and select patterns.
A clustered index may be used instead of an non-clustered index, when for example other values must be read. Here a clustered index seek can be faster than a non-clustered index seek and a key-lookup.

yeah, I heard about that, it is the root of everything, right?
Yup, it means: failed at step 2 (make it right).
0
Ultimate Tool Kit for Technology Solution Provider

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy now.

Scott PletcherSenior DBACommented:
"Create a correct data model. Normalize it (1NF-4NF, BCNF, DKNF). Especially: Don't denormalize tables. Performance optimization is prime DBA task."

why don't denormalize table ? if we see too many join then we need to denormalize it, right?
data warehouse solution can have 2nd normal form, right ?

This can be confusing if you don't do logical modeling first, before you ever consider physical modeling.  That's very common, to skip/ignore logical modeling, even though it's a huge mistake.

"Tables" don't even exist in a logical model, they are "entities" instead.  The type/number of "joins" that would ultimately be required in a matching physical model is 100% irrelevant, because the logical model does not consider performance in any way.  Computers are assumed to be infinitely fast during the logical modeling stage.
0
marrowyungSenior Technical architecture (Data)Author Commented:
ste5an,


"The task optimizing by denormalization is often an indicator for the fact, that the developer has chosen the wrong model in this place."

so you saw that happen before, right? I am not sure why we can't denormalize  it.

"A UNIQUE CONSTRAINT is a logical construct and part of the model, a UNIQUE INDEX is the concrete (physical) implementation used for a UNIQUE CONSTRAINT. It's model vs. implementation detail. As a developer I don't care (not that much) about implementation detail, when I create the model."

I like that, but it doesn't means that if we define the unique constraint on it the unique index will also be create automatically, right? I am wondering on this .

"Whether you need a clustered or heap table depends on the number of row, on the insert,delete and select patterns.
 A clustered index may be used instead of an non-clustered index, when for example other values must be read. Here a clustered index seek can be faster than a non-clustered index seek and a key-lookup."

you mean is number of row is small, then heap is good as SQL server can read the whole table faster than using index, right? and we save diskspace too.

so if number of row is small, like 1000-2000 rows, cluster index seek will be faster as SQL server just read the whole table into the buffer pool, right?

"failed at step 2 (make it right). "

what is step 2 I am sorry ?

ScottPletcher,

"This can be confusing if you don't do logical modeling first, before you ever consider physical modeling.  That's very common, to skip/ignore logical modeling, even though it's a huge mistake.
"

in here logic modelling is the data modellling, and physical model is the index definition, right?
0
ste5anSenior DeveloperCommented:
why we can't denormalize  it.
You can denormalize. But this is normally a model change. And a model change is always a change or refinement in the requirements. Thus, instead of just implementing this change, start with reviewing the requirements first. This step is necessary for different reasons, like documentation, communication and impact analysis.

I
like that, but it doesn't means that if we define the unique constraint on it the unique index will also be create automatically, right? I am wondering on this .
When I need a unique constraint as defined in the model I also simply declare in T-SQL a UNIQUE CONSTRAINT. SQL Server itself will create this constraint, by creating an unique index. In reverse creating unique indices as optimization step only happens when there exists an UNIQUE CONSTRAINT on the table and the UNIQUE INDEX is just a different, but equivalent way of expressing it. In other words: I don't create UNIQUE INDEXes often while optimizing (model change!).

what is step 2 I am sorry ?
Don Knuth (short version): Make it run, make it run, make it fast.
Long version (basically the entire book):
The real problem is that programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times; premature optimization is the root of all evil (or at least most of it) in programming.
Computer Programming as an Art (1974)
0
Scott PletcherSenior DBACommented:
in here logic modelling is the data modellling, and physical model is the index definition, right?

No.  Google "logical data modeling".  It's a completely separate process that is independent of any physical db.  Therefore, I should be able to take exactly the same logical model and create a corresponding physical model for SQL Server or Oracle or UDB or whatever.
1
ste5anSenior DeveloperCommented:
Additional to what @Scott wrote:

A logical data model has under normal circumstances exactly one physical model. The physical model is a derivation of the logical model.

The reverse process does not exist. You cannot create the logical model from the physical model.
0
marrowyungSenior Technical architecture (Data)Author Commented:
ste5an,

"Thus, instead of just implementing this change, start with reviewing the requirements first. This step is necessary for different reasons, like documentation, communication and impact analysis."

tks, very clear and I agree,

I can get most out of this topic by reading this: http://www.amazon.com/gp/product/1449316409?psc=1&redirect=true&ref_=ox_sc_act_title_6&smid=ATVPDKIKX0DER, right?

any another video and link you can suggest which tell me all that jazz so that I can learn the right thing all together ?

" SQL Server itself will create this constraint, by creating an unique index. In reverse creating unique indices as optimization step only happens when there exists an UNIQUE CONSTRAINT on the table and the UNIQUE INDEX is just a different, but equivalent way of expressing it. In other words: I don't create UNIQUE INDEXes often while optimizing (model change!).
"

very clear too, so this means when creating the unique constraint for the model, the unique index will be create automatically FOR THAT PURPOSE, right?

so in what  case  you experienced before the unique index is not the one you want which create trouble and need query tuning ? but if expressing it different and the SAME also means no need to touch/recreate them, right?


"you mean is number of row is small, then heap is good as SQL server can read the whole table faster than using index, right? and we save diskspace too.

 so if number of row is small, like 1000-2000 rows, cluster index seek will be faster as SQL server just read the whole table into the buffer pool, right?"

may I know your options here?

"The real problem is that programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times"

that's what I am doing now, ask you guys on the right options instead of spending too much time on sth wrong, sorry if I ask too much.

As I keep saying, you can give me a briefing and then show me some GOOD LINK for me to read. this save the time of both of us.

"The reverse process does not exist. You cannot create the logical model from the physical model. "

so must from logical model to physical model .

ScottPletcher,

this books :  http://www.amazon.com/gp/product/1449316409?psc=1&redirect=true&ref_=ox_sc_act_title_6&smid=ATVPDKIKX0DER,  can give a correct guide on what is logical model and physical model, right?

or other suggestion ?
0
ste5anSenior DeveloperCommented:
The book is one of the basic ones. I reommend it often. I don't know good videos, but guess you should search for the basic tasks:

Entity Relation Ship modelling, Object Role modelleing, Normalization.

This are two approaches, which complement the modelling. Normalization is the process of applying the relational theory to get a correct model. It's as Scott already said: These steps can be all performed on paper with a pencil. And it's a good idea to do this.

An existing unique index may not be the trouble, but it may not support (any) query. While it is necessary to enforce the data model, you need to create often indices to support the usage of your table. These additional indices are rarly unique ones.

Heap vs. clustered tables: Imho heap tables are only useful in edge cases.

instead of spending too much time on sth wrong

I have alread explained my approach: There are serveral roles. Depending on the project and or management structures someone needs to fulfill more than one.

Development role: Implementing the requirements in the given architecture.
DBA role: Optimizing and running an implementation. The only requirements a DBA needs to know are those non-functional about performance, scalability.
Architecural role: Designing the system components, it's boundaries, outlining the implementation steps. All according due to the requirements.

In the dev and DBA role it's important the get the sense when an concrete operation violates the role (dev thinking about performance or optimization, dba changing the model). Cause this is always when you need to switch in the higher role (or aske that person).

btw, there is an fourth role when we have incomplete or inconsistent requirements: Project manangment (requirements management). This where we need to get also in conntact with the customer (product owner).

This is all about software development processes. There are lots of books, papers and sites out there.
1
marrowyungSenior Technical architecture (Data)Author Commented:
"The book is one of the basic ones. I reommend it often."

ok then any intermedia and expert one ?  I am sorry I'd like to ask this.

"btw, there is an fourth role when we have incomplete or inconsistent requirements: Project manangment (requirements management). This where we need to get also in conntact with the customer (product owner).":

agree !

"This is all about software development processes"

that's what I am going to look at , but later.

tks for your help.
0
marrowyungSenior Technical architecture (Data)Author Commented:
ste5an and ScottPletcher,

any more suggestion on data modeling learning ? video and books ?
0
marrowyungSenior Technical architecture (Data)Author Commented:
sorry, any update for me ?
0
marrowyungSenior Technical architecture (Data)Author Commented:
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft SQL Server

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.