Avatar of zachvaldez
zachvaldezFlag for United States of America

asked on 

Allow duplicates in a table

I have the following structure of a table.

PartID
DivID
Name
Address
City
State
Postal
Position
TNotes

currently the key is DivId, I also have PartID to set to autocrement by 1.

Id like to allow duplicate DIVID in this table. What is best way to handle it while maintaining the DivID as key.

should I remove the key in DivID and no key to this table?
Microsoft SQL Server 2008

Avatar of undefined
Last Comment
zachvaldez
Avatar of Jim Horn
Jim Horn
Flag of United States of America image

>currently the key is DivId
>Id like to allow duplicate DIVID
The above two statements are mutually exclusive.  If DIVID can contain duplicate values, then it can not be a key.   You can index it with duplicates allowed to improve performance, but no key.

>should I remove the key in DivID and no key to this table?
Yes, or at least just remove the key in DivID.
Avatar of Beartlaoi
Beartlaoi
Flag of United States of America image

If PartID is a unique auto-increment then it should be the primary key.
You can place an index on DivID and anything else to speed up lookups.
Avatar of Jim Horn
Jim Horn
Flag of United States of America image

>I also have PartID to set to autocrement by 1.
If you mean this is an AutoNumber field, then this would be a candidate for a key.
Avatar of zachvaldez
zachvaldez
Flag of United States of America image

ASKER

How about making those 2 fields as keys or another field paired with DivID. Is that possible?
Avatar of Jim Horn
Jim Horn
Flag of United States of America image

Yes, but...
(1)  If PartID is guaranteed to be unique, then it could be a key by itself.
(2)  DivID + any other field(s) would have to guarantee uniqueness in order to be a composite key.
Avatar of Beartlaoi
Beartlaoi
Flag of United States of America image

A table can have only one primary key, which is unique.
A table can have many foreign keys, which do not have to be unique since they reference a unique record in some other table.  Foreign keys have an index linked to them to speed up references.
Do you have another table that has DivID as a primary key?  If not just make DivID an index.
Avatar of zachvaldez
zachvaldez
Flag of United States of America image

ASKER

I have several tables with DIVID as primary key. In fact as previously suggested, I removed DIVID as primary key in the tables taht would allow duplicate DIVIDs. I hope that would be ther best approach and made the autoincrement field as the key instead.
When you say "key", I guess you mean "primary key" and "clustering key"?!

Keep in mind that your primary key and your clustering key do NOT have to be the same.

If the existing table performs well with the DivID as the clustering key, that alone can still be your clustered key if you prefer (SQL allows you to specify a duplicate clustering key, and it "uniquifies" each row's key internally itself).

But primary keys must be unique.  So change the pk, likely to (DivID, PartID).

The clustering key could be the same, or just DivID alone, depending on what's best for that table and its values.
Avatar of Beartlaoi
Beartlaoi
Flag of United States of America image

If you will be looking up records based on DivID then you should create an index containing just DivID.

So if you are doing this:
SELECT * from MyTable WHERE DivID=12345

Then you should have:
CREATE INDEX IX_MyTable_DivID ON MyTable(DivID)

In the Management Studio when you are designing your table, the toolbar button you want has the gold key and the window, usually two buttons to the right of the gold key button.  This form allows you to manipulate all the keys and indexes on your table.
Avatar of zachvaldez
zachvaldez
Flag of United States of America image

ASKER

I made the Pk of the table as PartID.

>>>But primary keys must be unique.  So change the pk, likely to (DivID, PartID).
Does this suggest I make 2 pks for the table?
Avatar of zachvaldez
zachvaldez
Flag of United States of America image

ASKER

The column DivID has a property indicating its indexable  set to 'yes' . Is that suffice as a property setting or shouls I still do the suggestions...

>>>>CREATE INDEX IX_MyTable_DivID ON MyTable(DivID)

In the Management Studio when you are designing your table, the toolbar button you want has the gold key and the window, usually two buttons to the right of the gold key button.  This form allows you to manipulate all the keys and indexes on your table. >>>>>
Avatar of Beartlaoi
Beartlaoi
Flag of United States of America image

You will notice that that property cannot be changed.  It is for your information, it tells you if the column can be part of an index, based on its type.
It does not tell you if the column is in one or more indexes.
You do need to hit the button Manage Indexes and Keys User generated image or from the Table Designer menu choose Indexes/Keys... and then Add a new index.
In the (General) section select the Columns and hit the ... button to pick the columns you want on it, just DivID.
Down in the Identity section change the name to IX_MyTable_DivID
Dont need to change anything else just hit close then save the table.
Avatar of zachvaldez
zachvaldez
Flag of United States of America image

ASKER

Ok, for indexes I'll set the DivID but for primary key , PartID will still be the primary key.
Avatar of zachvaldez
zachvaldez
Flag of United States of America image

ASKER

When I click the manage Indexes and keys button, I see there the PK_partId.
should I add another entry there, which would make 2 keys?
Avatar of Beartlaoi
Beartlaoi
Flag of United States of America image

Add another one which will make one key and one index.
Avatar of zachvaldez
zachvaldez
Flag of United States of America image

ASKER

Got it!
One more thing,
Can you summarize in short what these 2 entries would do or make efficient the table>
just for future references of questioners.

thanks
Avatar of Beartlaoi
Beartlaoi
Flag of United States of America image

PK_partId is the primary key, it is a unique (probably clustered) index that is used when you want to specifically identify exactly one record in the table.  Clustered means that the order of records stored in the database are in the order of this index.  The main power of the clustered index is that its faster when you use > < and BETWEENs operations since it just has to step to the next or previous record.

IX_MyTable_DivID is a normal index.  When you want to find records using a DivID it searches the index rather than going through every row in the table.  The index is organized in a b-tree fashion so it doesn't have to take too many steps to find the row(s) you are looking for.  Think about the index in a book, it allows you to quickly find an important concept in a book, same for an index on a table.
Again, the DivID column should remain your clustered index!

The primary key should be either PartID alone or ( DivID, PartID ), depending on your data requirements.

A unique value like PartID works great as a nonclusted index.  Clustering by DivID will give better performance (assuming the table was reasonably keyed originally).
Avatar of zachvaldez
zachvaldez
Flag of United States of America image

ASKER

Where can I see the setting for clustered or nonclustered? Is this something set manually. What is the default setting? Is this a must thing to do need to watch for,care for,aware of?
SOLUTION
Avatar of Beartlaoi
Beartlaoi
Flag of United States of America image

Blurred text
THIS SOLUTION IS ONLY AVAILABLE TO MEMBERS.
View this solution by signing up for a free trial.
Members can start a 7-Day free trial and enjoy unlimited access to the platform.
See Pricing Options
Start Free Trial
Avatar of zachvaldez
zachvaldez
Flag of United States of America image

ASKER

By setting the PK PartId its Clustered index to "No" and making Divid set clustered to Yes would achieve what favorable advantage and efficiency?
>> Is this a must thing to do need to watch for,care for,aware of? <<
IF you want good performance from tables, yes.  


>> Clustering is only needed when you partition a table. <<
That's 100% false.


Btw, the GUI is just too flaky to do critical tasks like creating indexes.  Especially on (very) large tables, where it may fail every time.

While line commands can be a pain if you're not comfortable with them, you really need to use SQL statements to add keys and indexes, like so:
ALTER TABLE ... ADD CONSTRAINT creates a primary key.
CREATE INDEX creates an index.


So, to get some practice, if your existing table isn't too large, drop the existing keys and use the command below.

Based on the initial q, I'm assuming the PartID wasn't indexed before -- if it wasn't indexed before, it doesn't need to be separately indexed now.

So, for now, let's say all you need is a clustering, primary key that matches your data requirements:


ALTER TABLE dbo.tablename
    ADD CONSTRAINT PK_DivID_PartID --<<-- this is the constraint name
    PRIMARY KEY CLUSTERED ( DivID, PartID )
    WITH ( FILLFACTOR = 100, SORT_IN_TEMPDB = ON )


You can adjust (or remove) FILLFACTOR if you need to.  I suggest always using the other parameter because it can help prevent your database files from over-growing just because of index builds/rebuilds.

You can get rid of the constraint, for example to add it back with different key columns, like so:

ALTER TABLE dbo.tablename
    DROP CONSTRAINT PK_DivID_PartID
Avatar of Beartlaoi
Beartlaoi
Flag of United States of America image

If thats the only column that you are specifying in your WHERE and JOINS then it can help.  Go ahead and do that.
Avatar of zachvaldez
zachvaldez
Flag of United States of America image

ASKER

I have about 25 tables and 6 of these tables require that DivId should be allowed to accept duplicates.Divid in the rest of the tables in the db is the Pk. So is it critical to ser the indices using code?
The clustering on DivID helps if you (almost) always specify it in a WHERE clause; it doesn't have to be the only column.  Quite the opposite, in fact: the more you specify other columns in WHERE clauses, the more useful the clustered index is at restricting the rows that have to be searched.  Indeed, that's why it was made the clustered index.
Avatar of Beartlaoi
Beartlaoi
Flag of United States of America image

The indexes and keys will be automatically used by the query engine optimizer, you dont have to do anything special to get their benefits.  They just need to exist when you do your queries.
If the indexes are already created and they will continue to work you are fine.

But if DivID is defined as a true PK, SQL will not allow duplicate values to be added.  You will need to change the PK to add a column to insure uniqueness to leave it as a PK.  In that case, I would use the code, because typically the statement for one table can be copied, the table name changed, added column name changed if needed, and the next table is done.

The other big advantage of the command is that it forces you to understand what you are doing!  That seems like a disadvantage, but it really isn't.  If you're going to be responsible for creating SQL constraints/indexes, you really need to understand what they are and what they do, each type of them.
Avatar of zachvaldez
zachvaldez
Flag of United States of America image

ASKER

I agree . Knowing the ins and outs separates the men from the boys. Thanks for all your inputs.
Avatar of Beartlaoi
Beartlaoi
Flag of United States of America image

Make sure to learn from the proper source, read this article about clustered indexes.
MSDN - Clustered Index Design Guidelines
ASKER CERTIFIED SOLUTION
THIS SOLUTION IS ONLY AVAILABLE TO MEMBERS.
View this solution by signing up for a free trial.
Members can start a 7-Day free trial and enjoy unlimited access to the platform.
Avatar of zachvaldez
zachvaldez
Flag of United States of America image

ASKER

I beleive I received excellent responses
Microsoft SQL Server 2008
Microsoft SQL Server 2008

Microsoft SQL Server 2008 is a suite of relational database management system (RDBMS) products providing multi-user database access functionality.Component services include integration (SSIS), reporting (SSRS), analysis (SSAS), data quality, master data, T-SQL and performance tuning. Major improvements include the Always On technologies and support for unstructured data types.

50K
Questions
--
Followers
--
Top Experts
Get a personalized solution from industry experts
Ask the experts
Read over 600 more reviews

TRUSTED BY

IBM logoIntel logoMicrosoft logoUbisoft logoSAP logo
Qualcomm logoCitrix Systems logoWorkday logoErnst & Young logo
High performer badgeUsers love us badge
LinkedIn logoFacebook logoX logoInstagram logoTikTok logoYouTube logo