Want to win a PS4? Go Premium and enter to win our High-Tech Treats giveaway. Enter to Win

x
?
Solved

Primary key - autonumber VS text

Posted on 2013-12-20
11
Medium Priority
?
567 Views
Last Modified: 2013-12-22
Can anyone clarify and explain the pros and cons of having a primary key that is text VS a primary key that is auto number –

Also – which could be used in what possible situations – for example joing many to many junction tables –
Example – there is a table called [country] with fields – countryName,countryID like this
countryName | countryID
USA | US
Canada | CN


There is a [customer] table with fields > customerName,customerAddrss like this
customerName | customerAddrss
Rony | Redmond WA
Tony | Bellevue WN

Between country and customer – its many to many so …..

Thank you
0
Comment
Question by:Rayne
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
11 Comments
 
LVL 75

Accepted Solution

by:
DatabaseMX (Joe Anderson - Microsoft MVP, Access and Data Platform) earned 200 total points
ID: 39733128
Disclaimer:
You will be opening a can of worms with this question :-)
===

There is virtually no justification for using anything but the Access Auto Number for the Primary Key of any Table. It's the simplest and most elegant approach with the least amount of headaches.

m
0
 
LVL 76

Assisted Solution

by:GrahamSkan
GrahamSkan earned 200 total points
ID: 39733130
It is largely as matter of convenience as well as fitting with convention.

Autonumbers are self-handling. You can, or course, set a text ID to be a unique index, and that would work just as well, however it might require some extra verification code.
0
 
LVL 75
ID: 39733131
typo:
"This is " >> "There is ..."
0
Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
LVL 75
ID: 39733219
thx. Might as well delete the 'typo' comment then.
0
 
LVL 13

Assisted Solution

by:sameer2010
sameer2010 earned 400 total points
ID: 39733307
If we use autonumber, business logic of de-duplication is to be handled by the programmer. So if it is something like a lookup code (country code in your example) that is standardized, it may not make sense adding auto number. However, if you want to carry that as a key in multiple tables, then perhaps adding a number would help.
On the other hand, if we have a table that stores anything that comes its way e.g. upload table, which does not care about content of column data, but it is to be worried by other apps, autonumber would well go with it (may be with other important column like timestamp, etc.)

Having said this, it all depends on how you want to use the column being auto numbered. As a rule of thumb, if it is not going to be a foreign key anywhere else (or at limited places), then it would be better to go without number.
0
 
LVL 58

Assisted Solution

by:Jim Dettman (Microsoft MVP/ EE MVE)
Jim Dettman (Microsoft MVP/ EE MVE) earned 400 total points
ID: 39733505
<<You will be opening a can of worms with this question :-)>>

 Well you got that part right.

<<There is virtually no justification for using anything but the Access Auto Number for the Primary Key of any Table. It's the simplest and most elegant approach with the least amount of headaches. >>

 That's not entirely true and not always the case.

 To answer the question directly, the difference is one of performance and that's why meaningless keys (an autonumber) and surrogate keys are used.  But the very fact that an autonumber is meaningless does yield problems.

 Read through the following and see if it all makes sense:

http://www.experts-exchange.com/Microsoft/Development/MS_Access/A_2041-The-great-PK-debate-Natural-Keys-vs-Surrogates-again.html

Jim.
0
 
LVL 31

Assisted Solution

by:hnasr
hnasr earned 400 total points
ID: 39733511
Adding:
Auto-number VS text depends on context.

Pros: Allows typing, thoughts to be analyzed later.
1 my city
2 mu city
3 new city
4 remove spaces in entries
5 I have to clean my data to allow for proper querying.

Cons: Accepts duplicate concept, but different expressions.
1 tomatos
2 tomat
3 tomato
4 ketcup

But you can use auto-number, and set unique indexes for some fields to check for duplication.

If you are counting visitors, auto-number goes fine. If you want to spot repeating visitors, unique index the field which specifies the relevant visitor.
0
 
LVL 85

Assisted Solution

by:Scott McDaniel (Microsoft Access MVP - EE MVE )
Scott McDaniel (Microsoft Access MVP - EE MVE ) earned 400 total points
ID: 39733512
You're asking about "Natural" versus "Synthetic" Primary keys. As Databasemx said, you'll have quite a few differing opinions on this one!!

DISCLAIMER: This is really more of a personal choice than anything else, and the opinions stated below are my own. Take them as you wish, but realize there are plenty of people out there who have much different opinions on this, and who are very passionate about this topic. I'm not that passionate about it ( :) ), so I'm not going to get involved in the Natural vs Surrogate debate.

In general, the Primary Key should be a value that can be used to uniquely identify a specific record. The value can be a single field, or could be comprised of multiple fields. The PK value should be one that would not change, but that is not an absolute necessity, since you could certainly just update all Child records with those changes.

A "Natural" key is one that has meaning to the data. A good example of that would be a US Social Security Number, an Email Address, a TelephoneNumber, ect. Those bits of data would be unique to a specific person, and you could easily identify a person based on that. For example, if someone phones and asks about the status of their Order, and you've used the Email address as the PR for your Customer table, then you could ask the caller for their Email Address and you'd be able to find all Orders related to that Email address. You could not do that directly with a Surrogate key without joining to the Customer table with the Surrogate key.

A bad example of that would be data which may easily change, like Telephone_Number or EamilAddress. That can change for various reasons, and therefore would not be a great choice for a Primary Key in many cases.

You can also have a Natural Key that is comprised of multiple columns, like First_Name+Last_Name+Phone_Number. However, in my opinion this would also be a poor choice since (a) there are likely to be more than one person with exactly the same First_Name+Last_Name+Phone_Number, and (b) there are likely to be persons with NO Phone_Number and (c) those values could easily change.

The disadvantage to the Natural PK is that you must also ensure that your field or fields are indeed unique to the data, and you must also store all of the values in the related tables, which can increase the size of your database (as well as the size of the Indexes of your database). You must also be very, very sure that the Natural key you choose is the appropriate one for you business rules. Also, if your business rules change, you could end up with some very invasive changes to the database, and the use of Natural Keys could become difficult to manage. For example, if you start out only working with US-based customers, and you use the SSN field as the Primary Key, and you then expand your business to also include non-US customers, your SSN field becomes somewhat useless for those non-US customers. If you instead had used a Surrogate Key, you could simply make the SSN field to be not required and continue on as before.

The advantage to using Natural keys is that it uniquely identifies a row of data with no other information required, and it mirrors the "real world" nature of that data - in other words, it mimics exactly the way humans interact with that specific object. For example, if I were trying to identify a specific person in my real-world office, I could ask them for a copy of their Social Security card, which would uniquely identify them in a US-based system. If I used Surrogate keys, I could not ask them for their "Surrogate Key Number", since they'd have no idea what that might be. It also makes searches some easier, since you would not need to join other tables to that "child" table in order to find records associated with the "parent" data. For example, if I'm storing the Social_Security_Number in a "child" table containing Order information, and I need to retrieve all Orders for a specific Person, I can just search on the Order_Header.Customer_SSN field to find those Orders. If I'm instead used a Surrogate key, I'd have to Join the Order_Header table to the Customer table, and then Search on the Customer.Social_Security_Number field. Over time, as the database grows, this could result in poor performance.

A "Synthetic" or "Surrogate" Primary Key is one that has no meaning to the data (i.e. an AutoNumber field). The only purpose of that Synthetic key is to provide a handle to that specific row of data. The data actually contained in the row is what makes it unique, and the AutoNumber value is of no meaning. If you use Surrogate keys, you should also include at least one other Unique Index that is comprised of one or more Fields in the table, and that Unique Index should distinctly define a specific record. For example, in an Order_Header table, perhaps the "uniqueness" is defined by the Customer_ID+Order_Date+ShipTo_Address+PO_Number. If so, then you'd create a multi-field Index with those 4 fields, along with a PK AutoNumber field, and the multi-field index would insure data uniqueness, while the PK field would be used when relating that record to other tables (such as the Order_Details table).

The advantage to using Surrogate keys is that your related tables are not impacted by changes to the parent data's "uniqueness data". For example, if I'm using a Natural_Key defined as Customer_ID+Order_Date+ShipTo_Address+PO_Number, and I find that I've made an error when entering the PO Number for a Customer's Order, I can just change that PO value in the Order_Header, and any related records in Order_Details would immediately pick up those changes without the need to make changes in the Order_Details table. Had I instead used those 4 fields as my Primary Key in the Order_Header table, then I'd have to update data in (a) the Order_Header table and (b) the Order_Details table. If I somehow forgot to make changes in the Order_Details table when changing those values in Order_Header, then I'd end up with "orphaned" records in Order_Details.

The disadvantage to using Surrogate keys is that you must add new columns and indexes to tables, which can increase the size and "bloat" of your database, and which could possibly impact performance. Also, as mentioned earlier, you must Join the Parent table in order to search on the unique values of the Parent table. IMO this is a valid trade-off, especially with modern machines
0
 
LVL 31

Expert Comment

by:hnasr
ID: 39733525
Adding:
If you read a number or a text and visualize it in your head, then ir is not a proper key for computer software. It is a primary key for human manual record keeping.

A key is not meant to be interpreted any furthur in our heads, the computer is supposed to relief us from that extra burden.

Meaningful data comes into other table fields.

Telephone numbers, for example, are not used as primary keys, in general, they are used as extra fields in table for search purposed to find the proper key for that record. They, as LSMConsulting mentioned, may change.

As LSMConsulting started, a can is opened :).
Ending: If you need an auto-number key use it for the main table. The related foreign keys should be numeric.
0
 

Author Closing Comment

by:Rayne
ID: 39734195
Thank you All,

This is really helpful.

Thank you for also correcting my English, excuse my french
0

Featured Post

NEW Veeam Agent for Microsoft Windows

Backup and recover physical and cloud-based servers and workstations, as well as endpoint devices that belong to remote users. Avoid downtime and data loss quickly and easily for Windows-based physical or public cloud-based workloads!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

If you need a simple but flexible process for maintaining an audit trail of who created, edited, or deleted data from a table, or multiple tables, and you can do all of your work from within a form, this simple Audit Log will work for you.
An alternative to the "For XML" way of pivoting and concatenating result sets into strings, and an easy introduction to "common table expressions" (CTEs). Being someone who is always looking for alternatives to "work your data", I came across this …
The viewer will learn how to simulate a series of sales calls dependent on a single skill level and learn how to simulate a series of sales calls dependent on two skill levels. Simulating Independent Sales Calls: Enter .75 into cell C2 – “skill leve…
With Secure Portal Encryption, the recipient is sent a link to their email address directing them to the email laundry delivery page. From there, the recipient will be required to enter a user name and password to enter the page. Once the recipient …

636 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question