asked on

pros/cons of not using ID's for foreign keys

For a while I have been a strong devote to using unique, autoincrementing ID fields for foreign key relationships.

I have come across some instances where using a UNIQUE code may work as well or even better. For example:

state
--------
code (PK)
name

city
--------
name
state_code(FK)

where state.code is the states abbreviation (NY = New York). If I am just hitting the city table I can get the state abbreviation with one table hit. If I did this with ID's I would always have to join both. Alternatively, if I need the state.name, I can link via the state.code

So my question is, what are the different philosophies of ID's. I came across a post that said _never_ link through an ID, since that would limit you ever having disjointed DB servers which periodically sync.

Thoughts, opinions -- thanks

Guy Hengel [angelIII / a3]

Actually, when in a multi-server environment, you are better off with GUID instead of autoincriment. Although, GUID need more space than Autoincriment integer, they produce values that should be globally unique!

Now, for such small tables as states, where the list won't be that long (and won't change that much), such an overhead isn"t really necessary. Also, a state_code field of NY will be much easier to "read" than a long GUID...

Arji

I think the biggest arguement would be that all tables should have PKs regardless. I know SQL server hates it when you don't have a PK in a table and Access slows down terribly. Also, using integers as PK/FK is much faster because the server doesn't have to evaluate a text field to something useful in searching.

I've always structured my tables like this but I guess it's a matter of preference:

state
--------
StateID(PK) long integer
code
name

city
--------
CityID(PK) long integer
name
StateID(FK)

Also, this method tends to eliminate problems arising from someone not entering a valid state code.

Natchiket

Primary keys where the field (or fields) are inherently part of the data as in the situation above are known as Natural Keys or Intelligent Keys. Keys which are artificial as in autoincrement, GUID etc. are known as surrogate keys. I expect the debate as to which is better has been raging since the dawn of relational databases. My own philosophy is if a Natural Key obviously stands out as being appropriate then use it, otherwise go for a surrogate.

ASKER CERTIFIED SOLUTION

RHansonLinea

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

emilewong

Pros of using code in the column as primary code:-
1. You can eliminate a ID column in the table which reduce database size.
2. You dont have to use join when you are only want the abbreivation not the full name.
(e.g) when you retreive data into a dropdown list which only show States Code.
3. Code is more meaningful in SQL statement.
4. Easier to write in Select state using where(Like 'N%') clause. (NY, NJ) not two separate ID "IN [1001, 1004]"
5. Better Performance when using index. because most query retrieve meaningful data.
Select customerName from Customers where States like 'N%' and the data is stored
in sequence when using index on States. When using ID, NY and NJ maybe a apart
from each others.

But using Code as the primary only suitable for small size tables with static data which only have primary key with a single column.

Pros using ID.

1. When using multicolumns primary key (FName , LName). you cannot reference to it with the code.
2. It is hard for you to create manual code for large tables like 200000 products table.
3. Cannot guarantee Unique Code in large tables. ID can be assigned automatically.
4. Easier to identify the Data in term of time. Let say ID 1000-2000 should be very long ago because the last ID is 50000.

There should be more, we can use both Code and ID column for reference in FK to get a better result of database performance.