Solved

No primary key for table benefits

Posted on 2013-06-22
28
321 Views
Last Modified: 2013-07-01
Hey guys, I've for 3 tables.

1) code definition - primary key = code ID
2) activity - primary key = activity ID
3) code storage - no primary key

Each activity has 1 or more codes and these codes are stored in the code storage field. As such, code storage has two foreign keys for activity is and code id. From my knowledge of primary keys and indexing, I don't need a primary key for code storage as no records need to be unique and also Access forces an index on this artificially created code storage ID field.

Questions for your expert opinion!! = ))
1) can I go without a primary key here?
2) I think no primary key and indexing on activity id is actually the best cause I'll be querying the table a lot on activity id - it's a trade off the reading and writing but I guess I can do some real world testing moving forward.
0
Comment
Question by:developingprogrammer
  • 15
  • 6
  • 5
  • +2
28 Comments
 
LVL 84
ID: 39267923
also Access forces an index on this artificially created code storage ID field.
We have no idea what you mean by the above, since we don't have any idea what "code storage ID field" is or what it's supposed to do. Access doesn't force an index unless you use that field in a relationship (i.e. foreign key, basically), or if you set the field as the Primary Key.

IMO every permanent table should have a Primary Key, regardless of the intended use. Unless you're talking about hundreds of thousands of data rows (or more) I doubt you'll see much performance troubles, but the only way to tell this is by testing.
0
 
LVL 57
ID: 39267940
<< From my knowledge of primary keys and indexing, I don't need a primary key for code storage as no records need to be unique and also Access forces an index on this artificially created code storage ID field. >>

  If you've done the table design right, every table will have a primary key.

Jim.
0
 
LVL 57
ID: 39267942
You also need to be clear about what you mean when you ask about "primary keys".  There are two different contexts where "primary key" is used and they are not one in the same.

You may want to give this a read through:

http://www.experts-exchange.com/Microsoft/Development/MS_Access/A_2041-The-great-PK-debate-Natural-Keys-vs-Surrogates-again.html

Jim.
0
 
LVL 22

Expert Comment

by:Thomasian
ID: 39267957
>> I don't need a primary key for code storage as no records need to be unique

Is it possible to have 2 records with the same values for "code id" and "activity id" on the "code storage" table?

If that is possible, can you share more info about your table structure?

If not, you can use a composite key (i.e. both "code id" and "activity id") as the primary key.
0
 

Author Comment

by:developingprogrammer
ID: 39268005
Thanks Thomasian - yup you got it!! It is possible for me to have 2 tupples which are exactly the same.

I read Jim's article ok PK (thanks once again Jim, fantastic fantastic article!!) and on hindsight I was thinking that I should make the table as

1) Activity ID
2) Code ID
3) Number of times code ID appeared.

So that way I won't have any duplicate tupples.

However I thought that because 99.99% of records will only have 1 code so adding an additional field takes up memory - even just a byte a row adds up right?

What do y'all think guys? Am most happy to hear your guidance!! = ))
0
 
LVL 24

Expert Comment

by:Bitsqueezer
ID: 39268028
Hi,

because I thought it's the better place to write it there I added a comment to Jim's very good article about PKs he posted above.

Maybe you want to read that also and then think about PKs again.

And to add something: Nowadays the last thing you should think about is "how much memory takes it to save this or that", the only exception is large binary or text object fields. Memory was never as cheap as today.

Cheers,

Christian
0
 

Author Comment

by:developingprogrammer
ID: 39268061
thanks Christian!

i read your article and coincidentally enough before i started on my new project i went through a database optimisation course. in it, i explored the concept of clustered and non-clustered index, which once comes in the left most field should it be a composite index etc. so yes, definitely what you're saying is very important and something i'm considering as well. i will be doing real scenario tests as well with and without indexing when my DB is up and running.

yup also heard about the memory saving issue and you're absolutely right. hrmm i'm just erring on the side of caution now trying to train myself in memory optimisation too - but you are totally right and i will definitely bear in mind what you've said! = )

could you advise me on this - in my below picture, what if in an activity i can have 2 codes? then what should my PK be in the CodeStorage table? i want to use natural keys and i want to eschew from "meaningless keys" / "surrogate keys" as much as possible

jim perhaps you could share your thoughts as well cause your article has influenced my design quite a bit! (though i'm sure i'll make learning mistakes as i go along!)

Relationships
0
 
LVL 57

Assisted Solution

by:Jim Dettman (Microsoft MVP/ EE MVE)
Jim Dettman (Microsoft MVP/ EE MVE) earned 200 total points
ID: 39268098
One of the things you need to understand to now that you've read the article on keys is that from a performance standpoint, any database engine will have more options in performing operations if it has at least one unique index on a table.

In the case of a transaction table, that may take the form an autonumber (Identity column) labeled as the "primary key".

 Your code definition table looks like it needs adjustment.  In the other thread, I made the comment about repeating fields.  Let me ask you a few questions:

1. What if something needed four categories?

2. What if I wanted to count the number of items in a specific category?

3. What if I wanted to make a list of all the categories used?

  With your current design, you'd find answering those questions difficult because you'd need to look at every record and check the three individual fields that you have.

 What you want is this:

tblCodeDefinition
CodeID
Type
MinutesAwarded
Complexity

tblCodeDefinitionCategories
CodeID - PK1
CatID - PK2, also Foreign key to tblCategories
WorkFlowSequence

tblCategories
CatID - PK
Description

  Now the problem with question #1 goes away, because you simply need to add more records to tblCodeDefinitionCategories if a Code Definition has more then three.   Now it can have any number.

 Problem #2 goes away and becomes more efficent because I can now simply count records, rather then having to go into each record and look at seperate fields.

  Keep in mind that SQL (query language) is designed to work with sets of records, not fields.

  Problem #3 also becomes quite easy as I can simply look at the CatID field in tblCodeDefinitionCategories and return a list where each category is listed once (a SELECT DISTINCT).

Jim.
0
 
LVL 24

Expert Comment

by:Bitsqueezer
ID: 39268107
Hi,

that's exactly what I meant in my comment to Jim's article: I personally would ALWAYS add an autonumber PK field to th m:n table, here "CodeStorage". Using "CodeID" and "ActivityID" as meaningful keys is OK in case where it is used in communication between people also, like in the example of the CustomerID. Otherwise maybe there are other attributes which better identify an activity or code, I cannot say because I don't know what the background of these tables is.

But I would definitely NOT create 3 pairs of fields in the "CodeDefinition" table. Maybe you are sure today that you always will only have 3 categories for each code definition - but maybe in 2 years your boss comes to you and says: "Ah, we now need 5 categories, could you please add them?" - then yor are in big trouble because you will have to heavily change your frontend and table design to make that possible. So here you need an additional table "CodeDefinitionCategories" (or whatever you like) which consists of the fields "CodeID" (as foreign key), "Cat" and "CatWorkflowSequence". That's of course more work to handle in frontend design, but in the end when your boss asks you to add two categories you simply can say "Do it on your own, you can add as many categories as you want at any time and you can use only the needed number of categories in each code definition, from zero to infinite!".

The same in the "ActivityStorage" table: You should always use ONE table for ALL kind of persons for any purpose. The same table can be filled with customers, suppliers and so on. This table consists of columns like "PersonID", "Account", "PersonName" and so on and in the "ActivityStorage" table you would only have to add one field: PersonID. This can also be used for further fields like "CreatedByWhom", "AssignedToInitially": No text fields, ID fields which points to the people table. THAT is the place where you should always look on to save memory (independent of how cheap it is) because this is what makes the difference in performance and handling later.

Cheers,

Christian
0
 

Author Comment

by:developingprogrammer
ID: 39268108
Whao super cool ok let me go digest what you just said Jim and also look through that YouTube video. Get back to you soon!! Thanks!! = ))
0
 

Author Comment

by:developingprogrammer
ID: 39268119
Great!! Thanks Christian too!! Let me get back to y'all when I've digested this! Almost bedtime here!! = ))
0
 

Author Comment

by:developingprogrammer
ID: 39269138
hey guys, thanks for your comments. i've read through them and digested them and - yup yall are absolutely 100% correct that i should further normalise my CodeDefinition table, but due to some business logic and other factors i've actually denormalised it back to this design.

but definitely what yall are saying is very important and i'm applying it to all the other parts of the DB. this is kinda my second go at rewriting this whole database so i'm picking out the points where i could have done better also which suits my weaknesses so to speak.

question!
1) what should the natural key of my CodeStorage be?

what i think is i need to add in 1 more field which says "How Many Time Code Appears" - of course in a more concise wording.

it's something like one order could have 2 products of the same kind. so in the OrderProduct table i should have OrderID, ProductID, ProductQuantity.

is this the right way to go about it? can i do without the CodeQuantity / ProductQuantity field? Please guide me guys!! thanks so much for you time and advice!! = ))

(btw my relationship table looks something like this now - WIP still, very inchoate!)WIP
0
 

Author Comment

by:developingprogrammer
ID: 39269200
yea guys the more i think about it i think the right way to go is to add in the quantity field = )
0
 

Author Comment

by:developingprogrammer
ID: 39269201
oh and sometimes i denormalise things for readability sake too = ) that's why my team members is going by windowsID not by an autonumber PK - which i used in the 1st version of this programme
0
Backup Your Microsoft Windows Server®

Backup all your Microsoft Windows Server – on-premises, in remote locations, in private and hybrid clouds. Your entire Windows Server will be backed up in one easy step with patented, block-level disk imaging. We achieve RTOs (recovery time objectives) as low as 15 seconds.

 
LVL 24

Accepted Solution

by:
Bitsqueezer earned 300 total points
ID: 39269398
Hi,

it's a little bit strange that you always try to save memory by creating fields with less bytes and on the other hand you supply fields like a Windows ID as key where the string is saved in both tables and waste memory...
But of course you can do that, a Windows ID is normally unique in a domain. But also you maybe want to think about that it COULD be possible that your IT department decides to use the same Windows ID again for another person later if someone leaves the company and another one enters later with a similar or equal name and then your ID isn't unique anymore because then you refer to two different persons in your database with the same ID (that's one of these moments in design of a database where I always use an autonumber PK because it only SEEMS to be a good idea to use a natural key here and then, very much later, you see that it was not and now you must think about a redesign which in such cases is nearly impossible in a running database).

I've attached a little demo database where I made another table model based on your's to show you what I meant with the categories.
Your table model makes me believe that you maybe want to list three completely different kind of categories which have no intersection between each other. So I've extended the model by adding a category type table which you can use to distinguish between three (or more) category lists. If all these categories are from the same list you wouldn't need that. Please notice that in this case (where you are using different category lists) the m:n table needs an additional field to separate the chosen lists. That gives you the possibility to create a "NoDuplicate" index (see the index list) which avoids that you create more than one category for one CodeDefinitionFull entry from one category list.

Next is the assignment of addresses or persons to your ActivityStorage table. The really needed fields I demonstrated in the attached demo, there is no need to save for example an account number there or a customer name. You also don't need an additional table for team members with details about name and so on. I've found a really good description of an address and person model which can be applied to the most databases and also would work here:

Address Database Design

(Don't forget to read the text under the data model in detail, it's important that you need some special records in some tables to make that work.)

With this model you should be able to maintain team members in the same way as customers or suppliers or any other category of peoples/addresses/companies with a maximum flexibility.

You should also read the second page:

Database Expansion

of this site which shows how to implement the model in a real world database which also contains a user account table like in your model.

I didn't integrate that in my demo, that would take more time than I have for this...:-)
But I'm sure you got the idea and can integrate that in your database on your own.


I'm not really sure what you mean with your quantity fields. If it is to determine how many records of one type you have then of course you would not integrate such a field into your table because any data which can be calculated should be calculated and not saved (exception is when the backend is a database server like SQL Server which offers the possibility to create calculated columns where the result will be saved, this can help to get a better performance in queries later).
If you mean quantity in the meaning of a classic order ("I want to buy product X three times") then of course you would need a quantity field where you can enter the number of wanted products by the customer.

Cheers,

Christian
DemoDB.zip
0
 

Author Comment

by:developingprogrammer
ID: 39269564
whao Christian!! i really really really appreciate all the effort you're going through just to help me!! let me read through your post thoroughly to do justice to the extent you're going to help me learn = )) when i've digested, implemented and tested all the concepts, let me get back to you!! thanks so much Christian!! = )))
0
 

Author Comment

by:developingprogrammer
ID: 39270259
okie Christian, i read through everything you said quite a few times and went through both webpages as well. great wealth of knowledge!! i typed out quite a lengthy reply but i thought better show through actions than words ha = )

the only thing i'm not sure is:

1) if i define a byte field and leave it blank, when i compact and repair the database will it still occupy 1 byte or because it's null it won't occupy 1 byte? this impacts my CodeQuanity field - whether to add it in or not. cause some times will need this some times won't = )

and as you can see from my BE DB i've adopted quite a few of your best practices!! = ))
DMS-Cerebro---SSB---Recon.mdb
0
 
LVL 57
ID: 39270935
<<1) if i define a byte field and leave it blank, when i compact and repair the database will it still occupy 1 byte or because it's null it won't occupy 1 byte? this impacts my CodeQuanity field - whether to add it in or not. cause some times will need this some times won't = )
>>

  Your worried too much about space.    

  With the size tables that a JET DB can support, this is not a concern.   If you were in another database system and had a table with millions of rows (like 10 or greater), then consideration for field sizes would start to come into play.

 But to answer your question; all the fields in a JET DB except, memo, OLE's, and text fields are fixed sizes.  Doesn't matter what value is there or if it's a null.

Jim.
0
 

Author Comment

by:developingprogrammer
ID: 39270979
cool thanks Jim!! = )
0
 
LVL 57
ID: 39271104
What you should also understand is that one of the reasons that field sizes in JET don't matter to a certain extent is that it stores records on fixed page size and it won't let a record span a page.

While that scheme makes database operations simpler for the DBMS system, it also means that trying to control size is not as workable as one might think.

It doesn't mean you should go crazy with field sizes and simply use a double for every number, but by the same token if you had a table with eight longs vs eight byte fields, your not going to see much difference in the database size.

 Having a real need to use a byte field is harkening back to the days when memory and disk space was measured in MB rather then GB.   The cases where you would really need to now a days are few and far between.

Jim.
0
 

Author Comment

by:developingprogrammer
ID: 39272704
yup Jim i completely agree with you. i guess all the tutorials i've been reading from the start have been emphasising on this a lot.

however one of the constraints is a slow network even though memory may be available.

sorry Jim could you expand a little bit on the records being stored on a page?

1) so is it one record one page?
2) and is the page size is a standard for ALL records or does it look at the fields and say - ok this table needs X page size as compared to Y page size for the other table?

thanks in advance Jim!! = )
0
 
LVL 24

Expert Comment

by:Bitsqueezer
ID: 39272808
Hi,

I've sorted the chaotic data model a little bit in the attachment. You should always try to create a data model without crossing lines as good as possible, makes it more readable.

Although your model looks a little bit better now I would say there are significant errors in the relations because multiple tables are linked together on multiple ways which mostly is not needed. But I cannot say more because I really cannot see what the target or sense of this database should be.
Moreover I think it would get far away from your initial question to discuss your data model in this thread.

For your other questions about the background of what is saved in pages I think Jim can answer that better than me.

Cheers,

Christian
DMS-Cerebro---SSB---Recon.zip
0
 
LVL 57
ID: 39272994
<<sorry Jim could you expand a little bit on the records being stored on a page? >>

  We are talking about using JET (data is in a MDB or ACCDB file).

<<1) so is it one record one page? >>

  No, it may be multiple records per page, but a record (except for OLE and Memo fields) must fit on one page.

  So at a minimum, your always going to be fetching one page of data regardless of the record size.

<<2) and is the page size is a standard for ALL records or does it look at the fields and say - ok this table needs X page size as compared to Y page size for the other table?>>

 Same page size 4096 bytes with text stored as unicode (two bytes per character).

Jim.
0
 

Author Comment

by:developingprogrammer
ID: 39273549
Whao super cool man Jim!

Let me get back to you soon Christian!! = )
0
 

Author Comment

by:developingprogrammer
ID: 39273618
hi christian, ok i've gone through the DB you've sent me and i've reorganised my relationship layout. thanks so much for all your help!!!! i really really really appreciate all the time you've spent helping me out!!

i am very interested in what you said about the data model having significant errors due to multiple tables being joined together in multiple ways, so i'm creating another question and referencing it here for one and all to see (i think this thread is really really good in terms of all the fantastic info Jim and you shared! = )   )

the new question is: data modelling best practices

guys once again, thank you so much for all your help!!!! yall are superb!!! = ))
0
 
LVL 24

Expert Comment

by:Bitsqueezer
ID: 39273763
No problem, you're welcome...:-)
0
 

Author Comment

by:developingprogrammer
ID: 39273921
= ))
0
 

Author Comment

by:developingprogrammer
ID: 39289222
bitsqueezer i'm once again desperately working through night and day to keep my  job, but i wanna say something - you were absolutely right about always adding a autonumber PK. it has come in critically so many times. thanks for your wonderful help and spot on advice once again = )
0

Featured Post

Get up to 2TB FREE CLOUD per backup license!

An exclusive Black Friday offer just for Expert Exchange audience! Buy any of our top-rated backup solutions & get up to 2TB free cloud per system! Perform local & cloud backup in the same step, and restore instantly—anytime, anywhere. Grab this deal now before it disappears!

Join & Write a Comment

In the article entitled Working with Objects – Part 1 (http://www.experts-exchange.com/Microsoft/Development/MS_Access/A_4942-Working-with-Objects-Part-1.html), you learned the basics of working with objects, properties, methods, and events. In Work…
Regardless of which version on MS Access you are using, one of the harder data-entry forms to create is one where most data from previous entries needs to be appended to new records, especially when there are numerous fields and records involved.  W…
Get people started with the utilization of class modules. Class modules can be a powerful tool in Microsoft Access. They allow you to create self-contained objects that encapsulate functionality. They can easily hide the complexity of a process from…
Polish reports in Access so they look terrific. Take yourself to another level. Equations, Back Color, Alternate Back Color. Write easy VBA Code. Tighten space to use less pages. Launch report from a menu, considering criteria only when it is filled…

746 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

14 Experts available now in Live!

Get 1:1 Help Now