Access 2007: Table Relationships Normalization Assistance:

Hello, several members of this site have been helping me try to make the attached form Normalized.  I've made many changes, but I'm still not there yet.  In particular, I'm not sure I understand how relationships should be created between the tables.  Before, I simply had several 'large, non-normalized tables' which were related to one another based on the unique identifier [Call Number].
Now, I have many smaller, hopefully normal, tables.  

The relationships should be as follows:  
Each table, with the exception of tables which exist only to populate combo/list boxes and do not have [call number], is related by the identifier call number.  
The table tblPtCharacteristics will be modified first, and after this, patients will move onto tblGroup and describe subsets of the population in tblPtCharacteristics.
TblGroup is related to the subsequent tables of tblRaceDesc, tblDisease, and  tblStageCatDesc.
through the [Group] field.
TblDiseaseDesc is also related to tblStageCatDesc because it will describe a specific [Group], and [Disease].
I'm attaching an image of how these are laid out now in the relationships window (I am very new to this part so any help would be greatly appreciated.)

In addition to these relationships i have 12 'unrelated' forms that are used to populate some of the fields in this form via lookup values in table design.  How should these be handled in normalization?

As always thanks again for all the help and advice,
Bevos relationships EE-ABS-Example.accdb
Who is Participating?
David L. HansenProgrammer AnalystCommented:
Ok, I'll honestly try my best to be brief and to the point, but what needs to be explained might take a few paragraphs...please read all of it carefully!

For the moment forget about normalization, one-to-many relationships, etc.  First, let me just give you a useful perspective on database design from a broader point of view.

Your database design begins with describing (in normal English) what it is you are doing (pretend you are describing your project to your next door neighbor).  The post you made describing medical books and treatment groups was quite good.  When you take that description, on paper, and mark the nouns you'll get a pretty good idea of what your tables need to be.  The idea is you want to separate those nouns into self-contained tables.  Books really deserves it's own table; having specific medications mixed into the books table just doesn't make sense.  To start with just start with the first four or five major nouns and think about how they relate.  One book (a single row entry) could tie to several treatment groups right?  Sure it could.  That is a one-to-many relationship.  But wait a second, this is important....don't write that relationship down so fast.  Let's consider a single row in the Treatment_Groups table; will a given row in that table relate ONLY to one row in the books table?  If so, then you've got a one-to-many relationship, if not, you need a bridge table (that's what I call them).  It's just a table that acts as a middleman.  Books and Treatment_Groups would both have one-to-many relationships to that bridge table (think students and classes).  Do this with 4 or 5 major nouns, don't bite off more than that.  Then consider adding more after you've described those relationships.  And you don't have to use all the nouns, common sense will guide you, and normalization principles.

Normalization.  Here it is.  When normalizing you don't think up and down through the table data, instead you think horizontally through the table columns (attributes is what they are called among database folks).  The idea is if you have only attributes/columns that describe books and nothing else.  Also, you don't let one column be a derivation of another (for example "age" in a Student is derived from their changes base on another column) that is to be avoided.

Primary keys, these are the identifiers that uniquely define a row in a table.  Now you could just auto-number every table with an ID.  That sort of defeats the purpose however.  Sometimes you need to use an auto-numbered ID, and that's ok, be sure to ask yourself first though "Is there a natural primary key I can use".  For example, people - social security number; cars - VIN number, books - ISBN.  A year or so down the row you may be glad you used a natural primary key when your database needs to expand and you find yourself relating three for four new tables to what you've already got.  Quiz...what would a bridge table have for it's primary key?  Well, you'd just use the two primary keys from the two other tables (remember its just a middleman).  So for students and classes you'd end up with a Class_Roster table where StudentID and ClassID make up the primary key (this is called a concatenated key...two columns together making up the primary key).

See how far that takes you...good luck.

If tables are unrelated, then by definition, you could consider them normalized or that it is not possible to normalize them.
David L. HansenProgrammer AnalystCommented:
Talk to me a bit about your application/project.  Medical study?  Historical archiving?  Just help me understand the big big broad picture (all technical points layed aside).  This is important before I can help you normalize.
Network Scalability - Handle Complex Environments

Monitor your entire network from a single platform. Free 30 Day Trial Now!

I think a good start would be to tell us what exactly is gonna be stored in each table.
I don't think the idea of putting the same identifier (CallNumber) in each table is a good one... Each table should have it's own identifying column (or columns), i.e. if the table tblGroup is supposed to store only group data, then you should have a GroupID column that stores the identifier of each group. This applies to all the other tables you mentioned that were used to populate data into dropdowns and so.

After that, you'll have to start relating the tables based on your needs... Like I don't see a direct relationship between characteristics and group (although I don't completely know your needs).

How are you going to handle the tables that are used to populate comboboxes? Well you'll have to store the identifier of that table in the table you will be using it. Let's say you wanna use a group, you store the groupID (like in the screenshot) into the table and the group information is stored in a different one.

There are quite some things we could do to the database, but if we don't know exactly what you want, it will be hard for us to start.
BevosAuthor Commented:
Hello everyone thanks for your timely and informative responses.  To address a question raised by some people "What is the purpose of this database?"  

The purpose is to track data from academic literature on medical conditions (primarily cancer) for subsequent analysis.

So the unit of measurement in this is the [Call Number] which corresponds uniquely to each piece of academic literature.

The other variables listed in this table describe characteristics of the literature pertaining to the patient population.  So within each article there are several groups, defined by [Group].

This could be treatment/control or other distinguishing aspects of the study (this bunch of patients got 1.5g of medication x, etc...) and these can vary extensively by study.

The other tables contain information about each group, so for example tblRaceDesc will have a description of each race within a group (i.e. in [call number] 00001 we have a treatment [group] composed of 10 Hispanic persons and 10 White persons).

I've tried to remove all redundant data to make the forms normal and [Call Number] is how each table will be related.  The goal after all of this is done is to control tables based on the sum of article information collected.  For example you should report the number of persons with a [disease] being stage IV in a group of studies etc...  The hope is that if data is collected in this way it can make it easier to write up a paper.  

Please let me know if I should try to explain more, or did a poor job explaining.

Any advice is helpful and thanks again, Bevo

Jeffrey CoachmanMIS LiasonCommented:
First create the tables *With* primary keys.
(Or is the issue here how to identify the key fields?)

Do not attempt to relate these tables until you are certain that they are properly formed.
This way you can determine the "Type" of relationship (One to many, Many to many, On to one, ...etc)
BevosAuthor Commented:
Hi Boag, I think the relationship between each of the tables and  tblptCharacteristics will be 'one to many.'  Each of these related tables can have many rows describing different groups and characteristics of groups for each [call number] but there will only be 1 entry in the master tblPtCharacteristics.

On the subject of primary key, I didn't know if I could have a primary key in each table as [Call Number] because in the other tables, for example tblDiseaseDesc, there might be many of the same [Call Number].  Because of this, should I make another field, such as an auto number field, to be the primary key for these tables?  Please let me know what would be good practice.

Thank you,
I think you should add an additional column to each of the tables you mention. For example, the table tblGroup should have a unique identifier (it could be an autonumber field), and so on with each of the tables. That's what boag was saying when he said "create the tables with primary keys".

On the other hand you should really consider if the "CALL NUMBER" is going to be present in each and every one of your tables. Like, if you say the tblRaceDesc is related to each group, then there's no reason for you to make a relationship with tblPtCharacteristics. You only need to relate tblPtCharacteristics with tblGroup and tblGroup with tblRaceDesc. If you do this, you will be able to know the racedesc data only by knowing the tblPtCharacteristics, because of the relationship... Did I explain myself pretty well???

Let me know if I need to be a bit more clear with my explanation.. What I mean is that you still need to work on designing your tables and primary key fields, in order to find the relationships between your tables.
David L. HansenProgrammer AnalystCommented:
Here is an additional resource.  Don't feel like you have to use it, but if you want to delve further into any of the points made above, dig in!
BevosAuthor Commented:
Thanks for the great advice everyone! I've had exams so I haven't been able to look at this post until today.  I spent some time thinking about what was said and redesigned my tables.  Call Number is now only present in the tblGroup which relates to the primary table tblPtCharacteristics.  4 other tables (tblMale, tblAgeDesc, tblRaceDesc, tblDiseaseDesc) describe groups currently and are associated with the [GroupNumber] attribute in tblGroup.  1 table now describe disease which is stage/category which is related to the primary key of tblDiseaseDesc [DiseaseID].   I created a new primary key for each of the subordinate tables to tblGroup and enforced-referential integrity on a one-to-many relationship.

Right now, I see the user going through this table first entering the characteristics of the book/paper in tblPtCharacteristics.  Next, they would define groups in tblGroup (i.e. Adult Control Pts, Adult Experimental Pts, Senior Citizen Control Pts... or any other example).  Then they would select a group and enter other attributes, racial composition, age, ratio men to women, diseases stages (with further description on category/stage).  

My big question now is do you think I've normalized this thus far?  Does the data relate to one another like I've described?  At the end, I want to generate an excel or word table output that is filtered on some of the categories (like disease, group, study design etc).  The data would be nested such that it would be:
Call Number contains Groups which contain attributes describing the patient and disease characteristics.

Here is an image of the associated relationships.

Thanks again for all the help!

David L. HansenProgrammer AnalystCommented:
I would not agree with keeping study-specific details in the table with book-specific information.  Author, call number, etc. belongs in a table for books/publications.  Type of study, Time frame of study, etc. belongs in another table.  You can add a groupID to the book/publication table and get rid of the call number in the groups table.  That being said, if you ever have more than one study of interest in a single book, you are sunk.  So, to prepare for such an eventuality just drop in a bridge-table (TblStudy perhaps) to bridge books and groups.  It would have call number and group id attributes (note: you would leave the group_id out of the book table in this case and still remove the call number from the groups table).  Those two keys in the study table would tie any given study to its corresponding book and group.
BevosAuthor Commented:
Hi sl8rz, thanks for the information.  For this database book/study are equivalent things just different terms.  So I don't think I need to change that first table (each of those categories will correspond 1:1 with [Call Number]).  Based on your feedback I'll close the question now.  I'm bound to have a bunch of other issues as this thing develops and I'd be very welcoming to any of the commenters' continued feedback.


Bevo S.
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.