Normalizing a large table

Posted on 2009-05-20
Last Modified: 2013-11-29
Although I understand normalization, I have a large table that I am having a tough time wrapping my mind around on how to split up.  

This is a membership database for an association of doctors which contains over 50 tables already. Currently the Member table is linked to the OfficeInfo table via the MemID field, as one doctor may have several Offices. There is a combination Primary Key of MemID and OfficeType fields on the OfficeInfo table. The OfficeInfo table has over 100 fields and tracks office contact information, days of operation, hours of operation, languages spoken, procedures preformed, insurance providers taken, special needs, as well as other topics related to an office.  Every time I try to break this table down into smaller tables, I just end up with several tables that have a one-one relationship.  Is there something that I am not seeing?  If the only way to normalize this table is to create several one-one related tables, would I be better off just leaving this a one large table?

I have attached a list of fields for the OfficeInfo table if this will help.  I hope someone can get me pointed in the right direction on this one.
Question by:CALAOMS
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 8
  • 6
  • 4

Expert Comment

ID: 24434855
Without looking at the table structure, one clue is any fields that you end with an "s" suchs as "insurance providers" would be a good candiate for a new table.

Author Comment

ID: 24435022
Correct me if I am wrong. I think what you are suggesting is creating a lookup table, that way the insurance provider for the office can be selected from that list.  I agree that would be the way to go, if each office only utilized one insurance provider.  The problem is that each office may have many insurance providers, and we need to know which offices accept which insurance providers.

Assisted Solution

Drutch earned 100 total points
ID: 24435572
Hmm.. then you could use intermediate (mapping) tables, though you would want to weigh the benefits.
I'm not able to see your attached list of fields. Feilds and example data would be helpful.

If I had an OfficeInfo table and it was using a comma delimited list or multiple fileds to store Insurance Provider names I would most likely create an InsuranceProvider table and an OfficeInsuranceProvider table. OfficeInsuranceProvider would store the ID fields from InsuranceProvider and OfficeInfo.

The best gauage is to ask yourself if the field is a property of the entity. An insurance provider does not describe an office, but is an entity unto itself.
Free eBook: Backup on AWS

Everything you need to know about backup and disaster recovery with AWS, for FREE!

LVL 57
ID: 24435687
<<I have attached a list of fields for the OfficeInfo table if this will help.>>
Attach a list of the fields.
 And on top of what Drutch has already said, if you have fields that repeat (InsuranceProvider1, InsuranceProvider2, InsuranceProvider3, etc), then that is also an indication that it needs to be broken out into a seperate table.
  Not sure why your ending up with one to one tables, but your correct;; if they do end up that way, then there really is no need to break them out.  The only time you would do so is if your bumping into the 255 field limit per table in Access.  With a properly normalized design, that usually doesn't happen.

Author Comment

ID: 24436104
I have attached the list of fields again.  I think the problem last time was that the file was a .rtf, and must not be supported.

LVL 57
ID: 24440888
All these language fields:
          Portuguese                                                 Yes/No                                       1
          Persian                                                     Yes/No                                       1
          Filipino                                                     Yes/No                                       1
          Russian                                                     Yes/No                                       1
          Tagalog                                                    Yes/No                                       1
          Arabic                                                      Yes/No                                       1
          Korean                                                     Yes/No                                       1
          Gujarati                                                     Yes/No                                       1
          Latvian                                                     Yes/No                                       1
          Lithuanian                                                  Yes/No                                       1
  You need to break out into a seperate table.  A tip-off that your design is not normalized is if you need to change the table design when you want to add something.  ie.  If need to add another language, I would have to modify the design of this table and not simply add a record to a table.
 You also would have problems answering a question like "How many languages does each office support?"  Right now, you need to look at the individual fields in each office record, which SQL statements don't do well. SQL is designed to work with sets of records, not within the records themselves.
  So you'll need:
tblLanguages - One record per language
LanguageID - Autonumber - PK
LangName - Text
tblOfficeInfoLanguages - One record per Office / Language combination.
OfficeID - FK to tblOfficeInfo
LanguageID - FK to tblLanguages
  same deal with the services offered:

          ComputerImaging                                           Yes/No                                       1
          Phobias                                                     Yes/No                                       1
          Children                                                    Yes/No                                       1
          Holistic                                                     Yes/No                                       1

  break it out into a seperate table.
LVL 57
ID: 24440942
and BTW, you need to add a primary key on the office table (unless that is the name, which I would not use).
Last MemID only belongs in the OfficeInfo table if only one member can be related to this office. If an office can have more then one member associated with it, then you need to breakout that relationship into a separate linking table:
OfficeID - Autonumber - PK
MemberID - Autonumber - PK
OfficeID - Long - PK1A and FK to tblOfficeInfo
MemberID - Long - PK1B and FK to tblMembers
with this setup, each office could have many members and each member could have many offices, but each member/office combination is unique (can only be associated with each other once).
LVL 57
ID: 24440971
I just saw your primary key in the doc; is there one office record for each member (which now makes sense)?  if so, your present design in that area is somewhat correct, but we'd need to talk about that further.

Expert Comment

ID: 24442371
PhoneAC      Text      5
      Phone      Text      20
      FAXAC      Text      5
      FAX      Text      20
      BackLineAC      Text      5
      BackLine      Text      20
      PerToFax      Yes/No      1
      PerToPubFax      Yes/No      1
      Email      Text      50
      PerToEmail      Yes/No      1
      PerToPubEmail      Yes/No      1
      Email2      Text      50
      PertoEmail2      Yes/No      1
      PerToPubEmail2      Yes/No      1

This is contact info. A contact is an entity and as you can see you can have many contacts per office.

                     Mon      Yes/No      1
      Tue      Yes/No      1
      Wed      Yes/No      1
      Thur      Yes/No      1
      Fri      Yes/No      1
      Sat      Yes/No      1
These are DaysOfOperation, should be a separate table. Same applies to the hours of operation. Since multipler offices can have the same hours / days then you should use a mapping table.
     OfficeInfoPK              DaysOfOperationPK
    OfficeA                           1
    OfficeA                           2
    OfficeA                           4
    OfficeB                            2
    OfficeB                            3

JDettman addressed the Language table already.

Emergencies      Yes/No      1
      HospPriv      Yes/No      1
      Geriatrics      Yes/No      1
      CleftPal      Yes/No      1
      PhysHandi      Yes/No      1
      MentHandi      Yes/No      1
      WheelChair      Yes/No      1
      Hemophilia      Yes/No      1
      HouseCalls      Yes/No      1
      TMJ      Yes/No      1
      Implants      Yes/No      1
      Local      Yes/No      1
      GenDr      Yes/No      1
      GenAnesth      Yes/No      1
      IV      Yes/No      1
      NO2      Yes/No      1
      Acupunc      Yes/No      1
      Hypnosis      Yes/No      1
      FinAvail      Yes/No      1
      MedCal      Yes/No      1
      SenDent      Yes/No      1
      Notes      Memo      -
      Inactive      Yes/No      1
                      ComputerImaging      Yes/No      1
      Phobias      Yes/No      1
      Children      Yes/No      1
                     SignLang      Yes/No      1
                     Antisnoring      Yes/No      1
      Forensics      Yes/No      1
      PortableEquip      Yes/No      1
      DistrictCounty      Text      50
      CosSurgery      Yes/No      1
      OrthoSurgery      Yes/No      1
      Preprosthetic      Yes/No      1
      Trauma      Yes/No      1

I would call these "Offerings" or "Features"? I'm not in the medical field...maybe there is a more appropriate terminology for these.

LVL 57
ID: 24442609
<<This is contact info. A contact is an entity and as you can see you can have many contacts per office.>>
  Possibly, but if this contact info is for the office itself and not individuals within the office, then you might want to leave it as is.  Contact info is always one of thorner design problems.  
What nags at me is that the records in OfficeInfo have a PK based on the Office and the MemID, which would imply that this contact info belongs to a single person.  If that's the case, then it is really already broken out.  What would be needed then is a Office table with all the general info that applies to each office and the current table which represents a member in that that office.
  I would call it tblOffices and the current table tblOfficeMemberInfo or some such.
<<These are DaysOfOperation, should be a separate table. Same applies to the hours of operation. Since multipler offices can have the same hours / days then you should use a mapping table.>>
  Don't agree on that one.  I'd leave it as is.

Author Comment

ID: 24443480
This is starting to get a little clearer to me.  I kept trying to break this table into several tabels with one-many relationships and kept ending up with one-one relationships.  You know the old saying sometimes it is hard to see the forest through the trees, well Jim if I understand correctly what I have been missing is a join table so that I will end up with many-many relationships. Is that correct?
tblOffice                        tblOfficeLang                  tblLanguage
MemId PK <-----------  FK MemId  PK ---------->LangID PK
OfficeType PK             FK OfficeType  PK          LangName
OtherFields                 FK LangID PK

Second to answer your question I just saw your primary key in the doc; is there one office record for each member?  Each member may have one or more offices ie. OfficeType = 1-Primary, 2-Secondary, 3-Third, 4-Fourth, 5-Other, and 6-Home.  Hence the combined PK of MemID and OfficeType.  Your question also brought to mind another factor  Several members may have the same office, as in doctors sharing a practice. So these probably should be broken into multiple tables as above.
tblMembers  <----------------- tblMembersOffice ------------------->  tblOffice
Jim, both you and Drutch have points about the contact info.  The contact info in part belongs to the office, as you have stated, but in part belongs to the member as Drutch stated. Address, office phone and fax belong to the office,  cell number would belong to the member, and back office phone and email could belong to the office if only used by one doctor, but would belong to the member if multiple doctors share a practice as each could have their own number and email address.  In this instance I am thinking it might be best to associate back office phone and email with the member.  So at this point, would it be best to break out contact info into three tables, tblContactInfo, tblOfficeContact and tblMemberContact?
So I might end up with
tblMembers <----------------tblMemberContact -----------------> tblContactInfo
tblMembers <----------------tblOfficeContact -----------------> tblContactInfo
Drutch you are right, breaking out Procedures, days and hours of operation and insurance is absolutely correct.  I knew it had to be done, I just wasnt seeing how to do it correctly.
If both of you are correct, I am going to end up with quite a large number of tables with many-many relationships.  Wow, all those joins for the queries are going to make my head spin.    

Expert Comment

ID: 24443699
Since you are working with (I think you said MS Access?) you could create saved queries/views that are commonly used. That may help cut down the code everytime you create a new query. In SQL Server you would create views to do this.

EX: SELECT <some fields> FROM TABLE1 INNER JOIN TABLE2 ON <field> = <field> INNER JOIN TABLE3 ON <field> = <field>
Name your query for example vw_GetContactInfo
then you can join on the query rather than rememering every table and field for subsequent queries.

Once again like everything you have to weigh performance against convinience. And Access has a lot of overhead for every object created.
LVL 57

Accepted Solution

Jim Dettman (Microsoft MVP/ EE MVE) earned 400 total points
ID: 24443759
What's getting you messed up is the MemID in the office table.  it does not belong there.  Let's start from the begining and build up...
You have offices and members, so:
tblOffices - One record per Office location
OfficeID - Autonumber - PK

  MemberID does not belong in this table.  Only thing that belongs here is things that relate to the office.

tblMembers - One record per member
MemberID - Autonumber - PK
FirstName - Text
LastName - Text
  then to identify which members work at which offices:
tblOfficeMembers - One record per Office/Member combination
OfficeID - PK1A  - FK to tblOffices
MemberID - PK2A - FK to tblMembers
  You might also add some fields to this table that pertain to info that is related to this member being in that office (ie days/hours there, phone/extention & e-mail while in that office, etc).  

  As we've already identified, we'll also have:
tblServices - One record per service type
ServiceID - Autonumber - PK
ServiceName - Text

tblOfficeServices - One record per service offered per office
OfficeID - PK1A - FK to tblOffices
ServiceID - PK1B - FK to tblServices
same setup for languages...

tblLanguages - One record per language
LanguageID - Autonumber - PK
LanguageName - Text

tblOfficeLanguages - One record per language per office
OfficeID - PK1A - FK to tblOffices
LanguageID- PK1B - FK to tblLanguages

     Now let's say you needed to know what services a member offered while in a paticular office.  For example, the office may offer x-rays as a service (it has the machine), but the member is not a qualifed radiologist.  So we'd add:
 tblOfficeMemberServices - One record per Office/Member combination and service type
OfficeID - PK1A - FK to tblOffices
MemberID - PK1B - FK to tblMembers
ServiceID - PK1C - FK to tblServices
  and you would have a list of services this member would perform while at that office.   To simplify that table design a bit, most would change to this:
tblOfficeMembers - One record per Office/Member combination
OfficeMemberID - Autonumber - PK
OfficeID - CK1A  - FK to tblOffices
MemberID - CK2A - FK to tblMembers
  'CK' is canidate key - in other words it could work as a primary key, but we've decided to use something else in stead.  In this case, we've added a surrogate key (the autonumber), because it's shorter, so we do tblOfficeMemberServerices, we can do this:

 tblOfficeMemberServices - One record per Office/Member combination and service type
OfficeMemberID - PK1A - FK to tblOfficeMembers
ServiceID - PK1C - FK to tblServices

 I'll stop there for the moment and see if that makes sense.

LVL 57
ID: 24443791
<<Once again like everything you have to weigh performance against convinience. And Access has a lot of overhead for every object created>>
  Nothing we are discussing here will create any type of performance issue.  This is all stright f orward stuff.  Access will easily handle thousands of objects in a DB and tables with records up to a million or so.

Author Comment

ID: 24444324

I probably should have mentioned that I inherited this database some 8 years ago and have been administering it since.  I am at a point that I have developed it about as far as I can go with the current design. Now that time permits, I have decided to create a new app from the ground up.  The problem lies in that after you have looked and worked with something for so long, it becomes ingrained, and it is hard to look past what you know in your mind.

So to answer your question, YES, what you have laid out does make perfect sense. For the life of me, I could not get past how the tables are related in the current DB (albeit as poorly designed as they are) to see clearly how I should establish the new tables.  I know I needed a fresh set of eyes to just get me pointed in the right direction.  

What you have taught me and what I hope other in my situation who read this will learn is this.  Beyond determining what information is being collected by the old DB, I should not even think about it when designing the new DB.  My new strategy is to make a list of information/fields that are currently being collected in the old DB and to add to it fields that we should be, but are not currently collecting. Then to build every table as if this was a new DB for a new client, and not the recreation that it is. This way I can create it as correctly as my ability allows, and not base it on something that was massively flawed to begin with.  Once the backend is created, then and only then should I go back to the old DB and salvage what code I can for the new frontend app.
Jim and Drutch,  I really appreciate your insight and direction, in helping me with this issue.


Author Closing Comment

ID: 31583649
Thanks again to you both.  Although Drutch mentioned the solution first, it still was not clear to me. JDettman clearly pointed out how the table should be split, showing new table desings with PKs and FKs, So I have awarded both points.  Now it's time to dust off Visio and get to designing.
LVL 57
ID: 24445562
  Please feel free to come back to this question if you have additional questions on this design.
  Keep in mind that you want a separate table for each "thing" (Office, Member, Appointments, etc).  Fields in that table should pertain only to that thing (phone number in the Office table is the Office phone #).  If you need to establish an association between two things, you are going to need a joining table.
 Once you get this done, your going to find a lot of things come very easily in terms of developing this.  For example, setting up a main/subform combination with the main form being for the Office and a subform for the services offered will be a snap because the design is normalized.
  So it is going to be the opposite of what you're expecting with the queries; things will get easier to do.
  and it's great that your starting off with a clean slate on this.  You'll end up with a better app overall and will have something that is easy to maintain and expand down the road.

Author Comment

ID: 24445748

Although I feel the fog shrouding my brain has been lifted, I just may take you up on your offer to come back to the question.  I'm heading out on vacation tomorrow and I am going to try to not to think about it until I come back, so it will be a week or two before I may have additional questions.
Thanks again

Featured Post

Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Access developers frequently have requirements to interact with Excel (import from or output to) in their applications.  You might be able to accomplish this with the TransferSpreadsheet and OutputTo methods, but in this series of articles I will di…
This article describes a method of delivering Word templates for use in merging Access data to Word documents, that requires no computer knowledge on the part of the recipient -- the templates are saved in table fields, and are extracted and install…
With Microsoft Access, learn how to specify relationships between tables and set various options on the relationship. Add the tables: Create the relationship: Decide if you’re going to set referential integrity: Decide if you want cascade upda…
With Microsoft Access, learn how to start a database in different ways and produce different start-up actions allowing you to use a single database to perform multiple tasks. Specify a start-up form through options: Specify an Autoexec macro: Us…

738 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question