• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 626
  • Last Modified:

Normalizing a large table

Although I understand normalization, I have a large table that I am having a tough time wrapping my mind around on how to split up.  

This is a membership database for an association of doctors which contains over 50 tables already. Currently the Member table is linked to the OfficeInfo table via the MemID field, as one doctor may have several Offices. There is a combination Primary Key of MemID and OfficeType fields on the OfficeInfo table. The OfficeInfo table has over 100 fields and tracks office contact information, days of operation, hours of operation, languages spoken, procedures preformed, insurance providers taken, special needs, as well as other topics related to an office.  Every time I try to break this table down into smaller tables, I just end up with several tables that have a one-one relationship.  Is there something that I am not seeing?  If the only way to normalize this table is to create several one-one related tables, would I be better off just leaving this a one large table?

I have attached a list of fields for the OfficeInfo table if this will help.  I hope someone can get me pointed in the right direction on this one.
  • 8
  • 6
  • 4
2 Solutions
Without looking at the table structure, one clue is any fields that you end with an "s" suchs as "insurance providers" would be a good candiate for a new table.
CALAOMSAuthor Commented:
Correct me if I am wrong. I think what you are suggesting is creating a lookup table, that way the insurance provider for the office can be selected from that list.  I agree that would be the way to go, if each office only utilized one insurance provider.  The problem is that each office may have many insurance providers, and we need to know which offices accept which insurance providers.
Hmm.. then you could use intermediate (mapping) tables, though you would want to weigh the benefits.
I'm not able to see your attached list of fields. Feilds and example data would be helpful.

If I had an OfficeInfo table and it was using a comma delimited list or multiple fileds to store Insurance Provider names I would most likely create an InsuranceProvider table and an OfficeInsuranceProvider table. OfficeInsuranceProvider would store the ID fields from InsuranceProvider and OfficeInfo.

The best gauage is to ask yourself if the field is a property of the entity. An insurance provider does not describe an office, but is an entity unto itself.
Improve Your Query Performance Tuning

In this FREE six-day email course, you'll learn from Janis Griffin, Database Performance Evangelist. She'll teach 12 steps that you can use to optimize your queries as much as possible and see measurable results in your work. Get started today!

Jim Dettman (Microsoft MVP/ EE MVE)PresidentCommented:
<<I have attached a list of fields for the OfficeInfo table if this will help.>>
Attach a list of the fields.
 And on top of what Drutch has already said, if you have fields that repeat (InsuranceProvider1, InsuranceProvider2, InsuranceProvider3, etc), then that is also an indication that it needs to be broken out into a seperate table.
  Not sure why your ending up with one to one tables, but your correct;; if they do end up that way, then there really is no need to break them out.  The only time you would do so is if your bumping into the 255 field limit per table in Access.  With a properly normalized design, that usually doesn't happen.
CALAOMSAuthor Commented:
I have attached the list of fields again.  I think the problem last time was that the file was a .rtf, and must not be supported.

Jim Dettman (Microsoft MVP/ EE MVE)PresidentCommented:
All these language fields:
          Portuguese                                                 Yes/No                                       1
          Persian                                                     Yes/No                                       1
          Filipino                                                     Yes/No                                       1
          Russian                                                     Yes/No                                       1
          Tagalog                                                    Yes/No                                       1
          Arabic                                                      Yes/No                                       1
          Korean                                                     Yes/No                                       1
          Gujarati                                                     Yes/No                                       1
          Latvian                                                     Yes/No                                       1
          Lithuanian                                                  Yes/No                                       1
  You need to break out into a seperate table.  A tip-off that your design is not normalized is if you need to change the table design when you want to add something.  ie.  If need to add another language, I would have to modify the design of this table and not simply add a record to a table.
 You also would have problems answering a question like "How many languages does each office support?"  Right now, you need to look at the individual fields in each office record, which SQL statements don't do well. SQL is designed to work with sets of records, not within the records themselves.
  So you'll need:
tblLanguages - One record per language
LanguageID - Autonumber - PK
LangName - Text
tblOfficeInfoLanguages - One record per Office / Language combination.
OfficeID - FK to tblOfficeInfo
LanguageID - FK to tblLanguages
  same deal with the services offered:

          ComputerImaging                                           Yes/No                                       1
          Phobias                                                     Yes/No                                       1
          Children                                                    Yes/No                                       1
          Holistic                                                     Yes/No                                       1

  break it out into a seperate table.
Jim Dettman (Microsoft MVP/ EE MVE)PresidentCommented:
and BTW, you need to add a primary key on the office table (unless that is the name, which I would not use).
Last MemID only belongs in the OfficeInfo table if only one member can be related to this office. If an office can have more then one member associated with it, then you need to breakout that relationship into a separate linking table:
OfficeID - Autonumber - PK
MemberID - Autonumber - PK
OfficeID - Long - PK1A and FK to tblOfficeInfo
MemberID - Long - PK1B and FK to tblMembers
with this setup, each office could have many members and each member could have many offices, but each member/office combination is unique (can only be associated with each other once).
Jim Dettman (Microsoft MVP/ EE MVE)PresidentCommented:
I just saw your primary key in the doc; is there one office record for each member (which now makes sense)?  if so, your present design in that area is somewhat correct, but we'd need to talk about that further.
PhoneAC      Text      5
      Phone      Text      20
      FAXAC      Text      5
      FAX      Text      20
      BackLineAC      Text      5
      BackLine      Text      20
      PerToFax      Yes/No      1
      PerToPubFax      Yes/No      1
      Email      Text      50
      PerToEmail      Yes/No      1
      PerToPubEmail      Yes/No      1
      Email2      Text      50
      PertoEmail2      Yes/No      1
      PerToPubEmail2      Yes/No      1

This is contact info. A contact is an entity and as you can see you can have many contacts per office.

                     Mon      Yes/No      1
      Tue      Yes/No      1
      Wed      Yes/No      1
      Thur      Yes/No      1
      Fri      Yes/No      1
      Sat      Yes/No      1
These are DaysOfOperation, should be a separate table. Same applies to the hours of operation. Since multipler offices can have the same hours / days then you should use a mapping table.
     OfficeInfoPK              DaysOfOperationPK
    OfficeA                           1
    OfficeA                           2
    OfficeA                           4
    OfficeB                            2
    OfficeB                            3

JDettman addressed the Language table already.

Emergencies      Yes/No      1
      HospPriv      Yes/No      1
      Geriatrics      Yes/No      1
      CleftPal      Yes/No      1
      PhysHandi      Yes/No      1
      MentHandi      Yes/No      1
      WheelChair      Yes/No      1
      Hemophilia      Yes/No      1
      HouseCalls      Yes/No      1
      TMJ      Yes/No      1
      Implants      Yes/No      1
      Local      Yes/No      1
      GenDr      Yes/No      1
      GenAnesth      Yes/No      1
      IV      Yes/No      1
      NO2      Yes/No      1
      Acupunc      Yes/No      1
      Hypnosis      Yes/No      1
      FinAvail      Yes/No      1
      MedCal      Yes/No      1
      SenDent      Yes/No      1
      Notes      Memo      -
      Inactive      Yes/No      1
                      ComputerImaging      Yes/No      1
      Phobias      Yes/No      1
      Children      Yes/No      1
                     SignLang      Yes/No      1
                     Antisnoring      Yes/No      1
      Forensics      Yes/No      1
      PortableEquip      Yes/No      1
      DistrictCounty      Text      50
      CosSurgery      Yes/No      1
      OrthoSurgery      Yes/No      1
      Preprosthetic      Yes/No      1
      Trauma      Yes/No      1

I would call these "Offerings" or "Features"? I'm not in the medical field...maybe there is a more appropriate terminology for these.

Jim Dettman (Microsoft MVP/ EE MVE)PresidentCommented:
<<This is contact info. A contact is an entity and as you can see you can have many contacts per office.>>
  Possibly, but if this contact info is for the office itself and not individuals within the office, then you might want to leave it as is.  Contact info is always one of thorner design problems.  
What nags at me is that the records in OfficeInfo have a PK based on the Office and the MemID, which would imply that this contact info belongs to a single person.  If that's the case, then it is really already broken out.  What would be needed then is a Office table with all the general info that applies to each office and the current table which represents a member in that that office.
  I would call it tblOffices and the current table tblOfficeMemberInfo or some such.
<<These are DaysOfOperation, should be a separate table. Same applies to the hours of operation. Since multipler offices can have the same hours / days then you should use a mapping table.>>
  Don't agree on that one.  I'd leave it as is.
CALAOMSAuthor Commented:
This is starting to get a little clearer to me.  I kept trying to break this table into several tabels with one-many relationships and kept ending up with one-one relationships.  You know the old saying sometimes it is hard to see the forest through the trees, well Jim if I understand correctly what I have been missing is a join table so that I will end up with many-many relationships. Is that correct?
tblOffice                        tblOfficeLang                  tblLanguage
MemId PK <-----------  FK MemId  PK ---------->LangID PK
OfficeType PK             FK OfficeType  PK          LangName
OtherFields                 FK LangID PK

Second to answer your question I just saw your primary key in the doc; is there one office record for each member?  Each member may have one or more offices ie. OfficeType = 1-Primary, 2-Secondary, 3-Third, 4-Fourth, 5-Other, and 6-Home.  Hence the combined PK of MemID and OfficeType.  Your question also brought to mind another factor  Several members may have the same office, as in doctors sharing a practice. So these probably should be broken into multiple tables as above.
tblMembers  <----------------- tblMembersOffice ------------------->  tblOffice
Jim, both you and Drutch have points about the contact info.  The contact info in part belongs to the office, as you have stated, but in part belongs to the member as Drutch stated. Address, office phone and fax belong to the office,  cell number would belong to the member, and back office phone and email could belong to the office if only used by one doctor, but would belong to the member if multiple doctors share a practice as each could have their own number and email address.  In this instance I am thinking it might be best to associate back office phone and email with the member.  So at this point, would it be best to break out contact info into three tables, tblContactInfo, tblOfficeContact and tblMemberContact?
So I might end up with
tblMembers <----------------tblMemberContact -----------------> tblContactInfo
tblMembers <----------------tblOfficeContact -----------------> tblContactInfo
Drutch you are right, breaking out Procedures, days and hours of operation and insurance is absolutely correct.  I knew it had to be done, I just wasnt seeing how to do it correctly.
If both of you are correct, I am going to end up with quite a large number of tables with many-many relationships.  Wow, all those joins for the queries are going to make my head spin.    
Since you are working with (I think you said MS Access?) you could create saved queries/views that are commonly used. That may help cut down the code everytime you create a new query. In SQL Server you would create views to do this.

EX: SELECT <some fields> FROM TABLE1 INNER JOIN TABLE2 ON <field> = <field> INNER JOIN TABLE3 ON <field> = <field>
Name your query for example vw_GetContactInfo
then you can join on the query rather than rememering every table and field for subsequent queries.

Once again like everything you have to weigh performance against convinience. And Access has a lot of overhead for every object created.
Jim Dettman (Microsoft MVP/ EE MVE)PresidentCommented:
What's getting you messed up is the MemID in the office table.  it does not belong there.  Let's start from the begining and build up...
You have offices and members, so:
tblOffices - One record per Office location
OfficeID - Autonumber - PK

  MemberID does not belong in this table.  Only thing that belongs here is things that relate to the office.

tblMembers - One record per member
MemberID - Autonumber - PK
FirstName - Text
LastName - Text
  then to identify which members work at which offices:
tblOfficeMembers - One record per Office/Member combination
OfficeID - PK1A  - FK to tblOffices
MemberID - PK2A - FK to tblMembers
  You might also add some fields to this table that pertain to info that is related to this member being in that office (ie days/hours there, phone/extention & e-mail while in that office, etc).  

  As we've already identified, we'll also have:
tblServices - One record per service type
ServiceID - Autonumber - PK
ServiceName - Text

tblOfficeServices - One record per service offered per office
OfficeID - PK1A - FK to tblOffices
ServiceID - PK1B - FK to tblServices
same setup for languages...

tblLanguages - One record per language
LanguageID - Autonumber - PK
LanguageName - Text

tblOfficeLanguages - One record per language per office
OfficeID - PK1A - FK to tblOffices
LanguageID- PK1B - FK to tblLanguages

     Now let's say you needed to know what services a member offered while in a paticular office.  For example, the office may offer x-rays as a service (it has the machine), but the member is not a qualifed radiologist.  So we'd add:
 tblOfficeMemberServices - One record per Office/Member combination and service type
OfficeID - PK1A - FK to tblOffices
MemberID - PK1B - FK to tblMembers
ServiceID - PK1C - FK to tblServices
  and you would have a list of services this member would perform while at that office.   To simplify that table design a bit, most would change to this:
tblOfficeMembers - One record per Office/Member combination
OfficeMemberID - Autonumber - PK
OfficeID - CK1A  - FK to tblOffices
MemberID - CK2A - FK to tblMembers
  'CK' is canidate key - in other words it could work as a primary key, but we've decided to use something else in stead.  In this case, we've added a surrogate key (the autonumber), because it's shorter, so we do tblOfficeMemberServerices, we can do this:

 tblOfficeMemberServices - One record per Office/Member combination and service type
OfficeMemberID - PK1A - FK to tblOfficeMembers
ServiceID - PK1C - FK to tblServices

 I'll stop there for the moment and see if that makes sense.

Jim Dettman (Microsoft MVP/ EE MVE)PresidentCommented:
<<Once again like everything you have to weigh performance against convinience. And Access has a lot of overhead for every object created>>
  Nothing we are discussing here will create any type of performance issue.  This is all stright f orward stuff.  Access will easily handle thousands of objects in a DB and tables with records up to a million or so.
CALAOMSAuthor Commented:

I probably should have mentioned that I inherited this database some 8 years ago and have been administering it since.  I am at a point that I have developed it about as far as I can go with the current design. Now that time permits, I have decided to create a new app from the ground up.  The problem lies in that after you have looked and worked with something for so long, it becomes ingrained, and it is hard to look past what you know in your mind.

So to answer your question, YES, what you have laid out does make perfect sense. For the life of me, I could not get past how the tables are related in the current DB (albeit as poorly designed as they are) to see clearly how I should establish the new tables.  I know I needed a fresh set of eyes to just get me pointed in the right direction.  

What you have taught me and what I hope other in my situation who read this will learn is this.  Beyond determining what information is being collected by the old DB, I should not even think about it when designing the new DB.  My new strategy is to make a list of information/fields that are currently being collected in the old DB and to add to it fields that we should be, but are not currently collecting. Then to build every table as if this was a new DB for a new client, and not the recreation that it is. This way I can create it as correctly as my ability allows, and not base it on something that was massively flawed to begin with.  Once the backend is created, then and only then should I go back to the old DB and salvage what code I can for the new frontend app.
Jim and Drutch,  I really appreciate your insight and direction, in helping me with this issue.

CALAOMSAuthor Commented:
Thanks again to you both.  Although Drutch mentioned the solution first, it still was not clear to me. JDettman clearly pointed out how the table should be split, showing new table desings with PKs and FKs, So I have awarded both points.  Now it's time to dust off Visio and get to designing.
Jim Dettman (Microsoft MVP/ EE MVE)PresidentCommented:
  Please feel free to come back to this question if you have additional questions on this design.
  Keep in mind that you want a separate table for each "thing" (Office, Member, Appointments, etc).  Fields in that table should pertain only to that thing (phone number in the Office table is the Office phone #).  If you need to establish an association between two things, you are going to need a joining table.
 Once you get this done, your going to find a lot of things come very easily in terms of developing this.  For example, setting up a main/subform combination with the main form being for the Office and a subform for the services offered will be a snap because the design is normalized.
  So it is going to be the opposite of what you're expecting with the queries; things will get easier to do.
  and it's great that your starting off with a clean slate on this.  You'll end up with a better app overall and will have something that is easy to maintain and expand down the road.
CALAOMSAuthor Commented:

Although I feel the fog shrouding my brain has been lifted, I just may take you up on your offer to come back to the question.  I'm heading out on vacation tomorrow and I am going to try to not to think about it until I come back, so it will be a week or two before I may have additional questions.
Thanks again
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

Join & Write a Comment

Featured Post

The 14th Annual Expert Award Winners

The results are in! Meet the top members of our 2017 Expert Awards. Congratulations to all who qualified!

  • 8
  • 6
  • 4
Tackle projects and never again get stuck behind a technical roadblock.
Join Now