Link to home
Start Free TrialLog in
Avatar of Mark01
Mark01Flag for United States of America

asked on

Table Design

I am designing an e-mail tracking database. I’m working on normalizing the tables. Please comment on the proposed design. E-mail image file notes must be tracked. For example, notes about the c:\email\file1.pdf file must be tracked. The note may read: “File contents: 2013 tax preparation.” or “p. 11 from Mary Smith, CPA”

There will be level 1 and level 2 subjects and level 1 and level 2 topics. A example of level 1 and level 2 subjects and level 1 and level 2 topics is the following:
Two levels of subjects:
Taxes
Income tax preparation

Two levels of topics:
California CPAs
Los Angeles CPAs


Proposed Tables

Comments
CommentID
Comment

Files
FileID

File_ Comments
CommentID
FileID
PathID
Topic_Level1_ID
Topic_Level2_ID
Page

Paths
PathID
Path

Subject_Level1
Subject_Level1_ID
Subject

Subject_Level2
Subject_Level2_ID
Subject

Topic_Level1
Topic_Level1_ID
Topic
Subject_Level1_ID

Topic_Level2
Topic_Level2_ID
Topic
Subject_Level2_ID
Avatar of Jim Dettman (EE MVE)
Jim Dettman (EE MVE)
Flag of United States of America image

Before anyone can really comment on the design, we need a little more detail on what it is your trying to accomplish.

Basic e-mail tracking yes, but where do the CPA's and Level of topic fit in?

It would be best if you just briefly described the overall situation and what type of information you plan to get out of this.

Jim.
SOLUTION
Avatar of Brian Crowe
Brian Crowe
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of Mark01

ASKER

Jim:
I am just trying to track the emails that are contained in PDFs and make comments about them. Numerous emails are printed to a single PDF file. Each PDF contains numerous emails. Each PDF will have multiple subjects and topics associated it. In the example, the  c:\email\file1.pdf file can have two subjects and two topics associated with it. The subjects are Taxes and Income tax preparation. Taxes is the parent subject and Income tax preparation child subject. The same with the parent and child topics:  California CPAs and Los Angeles CPAs .

Here is a sample report:

c:\email\file1.pdf
Subj: Taxes
Topic: California CPAs
Level 2 Subj: Income tax preparation
Level 2 topic: Los Angeles CPAs
Comments:
File contents: 2013 tax preparation.
p. 11: From Mary Smith, CPA

Brian Crowe:
That is a good point. No, there is no reason set a limit of two subjects/topics. Please provide an example of how to create a hierarchical table out of the subjects
Avatar of Mark01

ASKER

The Paths table tracks the path to the PDF. In the example above, the  path is: c:\email.
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of Mark01

ASKER

Here are the answers to the questions.

Q: Can a child topic belong to more than one parent?
A: Yes

Q: Can there be more than two levels?
A: Yes
You need to realign this process.  Your first concern should only be the data itself.  Thus, you must start with logical design: i.e., "entities" and "attributes" rather than "tables" and "columns".  This is more than just semantics.  There are literally hundreds+ books about logical design; many thousands if you include articles.  

Later you can turn the logical design into a physical design.  That's the easy part.

You need to describe as many facts about the data as possible, then use that info to model the data.  "Use cases" are also very popular now, so use those if they will assist you.  You will have separate definition and relationship data -- this is critical!

For example (D) mean that entry is a definition, otherwise assume it's a relationship.  {} indicates suggested areas for consideration, and are thus optional:

we must track all(?) emails received
(D) an email is a standard email {or a fax? or a phone call? do other things need tracked??}  if so, "email" might become "communication", for example.
we must record the physical location of emails
an email may have none(?)-to-many notes
(D) a note may be text: “File contents: 2013 tax preparation.” or “p. 11 from Mary Smith, CPA” or {it may be a file or other attachment?}
email notes must all(?) be tracked
an email must have a subject, but there could also be a subject hierarchy, of indeterminate depth
(
we will pre-determine a list of subject hierarchies about which we want to track data
OR
we will track all subject hierarchies, adding new ones when received
)
...
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of Mark01

ASKER

Jim, your explanation makes sense. Regarding the comments, I will just add add a "Comments" field to Files table.
Mark,

 What Scott is bringing up is the fact that relational data models are about the meaning of the data, which is why I asked you to explain what the process was and what you were trying to accomplish.

  At the end of our little exercise, I was going to have you walk through plugging data into the "tables" and see if that worked for real world examples (Scott's  "use cases") and gave you what you needed.  The second part of that is to  then ask questions; "what if I want to see all the topics associated with a given e-mail address?"    Can you do that with the model at hand?   In other words, is all the output you want achievable.

  Often when your starting out, it is easier to walk through/learn like this rather than try and visualize it all up front, which you can do more easily once you've gained experience.

Jim.
ASKER CERTIFIED SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
<<Jim, your explanation makes sense. Regarding the comments, I will just add add a "Comments" field to Files table. >>

 Well besides the question of "Does a given comment apply to more than one file", there is also the question "Will a given file have more than one comment?".

 If so, or there are other things you want to track about a comment, like who made it and the date it was made, then it will need a separate table.

But if it's more link a description of the file, which will stand on it's own, then a field will suffice.   What you don't want to do in a free form field is:

08/20 - Jim made this comment
08/21/14 - Mark made another comment

 Because what you have there is actually three pieces of data:  A date, a name, and a comment.   With it in a single field, there's no way to parse it or break it down.   Someone might also do this:

08/10 - another comment

 and omit the name, which you might consider to be required.  

  What were back sliding into here are the rules used for normalizing data, which I'll sum up for you if you like.  But if I'm going into too much detail for you, holler.

Jim.
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of Mark01

ASKER

Jim and Scott, I agree with your comments about the logical design: i.e., "entities" and "attributes. I am using the following book to learn about database design:

http://www.amazon.com/Database-Design-Mere-Mortals-Hands-/dp/B00DIKUOUA/ref=sr_1_11?s=books&ie=UTF8&qid=1408736146&sr=1-11&keywords=database+design+for+mere+mortals

The problem I have is that I have to quickly create a temporary database while I work on the logical design for the email tracking database. This is just a temporary small database to track the emails that are contained in PDFs and make comments about them. it is important to track the path for each PDF.

I would appreciate it if someone would comment on my proposed linking tables.
OK, I give up.  I still believe an hour or two of reviewing data will save you vast time and trouble down the road, but I understand you're in a bigger hurry.  Good luck.  I'm sure Jim can help you properly.
Avatar of Mark01

ASKER

I have redefined and minimized the requirements. My goal is to just create a temporary small database to track the emails that are contained in PDFs and make comments about them. Numerous emails are printed to a single PDF file. Each PDF contains numerous emails.  This will be only one comment per file. This will be a single user Access database. It is important to track the path for each PDF.

Here is a sample report:
c:\email\file1.pdf
Comments: File contents: 2013 tax preparation.

Here are the proposed entities:
Files
Path

Here are the proposed attributes:
Comments
File name
Avatar of Mark01

ASKER

Thank you, Brian Crowe, Jim Dettman and ScottPletcher.
I'm a bit surprised you've close this out.   I also see you've asked multiple questions on this now.

 What your going to find is that you will get to an answer quicker if you stick with the original question.  Also, if you do decide for some reason to open another question, it's best to point back to the prior so Experts can understand the background.

 It's a waste of your time and theirs, because you'll end up going right back over what's already been done most times.

Jim.