Solved

Data Modeling

Posted on 2014-09-05
5
389 Views
Last Modified: 2014-09-11
This question involves the logical design of a database (entities and attributes). I would like for you to confirm the design.

An ERD and report is attached.

This question is different from my last question with regard to the proposed subjects and topics entities. Sample reports are shown below. The attached excel file contains the sample reports.

I am designing a database that will be used to track the emails that are contained in PDFs. Numerous emails are printed to a single PDF file. Each PDF contains numerous emails. The PDFs are not going to be stored in a database. Only the file path and file name will be tracked. This will be a single user database.

Sent e-mails. The following will be tracked:
1) The date and time the e-mails were sent.
2) The sender’s first and last name.
3) The text of the message body will be copied into a field.

E-mail Images. The following will be tracked:
1) An image of every e-mail. A multipage PDF will contain an image of every e-mail.
2) The e-mail image file path.

Subjects and topics can be associated with one or more emails. The sample reports shown below gives examples of subjects and topics.

PROPOSED ENTITIES

Entity: Child Subjects
Description: The lower level subject of a top level subject.
Rules:
A particular email may have two or more lower level subjects.
Rules:
The text of the message body will be copied into a field.
Attributes:
Child Subject ID --An arbitrary unique sequential number assigned to every child subject.
Child Subject--The lower level subject of the top level subject of a particular email.
Parent Subject ID --Identifier that specifies the parent subject.

Entity: Child Topic
Description: The lower level topic of a top level topic A child topic provides more detail about a top level topic.
Attributes:
Child Topic ID --An arbitrary unique sequential number assigned to every child topic.
Child Topic--The lower level topic of the top level topic of a particular email.
Parent Topic ID --Identifier that specifies the parent topic.

Entity: Email
Description: The particular email message that is being tracked.
Rules:
The text of the message body will be copied into a field.
Attributes:
Email ID --An arbitrary unique sequential number assigned to every email. arbitrary unique sequential number assigned to every email.
Date sent --The date on which the email was sent.
Time sent --The time that the email was sent.
Body Text --The text contained in the email message.
SenderID --Identifier that specifies the individual who sent the file..

Entity: Email Senders
Description: The individual who sent the email.
Attributes:
Sender ID --An arbitrary unique sequential number assigned to the individual who sent the file.
First Name --The first name of the individual who sent the file.
Last Name --The first name of the individual who sent the file.
Email Address -- The sender’s email address.

Entity: Files
Description: A standard-format PDF file used to hold emails.
Rules:
Only PDFs will be used. No Excel, Word, etc. files will be used.
Numerous emails are printed to a single PDF file.
All emails in a single PDF.
PDFs are currently not stored in a database; they are stored as standard PDF files.
Files may have zero to many associated comments.
Attributes:
File Id --arbitrary unique sequential number assigned to every file name. File names alone may not be unique, since the same file name could be stored in different paths.
File Name --The name of a standard-format PDF file used to hold emails.
Path Id --identifier that specifies the drive and path to the file.
File Type Id --Physical file type; typically this will correlate to the file extension, but that’s not actually required. For example, a ‘[.]txt’ file could actually [.]csv-format data and vice-versa.

Entity: File Types
Description: Physical file type; typically this will correlate to the file extension, but that’s not actually required. For example, a ‘[.]txt’ file could actually [.]csvformat
data and vice-versa.
Rules:
The File Type will probably be limited to PDF and .jpg.
99% of the time only PDF will be used.
There is almost zero chance that .DOC and a .DOCX will be used.
File Type ID --Arbitrary unique sequential number assigned to each unique File Type
Default Extension --default extension for this type of file: ‘PDF’ for pdf files.
File Type --Standardized description of the the file type: ‘PDF’.

Entity: Paths
Description: The path of a file. Example: c:\email\. This does NOT include the file name, which is part of the file data.
Attributes:
Path Id --arbitrary unique sequential number assigned to each unique path.
Path --the remainder of the physical path to the file
Drive --The “drive” name; may be a single letter, a share, a volume mount point, etc.

Entity: Parent Subject
Description: The top level subject that can pertain to more or more emails.
Rules:
An email can have only one parent subject or both a parent subject and an child subject.
A parent subject or both a parent subject and an child subject can be associated with two or more emails.
Attributes:
Parent Subject ID --An arbitrary unique sequential number assigned to every parent subject.
Parent Subject --The top level subject of a particular email.

Entity: Parent Topic
Description: The top level topic that can pertain to more or more emails.
Rules:
An email can have only one parent subject and no parent topic.
An email can have both a parent subject and a parent topic.
A parent topic or both a parent topic and an child topic can be associated with two or more emails.
Attributes:
Parent Topic ID --An arbitrary unique sequential number assigned to every parent topic.
Parent Topic --The top level topic of a particular email. Provides additional detail about the top
level subject.


Here are the sample reports:

Sample Reports (Parent Subject, Child Subject, Parent Topic and Child Topic.)

Sample Reports showing the association of four emails with a parent subject, child subject, parent topic and child topic.

The Parent Subject, Child Subject, Parent Topic and Child Topic tables shown below are filled with sample data.

Parent Subject
Missing Files
Payroll
Trustee Reports


Child Subject
Errors
Quickbooks
T Rowe Price Acct # 4972


Parent Topic
Co-Trustee False Statement
Co-Trustee Tanya accuses Jesse of file theft
Mary states totals are wrong
Tanya accuses Fred of incompetence


Child Topic
Disbursement mismatch
Files from Jesse's desk are missing
Mary's frequent threats
Suspected employee paycheck overpayment



Scenario ( 1)
Subject: Payroll
Child Subject: Quickbooks

Topic: Co-trustee False Statement
Child Topic: Mary's frequent threats

Sender: Julie Jones
Date: 1-1-14
Time: 4:00 p.m.
File: c:\email\file1.pdf
Email Contents: .. text from e-mail...



Scenario (2)
Subject: Missing Files
Child Subject: T Rowe Price Acct # 4972

Topic: Co-Trustee Tanya accuses Jesse of file theft
Child Topic: Files from Jesse's desk are missing

Sender: Tanya Jones
Date: 1-3-14
Time: 4:14 p.m.
File: c:\email\file1.pdf
Email Contents: .. text from e-mail...



Scenario (3)
Subject: Payroll
Child Subject: Quickbooks

Topic: Tanya accuses Fred of incompetence
Child Topic: Suspected employee paycheck overpayment

Sender: Tanya  Jones
Date: 1-9-14
Time: 3:04 p.m.
File: c:\email\file1.pdf
Email Contents: .. text from e-mail...



Scenario (4)
Subject: Trustee Reports
Child Subject: Errors

Topic: Mary states totals are wrong
Child Topic: Disbursement mismatch


Sender: Jesse Pavin
Date: 1-10-14
Time: 1:20 p.m.
File: c:\email\file1.pdf
Email Contents: .. text from e-mail...

**********************************************************
Sample Report (Parent Subject Only)

A parent subject or both a parent subject and an child subject can be associated with two or more emails. The following sample report shows two emails associated with only the parent subject “Missing Files.”

Parent Subject: Missing Files

Sender: Jesse Pavin
Date: 1-10-14
Time: 1:20 p.m.

Sender: Tanya  Jones
Date: 1-9-14
Time: 3:04 p.m.

Here is the ERD
ERDSampleData.xls
report.pdf
0
Comment
Question by:Mark01
  • 3
  • 2
5 Comments
 
LVL 69

Accepted Solution

by:
ScottPletcher earned 500 total points
ID: 40306750
Excellent overall.  You've clearly done some work on this.

1) You need to be as explicit as possible in rules, with specific requirements and cardinality (zero or one, one or more, etc.).

A couple of examples will give you the idea:
"
Entity: Child Subjects [CS]
 Description: The lower level subject of a top level subject.
 Rules:
 A particular email may have two or more lower level subjects.
"
An email can't have *one* CS?
Maybe instead:
Rules:
  A given email may have zero, one or more child subjects.

"
Entity: Parent Topic
 Description: The top level topic that can pertain to more or more emails.
 Rules:
 An email can have only one parent subject and no parent topic.
 An email can have both a parent subject and a parent topic.
 A parent topic or both a parent topic and an child topic can be associated with two or more emails.
"
 Rules:
     An email must have a parent subject.
     An email may have zero or one parent topics.
     The same parent subject and/or parent topic can be associated with one or more emails.



2) You need to flesh out "email", and related entities, much more fully.

Based on the rules:

Email should also contain:
    Parent Subject ID  /*required*/
    Parent Topic ID /*optional*/

A new "Email Child Subject" should contain, repeated as necessary for each email:
    Email ID
    Child Subject ID /*required*/
    Child Topic ID /*optional*/
0
 

Author Comment

by:Mark01
ID: 40307008
>>A new "Email Child Subject" should contain, repeated as necessary for each email:<<

Subjects and topics will be shared by email, future documents and future letters tables. For example, in reports, the Child Topic “Mary's frequent threats” may reference four e-mails, two documents and one letter.

Here is more detail:
Child Topic: Mary's frequent threats

Email
Sender: Julie Jones
Date: 1-1-14
Time: 4:00 p.m.
File: c:\email\file1.pdf
Email Contents: .. text from e-mail...

Letter
Writer: Joe Smith
Date: 3-9-14
Contents: .. text from letter...

Document (type of document)
Writer: Jesse Pavin
Date: 6-15-14
Contents: .. text from document
0
 

Author Comment

by:Mark01
ID: 40308930
Here are the four new entities relating to the email entity:

Entity: Email Child Subjects
Description: ?.
Attributes:
Child Subject ID --/*required*/ Identifier that specifies the child subject.
Email ID -- Identifier that specifies the given email. /*required*/
Child Topic ID -- /*optional*/ Identifier that specifies the child topic.

Entity: Email Parent Subjects
Description: ?.
Attributes:
Parent Subject ID --/*required*/ Identifier that specifies the parent subject.
Email ID -- Identifier that specifies the given email. /*required*/
Parent Topic ID -- /*optional*/ Identifier that specifies the parent topic.

Entity: Email Parent Topics
Description: ?.
Attributes:
Parent Topic ID --/*required*/ Identifier that specifies the parent topic.
Email ID -- /*required*/ Identifier that specifies the given email.

Entity: Email Child Topics
Description: ?.
Attributes:
Child Topic ID -- /*required*/ Identifier that specifies the Child topic.
Email ID -- /*required*/ Identifier that specifies the given email.


Here are all of the proposed entities:

Entity: Child Subjects
Description: The lower level subject of a top level subject.
Rules:
A given email may have two or more lower level subjects.
An email cannot have only one child subject.
A given email may have zero, one or more child subjects.
Attributes:
Child Subject ID --An arbitrary unique sequential number assigned to every child subject.
Child Subject--The lower level subject of the top level subject of a particular email.
Parent Subject ID --Identifier that specifies the parent subject.

Entity: Child Topics
Description: The lower level topic of a top level topic A child topic provides more detail about a top level topic.
Attributes:
Child Topic ID --An arbitrary unique sequential number assigned to every child topic.
Child Topic--The lower level topic of the top level topic of a particular email.
Parent Topic ID --Identifier that specifies the parent topic.

Entity: Email
Description: The particular email message that is being tracked.
Rules:
The text of the message body will be copied into a field.
Attributes:
Email ID --An arbitrary unique sequential number assigned to every email. arbitrary unique sequential number assigned to every email.
Date sent --The date on which the email was sent.
Time sent --The time that the email was sent.
Body Text --The text contained in the email message.
SenderID --Identifier that specifies the individual who sent the file.
Parent Subject ID -- Identifier that specifies the parent subject. /*required*/
Parent Topic ID -- Identifier that specifies the parent topic. /*optional*/

Entity: Email Child Subjects
Description: ?.
Attributes:
Child Subject ID --/*required*/ Identifier that specifies the child subject.
Email ID -- Identifier that specifies the given email. /*required*/
Child Topic ID -- /*optional*/ Identifier that specifies the child topic.

Entity: Email Parent Subjects
Description: ?.
Attributes:
Parent Subject ID --/*required*/ Identifier that specifies the parent subject.
Email ID -- Identifier that specifies the given email. /*required*/
Parent Topic ID -- /*optional*/ Identifier that specifies the parent topic.

Entity: Email Parent Topics
Description: ?.
Attributes:
Parent Topic ID --/*required*/ Identifier that specifies the parent topic.
Email ID -- /*required*/ Identifier that specifies the given email.
Entity: Email Child Topics
Description: ?.
Attributes:
Child Topic ID -- /*required*/ Identifier that specifies the Child topic.
Email ID -- /*required*/ Identifier that specifies the given email.

Entity: Email Senders
Description: The individual who sent the email.
Attributes:
Sender ID --An arbitrary unique sequential number assigned to the individual who sent the file.
First Name --The first name of the individual who sent the file.
Last Name --The first name of the individual who sent the file.
Email Address -- The sender’s email address.

Entity: Files
Description: A standard-format PDF file used to hold emails.
Rules:
Only PDFs will be used. No Excel, Word, etc. files will be used.
Numerous emails are printed to a single PDF file.
All emails in a single PDF.
PDFs are currently not stored in a database; they are stored as standard PDF files.
Files may have zero to many associated comments.
Attributes:
File Id --arbitrary unique sequential number assigned to every file name. File names alone may not be unique, since the same file name could be stored in different paths.
File Name --The name of a standard-format PDF file used to hold emails.
Path Id --identifier that specifies the drive and path to the file.
File Type Id --Physical file type; typically this will correlate to the file extension, but that’s not actually required. For example, a ‘[.]txt’ file could actually [.]csv-format data and vice-versa.

Entity: File Types
Description: Physical file type; typically this will correlate to the file extension, but that’s not actually required. For example, a ‘[.]txt’ file could actually [.]csvformat
data and vice-versa.
Rules:
The File Type will probably be limited to PDF and .jpg.
99% of the time only PDF will be used.
There is almost zero chance that .DOC and a .DOCX will be used.
File Type ID --Arbitrary unique sequential number assigned to each unique File Type
Default Extension --default extension for this type of file: ‘PDF’ for pdf files.
File Type --Standardized description of the the file type: ‘PDF’.

Entity: Paths
Description: The path of a file. Example: c:\email\. This does NOT include the file name, which is part of the file data.
Attributes:
Path Id --arbitrary unique sequential number assigned to each unique path.
Path --the remainder of the physical path to the file
Drive --The “drive” name; may be a single letter, a share, a volume mount point, etc.

Entity: Parent Subject
Description: The top level subject that can pertain to more or more emails.
Rules:
An email must have a parent subject.
An email can have both a parent subject and an child subject.
A parent subject or both a parent subject and an child subject can be associated with two or more emails.
Subjects will be shared by email, future documents and future letters tables.
Attributes:
Parent Subject ID --An arbitrary unique sequential number assigned to every parent subject.
Parent Subject --The top level subject of a particular email.

Entity: Parent Topic
Description: The top level topic that can pertain to more or more emails.
Rules:
An email must have a parent subject.
An email may have zero or one parent topics.
The same parent subject and/or parent topic can be associated with one or more emails.
Topics will be shared by email, future documents and future letters tables.
Attributes:
Parent Topic ID --An arbitrary unique sequential number assigned to every parent topic.
Parent Topic --The top level topic of a particular email. Provides additional detail about the top
level subject.



The revised ERD is shown below and the revised report is attached.
ERDreport.pdf
0
 

Author Closing Comment

by:Mark01
ID: 40317663
Thank you, ScottPletcher.
0
 
LVL 69

Expert Comment

by:ScottPletcher
ID: 40318260
No problem.

At some point, you just accept what you have and move quickly to the physical design side.  You'll discover a few more things there that you'll implement into the design.
0

Featured Post

How your wiki can always stay up-to-date

Quip doubles as a “living” wiki and a project management tool that evolves with your organization. As you finish projects in Quip, the work remains, easily accessible to all team members, new and old.
- Increase transparency
- Onboard new hires faster
- Access from mobile/offline

Join & Write a Comment

Checking the Alert Log in AWS RDS Oracle can be a pain through their user interface.  I made a script to download the Alert Log, look for errors, and email me the trace files.  In this article I'll describe what I did and share my script.
International Data Corporation (IDC) prognosticates that before the current the year gets over disbursing on IT framework products to be sent in cloud environs will be $37.1B.
Via a live example, show how to restore a database from backup after a simulated disk failure using RMAN.
This video shows how to configure and send email from and Oracle database using both UTL_SMTP and UTL_MAIL, as well as comparing UTL_SMTP to a manual SMTP conversation with a mail server.

706 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

17 Experts available now in Live!

Get 1:1 Help Now