Link to home
Start Free TrialLog in
Avatar of pdvsa
pdvsaFlag for United States of America

asked on

Date Fields intable?

Experts,

I have a question. Is it better to have date fields in say tblProjects  or better to have a link to tblDates with a one to many?  

Example: (tblProjects)
fldBeginDate
fldEndDate
fldAwardDate
fldBidDueDate

==>you wouldnt want each of those fields in a table  as a separate field, correct?  It is too limiting and requires many separate fields in the tblProjects.  It is better to design a tblDates (ID,ProjID (fk), fldDateType) with a relationship to the tblProjects (on ProjID) and have fldDateType as a combo box with a row source qry and a NOT IN event (on the form) to add what a user types that is not in the list.

little difficult to explain without getting too wordy.  Let me know if this makes sense.  I am pretty sure I am correct on this but wanted some feedback from an expert.  

thanks...
SOLUTION
Avatar of Jim Dettman (EE MVE)
Jim Dettman (EE MVE)
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
ASKER CERTIFIED SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of pdvsa

ASKER

JDettman:
< In this case, can a project have more then one begin, end, award, or bid due date?
No

Let me add a bit more to my explanation.
There could be many more dates than what I posted.
Lets say there could be 15 different dates for a particular project.  
Then you would have to have 15 different fields.  

Is that good design?
If you are going to have a variable number of date types and dates, I would generally use the second table and use a subform to select and enter those values.

If you find that you need them as a single record, you could create a crosstab query using the date "types" as your headings.
Avatar of pdvsa

ASKER

fyed:  that makes sense to me.  

Can you explain just a tad bit more :
<If you find that you need them as a single record, you could create a crosstab query using the date "types" as your headings.

I have had experience with crosstab queries and do you mean to have the query as a row source on fldDateType (a combo box) on the second table?
<<
Let me add a bit more to my explanation.
There could be many more dates than what I posted.
Lets say there could be 15 different dates for a particular project.  
Then you would have to have 15 different fields.  
>>

OK.

<<
Is that good design?
>>

 Yes and it would be correct.  Despite the fact that they are all "dates" (which you are defining via a data type),  they are all not the same.

A "Bid Due Date"  is not the same thing as an "Award Date".  Those are two seperate facts about a project despite that they are both dates.

To put that another way, if I split out the dates into a seperate table like this:

tblProjectDates
DateID - Autonumber -  PK
ProjectID - Long - FK to tblProjects
DateValue - D/T

 and I would have no indication of the type of date it is.  So I'd be forced to do this:

tblDateTypes
DateTypeID - Autonumber -  PK
DateType - Text - CK

tblProjectDates
DateID - Autonumber - PK
ProjectID - Long - FK, CK1a
DateType - Long - FK, CK1b
DateValue - D/T

  CK1 would be a composite index on ProjectID and DateType because I would need to ensure that there was only one type of date for a given project.

  This is called an Entity Attribute Value design because what you are doing is defining the attributes (or the facts about something) through data entry rather then through table definition.  This leads to some fundamental problems and the design is generally frowned upon, although there are some rare cases when you might use it.

 If however any of your dates repeated, say milestones as mentioned:

MilestoneDate1
MilestoneDate2
MilestoneDate3

  Then that is an indication that you have a fact about something other then "a project" in your table.  In this case it is "a milestone".  A milestone is associated with a project, but a project and a milstone are not the same thing.  A milestone for example does not have the attribute "Award Date".

 That becomes a bit clearer when you start adding additional facts about a milestone, such as a description, who planned it, etc:

Milestone1Date
Milestone1Desc
Milestone1PlannedBy
Milestone2Date
Milestone2Desc
Milestone2PlannedBy
Milestone3Date
Milestone3Desc
Milestone3PlannedBy

  Now it's obvious that these don't belong in the project table and need to break them out.  Table structure would look like this:

tblProject
ProjectID - Autonumber - PK
ProjectDesc - Text
BidDueDate - D/T
AwardedDate - D/T
BeginDate - D/T
EndDate - D/T

tblProjectMilestones
MilestoneID - Autonumber - PK
ProjectID - Long - FK to tblProjects
MilestoneDate - D/T
MilestoneDesc - Text
PlannedBy - Long - FK to tblEmployees

HTH,
Jim.
What I meant by that is that if, at some point, you want to show a list of the projects and the dates associated with that project, with all the dates in a single row, you could create a crosstab query that would use the ProjectID as the RowHeader, the DateType as the column header, and the actual Date as the value.  For example, if my boss wanted an Excel spreadsheet with all the active projects, I'd create this crosstab query and join it to my projects table and then filter it.

It crosstab portion of that query might look like:

TRANSFORM First(tbl_ProjectDates.[Date]) AS FirstDate
SELECT tbl_ProjectDates.ProjectID
FROM tbl_ProjectDates
GROUP BY tbl_ProjectDates.ProjectID
PIVOT tbl_ProjectDates.DateType In ("BidDueDate","AwardDate","BeginDate", "EndDate");


If I were going to do this, I would probably use the ColumnHeadings property of the crosstab query to specify the order of the columns, otherwise, they would end up being in alphabetical order.  If you do this, they will show up in the query in the PIVOT ... IN ... clause of the query.  But you have to make sure you have all of the possible DateTypes in that clause because any DateType that is not in that list will not show up in your output.

BTW, I would not normally use the reserved word "Date" as a field name.
Avatar of pdvsa

ASKER

Jim: that helps but  little overload for my non expert brain capacity.  I think we are saying similar ideas:  better to break out in a separate table if you have many different dates.  One thing that you mentioned that I did not point out is to break out another table for DateTypes.  

I think this second table is describing Normalization?
CK is Composite Key?
<<I think we are saying similar ideas:  better to break out in a separate table if you have many different dates.>>

  No, you do not break out into a separate table.  You only break out into a separate table when you have facts about something else.  Your current table is about "a project".  If the dates pertain to "a project" then they belong in that table no matter how many there are.

'CK' is short for candidate key.  Everyone today uses meaningless keys (an auto number) for performance and that's the way I wrote it.  But if you were just modeling the data and using a natural key (derived from the data) it would look like this:

tblDateTypes
DateType - Text - PK

tblProjectDates
ProjectID - Long - FK, PKa
DateType - Text - FK, PKb
DateValue - D/T

Jim.
@pdvsa,

Actually, I think what Jim (I'm sure he will correct me if I'm wrong) was recommending is a combination of the two methods

Add fields for those things that are fixed (bid due date, award date, begin date, end date) to the main table.  Then add the milestones table for those other date types that you may or may not need for a particular project.
Avatar of Montoya
Montoya

as my colleagues have mentioned, and as I said above:

if you're describing your project, yes, all those dates should stay in the original table.
If you're going to 'break-out' additional information about those dates, then perhaps the design suggested by Jim is the way to go.
Avatar of pdvsa

ASKER

Jim:
< No, you do not break out into a separate table.  
bear with me for  sec... since you have a tblProjectDates as described isnt that breaking out the dates into their own table and eliminating the need for numerous separate fields in tblProjects?
<<bear with me for  sec... since you have a tblProjectDates as described isnt that breaking out the dates into their own table and eliminating the need for numerous separate fields in tblProjects?>>

  No problem...what I was showing you is an example of what you should not do. Numerous fields is not a problem.  If those fields describe something about what the table represents, then that's where they belong.

  Let's step back a bit.  When you form a table, all the non-key columns should be about whatever the table describes.  In this case, it's a project.

 Now because it's "a project", that means it has certain attributes or facts that we know about it.  One of those facts might be "Received Date", which is the date we received it in our office.

 With Received Date being part of tblProjects, I can look at any row of the table and Received Date does apply.  I may not know the value for it at the moment (it's NULL), but a project *always* has the attribute "Received Date" and it contains either a value or a null.

 Now if you look at the layout I did when breaking out the dates, a given project might not end up with a received date field (if I did not enter a record for it) even though the very definition of "a project" is that it always has one.

 So if we used a design like that, it would be possible for each and every row of tblProjects to have a different number of fields in regards to dates (just because I may or may not enter a record in tblProjectDates for a given Project/DateType combination).  This is not allowed in relational theory.  Every row in a table must have the same "shape" or number of fields.

 To show that better, I could take that to the extreme and extend it further so that it's just not used for dates:

tblProjects
ProjectID - Autonumber - PK

Data:
1
2
3

tblFieldTypes
FieldName - Text - PK
FieldDataType - Text - Date, Float, text, etc.

Data:
Description       Text
ReceivedDate   D/T
EstCost            Currency

tblProjectFieldData
ProjectID - Long - FK to tblProjects - PKa
FieldName - Text - PKb
FieldValue - Text

Data:

1     Description      Project1
1     ReceivedDate  01/02/12
1     EstCost            1234.56

(for project 2, I enter nothing)

3     EstCost            oops


 So you can see the problems; we have three projects in our projects table.  One has three fields of data (project 1), one nothing (project 2), and the last only has one (project 3).

 Also notice that I have a data typing problem - I'm stuffing a date/time value (for received date) into a text field.  I'm also stuffing a currency into a text field.  And because I'm doing that, I've lost the strict data type checking that the DBMS (Database Management System) provides.   Note that someone entered "oops" for project 3 for the EstCost when it should have been a numeric value.

Now say we add table tblEmployees:

tblEmployees
EmpID - Autonumber - PK

Data:

1

 is there anything stopping me from adding a field value for EstCost?

tblEmployeeFieldData
EmpID - Long - FK to tblEmployees - PKa
FieldName - Text - PKb
FieldValue - Text

Data:

1     EstCost            1234.56

  Nope.  So now I have a record in tblEmployees with a field called "EstCost" and a value and I have an attribute that doesn't have anything to do with being an "employee".

  This is why ENV (Entity Name/Value) designs are frowned upon.  They allow too many inconsistencies to occur.

 The only time you should be breaking out fields from a table is if they do not pertain to what the table describes.  The big tip-off for that is that the fields need to be repeated:

Milestone1Date
Milestone2Date
Milestone3Date


The other tip off is what if I can't enter the data?  Say I have a project with four Milestones.  Hum, I'd have to add a new field.  Your design of the table should not have to change in order to accomidate more data of something already recorded.

 A slightly better example of that; have some objects, say pens and a car.  I create a table like this:

tblObjects
ObjectID - Autonumber - PK
Description - Text
Make - Text
Model - Text
Year - long
WheelSize -
Ink Color

 Now try to enter a car in the above table.  Can a "car" have an ink color?   What if you tried entering a pen?  You could enter everything except for wheel size.  

 It's not simply a matter of not knowing the value for those fields, it's that I cannot ever have a value because it doesn't apply.

 So even though I have a rigid table definition (every row physically will have seven fields), when I look at the meaning of the data itself, which is what relational theroy is concerned with, I have two different rows that have a different "shape" (number of fields) then what I've defined for "an object".

 I hope those examples help!

 Sorry for being long winded, but it's difficult to explain the concepts in words without a lot of detail.

Jim.
Avatar of pdvsa

ASKER

Jim:  appreciate you hangin in for me.  I am far from expert like you.  

One other question and that is all I have.  

I have not really focused on the fact that the Date Field Names are not really hardcoded names...they are dynamic (date field names change for each project).  

This is a reason why I think that since the date field names are ever changing then it might be best for me to have a separate table with the NOT IN event on the fldDateName field that adds that name the user typed in to the table.  

Do you think that if the Date Field names are changing then it is best not to hard code field names in tblProjects?  I would have to add a field for each Date field name and would end up with a few fields.  

I hope I am making sense.  You might have already answered this...sorry if so.
I'd guess I'd have to ask why are they changing.  That's pretty unusual.  Give me an example.

Jim.
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of pdvsa

ASKER

Nick:  i had thought of "over normailization" at one time.  I thought i was silly but now i feel better about that.

Jim:  i will make it simple in that the company i work for loves dates and i dont think i have encountered all of the different date names they have for the many projects we have.  I now know if at least 15 different ones and i know there are many more.  If i knew ALL of them then i could say the names would be fixed but i am not at that pt now.

Thank you
Though it is very long, the discussion of null and normalization may be worth your while to read through, for the examples of what 'over-normalization' can look like and the logic that can lead to that.
https://www.experts-exchange.com/questions/27306297/Avoiding-Nulls.html

<i dont think i have encountered all of the different date names they have for the many projects we have.  I now know if at least 15 different ones and i know there are many more.>  

You are definitely looking at one table, and probably two.
A table to hold the dates and an FK back to the main table, and probably an FK to another table that holds DateDescription ID's and text--otherwise you'll wind up re-writing the text that describes what the date is for, over and over again.
I think if we dug into it, the need for varying dates would go away, which is why I asked for examples.

 When I've encountered these situations in the past, it's always a result of a process not being formalized (quantified and measured), so it gives the appearence of being variable when it's really not.

 An example of that:

Project A is a non-state job, so no inspections are required.  Project B however is, so we'll have an assigned inspector and each phase of the project will be inspected at various points.

 Some might look at "Assigned inspector" as being variable from one project to the next because in one it's required and one it is not.  But it's really not.

 Typically the need to have a varying data structure is just saying that you have not modeled the data completly.

Jim.
>> Typically the need to have a varying data structure is just saying that you have not modeled the data completely.

@Jim

Or you work for an organization that wants to automate a process but which cannot settle on what they want to call things.  I've been in my current position for four years, and we've gone through three different naming conventions for what they want to call a particular "Item" in our database.  I finally got to the point where I had to create an ItemCategories table which contains ItemCatID, ItemCat_FY, ItemCat_Name.  Then I added the ItemCatID next to the "Item" field, so I know what to call that Item.  It's ridiculous, but it works.
Yes, but when I'm talking about varying data structures, I not talking about the fact that a value can change, but that a fact can apply to one instace of something and not to another.

In your last example, every row in your item table has the field ItemCatID does it not?  Maybe not a value (it may be NULL), but the field is there for every row.   It has to be because you've defined it as being part of what a "item" is.

If you went to the ENV design I did earlier, you could find yourself in a situation where that is not true.

Jim.
Avatar of pdvsa

ASKER

OK thanks guys... I suppose a split of the points with possibly a little more going to Jim?  Not sure what is fair.  

If no response then I guess no one will object.  thank you... gotta get back to my real job damnit!
split down the line; everyone contributed.

Jim.
Avatar of pdvsa

ASKER

thank you for the expert help...