elucero
asked on
date of death fact tbl or dimension?
I need to report on date of death. I think it goes into a fact table since it's something that has happened. But what if the date of death is null? How would I populate it. I don't want a null in fact tbl. I really don't want to use a dummy date like 9999-12-31. Any suggestions?
ASKER
Yes, ok NULL does make sense, but won't I have problem with NULL value in a cube?
It would depend almost entirely on your database design.
If it's a traditional OLTP, date_of_death would probably just be a column it the table that describes people.
In a data warehouse, the structure of the fact table(s) usually resembles the OLTP's transaction table(s). Depending on the nature of the business the transactions may be split over multiple fact tables. An insurance company would probably have 2 fact tables in their D/W. One for policies/premiums and another for claims. The two tables would have almost nothing in common except the policy ID so separate tables has advantages.
Does your D/W record multiple events for people? Birth, graduation, marriage, etc.? If so, a death record in the fact table is appropriate.
If it's a traditional OLTP, date_of_death would probably just be a column it the table that describes people.
In a data warehouse, the structure of the fact table(s) usually resembles the OLTP's transaction table(s). Depending on the nature of the business the transactions may be split over multiple fact tables. An insurance company would probably have 2 fact tables in their D/W. One for policies/premiums and another for claims. The two tables would have almost nothing in common except the policy ID so separate tables has advantages.
Does your D/W record multiple events for people? Birth, graduation, marriage, etc.? If so, a death record in the fact table is appropriate.
ASKER
It's healthcare industry. If death occurred based on specific disease. I think it goes into fact tbl??
the date of death is a fact
NULL (or no row) is a fact too (i.e. not dead, or no death recorded)
death as a dimension would not be very helpful as it is a boolean condition (is dead/is not dead)
a time dimension (I assume you have one) would be relevant to the date of death
(e.g. to answer "how many deaths in fiscal year 2014")
NULL (or no row) is a fact too (i.e. not dead, or no death recorded)
death as a dimension would not be very helpful as it is a boolean condition (is dead/is not dead)
a time dimension (I assume you have one) would be relevant to the date of death
(e.g. to answer "how many deaths in fiscal year 2014")
I can envision a lot of scenarios and the possibility of several fact tables.
How may fact tables are there? How may of them record people specific events?
How may fact tables are there? How may of them record people specific events?
I can envision a lot of scenarios and the possibility of several fact tables.agreed!
ASKER
Thanks for the help. I have one fact so far with case number, bunch of dates including dt of death which will mapped up to datedim. I'm worried about putting the dt of death in fact because of how analysis services handles NULLS in a cube. I don't want it to default to a dummy value like '9999-12-31'.
If the patient survived, don't put a row into the fact table that marks the date of death. That would seem to solve the problem.
ASKER
I would have put the row in the fact if patient survived since other facts in fact table are caseID ,diaganosisdt, submitteddt, deathdt. So I still have the null issue. Weather they died or survived I would still have values for the rest - caseID ,diaganosisdt, submitteddt.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Thanks, so you're suggesting a death dim which would be like this
caseID
DateofDeath
Then I would do an outer join from factCases to deathdim for my schema fact.caseid = deathdim.caseid. I think this will work than I can avoid the null values. Why would add the null values in the deathdim? They only need to report deaths not survived.
caseID
DateofDeath
Then I would do an outer join from factCases to deathdim for my schema fact.caseid = deathdim.caseid. I think this will work than I can avoid the null values. Why would add the null values in the deathdim? They only need to report deaths not survived.
ASKER
I'm thinking now that this would be a good junk dimension since I have other yes/no attributes like case closed, died, hospitalized? So it would be like this
caseid
dateofdeath
death y/n
datehospitalized
hospitalized y/n
dateclosed
caseclosed y/n
caseid
dateofdeath
death y/n
datehospitalized
hospitalized y/n
dateclosed
caseclosed y/n
If there's exactly on row in the fact table with any given caseid value, this can work. If a case produce multiple rows any counts (or other aggregate function results) that are based on the dimension will be wrong.
ASKER
Yes, I think will work. I'm going to go w/junk dimension. Thanks for your help, Elizabeth
>I think it goes into a fact table since it's something that has happened.
Not necessarily, as fact tables are measurements, of which date of death would be one.
Unless you want DimDeath, but then your Fact would still join on that table...