Go Premium for a chance to win a PS4. Enter to Win

x
Solved

# Dimensional Model: How Can I Improve a Healthcare Call Centre Star Schema?

Posted on 2014-04-11
Medium Priority
1,012 Views
Last Modified: 2016-02-18

Hello experts!
The image shows part of a Dimensional Model design a colleague and I are working on. We are building a data warehouse and BI application using SQL Server and SSAS Tabular Model respectively.

## Intro

My colleague settled on the above design because:
CCG Population and GP Population contain numeric data that will be used to calculate “Number of calls per 1000 population”
CCG & GP Population tables contains numeric data used in a measures and should be considered as fact tables.
This method avoids a monster dimension

I feel that there is perhaps a better approach because:
Fact Call Log is a business process, the other two fact tables are not.
GP Population and CCG Population should become dimensional attributes within DIM GP Practice.
In terms of super-simple user interface, I wish to have a single dimension holding this data.

## The Challenge

DIM GP Practice is a dimension with many rows.

GP Practice Name, GP Population and CCG population all change independently of each other at different times and frequencies.

Consequently, we need to track 3 changing variables in a single dimension – which may lead to an explosion in the number of rows in DIM GP Practice.

We also need to keep track of history, so the measure uses population figures in 2012 when viewing 2012 data and population figures for 2014 when viewing 2014 data and so forth.

Any thoughts on how best to refine this dimensional model?

Priorities are very easy user interface i.e. end-users have the smallest number of dimensions that they slice and dice by and ease of development to ensure rapid updates and changes going forward with minimal messing around with DAX statements.

Looking forward to your feedback!

Cheers!
0
Question by:JohnAeris
• 6
• 4
10 Comments

Author Comment

ID: 39993676

## Definitions

Tables
Fact Call Log: Every call for a medical centre referral is a transaction.
DIM GP Practice:  a Slowly Changing Dimension with > 10,000 rows of Medical Centre data. Updated quarterly.
FACT CCG Population: Population is updated yearly. Sourced from another business unit.
FACT GP Population: Population is updated quarterly. Sourced from another business unit.

Terms
A CCG consists of many GPs in a specific geographic area. So GP > CCG is a natural hierarchy.
CCG population:  total population in the CCG area.
GP Population: number of registered patients a GP serves.
0

LVL 46

Expert Comment

ID: 39993979
Hi John,

Some things to start.  :)

A)  A data warehouse is an algebraic expression that is designed to answer a specific question.  It may answer numerous questions, but it's core is a single answer.
2)  Most data warehouses include a time dimension and utilize count() or sum() over time.
D)  A star schema is typically a fact table with multiple dimensions around it.  There can be multiple fact tables and the fact tables can "share" some/all of the dimension tables.  I don't see that shown in your model.

With that as backdrop, it does look like you're trying to design the database around answering some key questions.  :)

Your colleague's approach may well serve your stated needs, but it's not a true star schema.  GP Population and CCG Population are dimensions (as you suggest) and describe attributes of the calls.

It is the call data that should be the fact table.  When a new call comes in, it can be populated into the fact table as long as the dimension tables contain the items that describe the call.  (The time dimension will have to be pre-populated, any "location" dimension will have to have the city, state, zip, for the caller location, etc.)

“Number of calls per 1000 population” is then computed by joining one (or both) of the population dimensions and the time dimension to the fact table (calls).

I believe that your approach is superior to your colleague's, and is much closer to being a true star schema.

Good Luck,
Kent
0

Author Comment

ID: 39994029
Hi Kent!

Thanks for your feedback.

The current model includes 30 Dimensions (including Time of day and Date) around a central fact table called CALL LOG. I removed this to make the image clearer.

“multiple fact tables and the fact tables can "share" some/all of the dimension tables”.
Agreed.

My colleague appears to be using the Fact CCG Population and Fact GP Population fact tables as “helper/lookup” tables in order to create the measures. As stated, it works, but I feel this is not the best approach.

“It is the call data that should be the fact table” – perhaps I didn’t make this clear, the Call Log Fact Table holds all the call data.

“I believe that your approach is superior to your colleague's, and is much closer to being a true star schema.” Thank you, not quite there yet though. :)

I still require assistance to create an alternative model that we can agree upon.

The key challenge:
CCG Population and GP population are updated at different times and quite frequently.
A CCG is made up of many GPs and this creates a natural hierarchy and so my initial thought is to incorporate all of GP, CCG and all populations in a single dimension.
However, if I do this, then we will end up with a dimension that changes 3 variables every month or so so it will grow exponentially.

Will it make more sense to separate that GP and CCG populations into a separate dimension even though they relate directly to GP and CCGs in the GP Practice dimension?

Any thoughts?
0

LVL 46

Expert Comment

ID: 39994061
Can you describe the CCG Population and GP population tables a bit more?  What data is contained in them, what is their relationship to other tables?  What is the link between them?  Etc...
0

Author Comment

ID: 39994350

## GP Population Output

Update Frequency: Monthly

NB > 10,000 rows of data, for about 500 GPs.

## CCG Population output

Update Frequency: Yearly

## DIM GP Practice output

Update Frequency: Weekly

GP_Practice Code joins to GP_Practice Code in FACT GP Population.
GP_Practice_CCG_Code joins to CCG_Code in FACT CCG Population.

## More Info

At present, FACT GP Population and FACT CCG Population are not displayed to users and are merely used as “helper / lookup tables” for population calculations, however, it breaks the star schema construct.
Also, Business Users would like to use CCG population as a report header / label. As it is not a dimension, this is not possible.
At present, we are using SSAS Tabular DAX to do the calculations. So it works, but I am looking for a better way to complete the dimensional modelling.
0

LVL 46

Assisted Solution

Kent Olsen earned 1500 total points
ID: 39994504
It looks like the GP Population and CCG Population tables are summary tables of other data.  Does the data derive from the CALL fact table or some place else?

--    At present, FACT GP Population and FACT CCG Population are not displayed to users and are merely used as “helper / lookup tables” for population calculations, however, it breaks the star schema construct.

From what you've described, this may be fine.  From my vantage point it looks like you may have multiple stars with a common time dimension.  (I don't know how anything else might relate.)  But I'm still struggling with calling the * Population tables fact tables.  They look like summaries of fact tables, which could make them fact tables or dimension tables, depending on their usage.

--    Also, Business Users would like to use CCG population as a report header / label. As it is not a dimension, this is not possible.

What has to happen to treat the CCG Population table as a dimension?  Does it join to the CALL fact table on anything other than time?
0

Author Comment

ID: 39994677
Hi Kent,

It looks like the GP Population and CCG Population tables are summary tables of other data.  Does the data derive from the CALL fact table or some place else?

Good call. GP Population and CCG Population are populated by the Central Data Warehouse Teams for other business units. We also require the same data, so we are using these tables.

What has to happen to treat the CCG Population table as a dimension?  Does it join to the CALL fact table on anything other than time?

FACT CCG Population joins to DIM GP Practice table via GP_Practice_CCG_Code.
0

Author Comment

ID: 39994678
Perhaps, I should look at an Outrigger or mini-dimension approach? Hmmm...
0

LVL 46

Accepted Solution

Kent Olsen earned 1500 total points
ID: 39994799
There is probably a 1-to-many or even a many-to-many relationship between FACT CCG Population and DIM GP Practice.

Several options.  The big 3 would seem to be:

- You could build the FACT CCG Population data into the DIM GP Practice table, but that could explode the number of rows in the dimension.
- Could you create a DIM CCG Population table and link it to the FACT Calls table?
- You could leave it as is and treat it as an auxiliary table.
0

Author Closing Comment

ID: 40077937
Decided to stick with the original layout. This looks like a many-to-many issue, so, we're sticking with a TYPE 4 mini-dimension until we can figure out a more effect means.

THANKS!!!
0

## Featured Post

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Windocks is an independent port of Docker's open source to Windows.   This article introduces the use of SQL Server in containers, with integrated support of SQL Server database cloning.
In this blog post, we’ll look at how using thread_statistics can cause high memory usage.
Polish reports in Access so they look terrific. Take yourself to another level. Equations, Back Color, Alternate Back Color. Write easy VBA Code. Tighten space to use less pages. Launch report from a menu, considering criteria only when it is filled…
In this video, Percona Director of Solution Engineering Jon Tobin discusses the function and features of Percona Server for MongoDB. How Percona can help Percona can help you determine if Percona Server for MongoDB is the right solution for …
###### Suggested Courses
Course of the Month8 days, 23 hours left to enroll

#### 877 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.