Data Warehousing : star vs snowflake and denormalisation.
Posted on 2010-11-29
I've just started building our Data Warehouse schema.
One of the Facts I'm handling is number and value of sales (nothing too original there!).
I have multiple databases which are the same structure, but due to limitations in the original shrink wrapped app, each DB is a different currency/region.
I intend to bring all the data into 1 warehouse and handle base and foreign currency values. That's fairly easy to do.
My question relates to the structure of the schema for the following elements.
Database : This is nothing more than ID and the name of the SQL database which is really the country (UK, Holland, Ireland, Germany, India, Spain, etc.).
Customer : A customer is normally only in 1 database, but if they are in more than 1 database, they are not considered to be the same entity.
Customer Period : This is the customer's year/period/week. We hold these so that we can produce financial reports which match their operation.
I intend to use surrogate keys for all the dimension tables.
Based upon the above elements, should I use 1 dimension table with all the elements in? Or should I use multiple dimension tables.
I'm stuck trying to NOT normalize the data (Database, Customer, CustomerPeriod).
The fact table would link to a single Customer Period which would in turn link to the Customer and then to the Database.
We will have a LOT of entries for the Customer Period table.
The data warehouse is going to be used by SSAS/OLAP. Something I also know little about, but on page 5 of the Microsoft SQL Server 2005 Analysis Services book it says "... data warehousing is beyond the scope of this book ..." (more or less).
So, pointers, suggestions, etc. all gratefully received.