Solved

Data warehouse design, fact to dimention row count ratio

Posted on 2007-11-16
7
747 Views
Last Modified: 2013-11-16
We are working on a data warehouse project for one of our customers. We are now towards the end of the project and have real data going into the warehouse.  The database has 13 dimensions and a fact table with 62 attributes. Currently the fact table has 7 million records in it and most of the dimensions are less than 10k rows in size. However, 2 of the dimensions have 400k rows which is about 5.5% of the rows in the fact table i have been monitoring both tables for a month now and conclude that they are always around 5 or 6% of the fact table.

Is this an acceptable growth for a dimension table? I am worried that the growth of these tables will become an issue later down the line as the data in the system now is only a snippet whats to be loaded before production. The only solution would be to merge the 2 big dimensions into the fact table, but this would mean having some non numeric attributes in the call fact table.

Please advise.
0
Comment
Question by:andy_routledge
  • 2
  • 2
7 Comments
 
LVL 18

Accepted Solution

by:
PFrog earned 113 total points
ID: 20299496
I would say this are too few records to warrent inclusion into the fact table. About 5% for a slowly changing dimension doesn't seem too unreasonable to me.

The most important factor is to make sure it correctly maps your business structure. If you can explain the content of these dimensions and what is changing in them I could advise in more detail.
0
 
LVL 4

Author Comment

by:andy_routledge
ID: 20299679
Well the system is for a telecoms company, the fact granularity is at call level e.g. 1 call 1 record. Each call has 4 telephone numebrs assosiated with it, dailer number called number and 2 which relate to routing information most of the time these are the same. One of the dimensions tracks these individual numbers and the info that goes with them national international etc. The other large dimention is a "junk" dimention it contains rarely used information but due to the nature of the buisness all call information has to be recorded for at least 4 years. The junk dimention bascilly stores some technical info for each call which is the same for alot of calls, trunk route, ciricute, bearer etc.
0
 
LVL 12

Assisted Solution

by:RWrigley
RWrigley earned 112 total points
ID: 20303116
From a reporting standpoint, I'd be more more concerned about the ratio between the categories in the different levels in your dimensions, because you'll hit performance problems if you have to write reports that deal with large sets of data.  
0
 
LVL 4

Author Comment

by:andy_routledge
ID: 20311266
I'm happy with the first answer seems thanks guys.
0
 
LVL 18

Expert Comment

by:PFrog
ID: 20311272
It looks like your first large dimension is modelled correctly.
I'm not too sure about the junk dimension though. The way you described it, it sounds like it could/should be split up into a number of dimensions - is there a relationship beteeen trunk route, circuit bearer etc? If they are related then you've got it right already, if they're not related then they should be split up.
0

Featured Post

Netscaler Common Configuration How To guides

If you use NetScaler you will want to see these guides. The NetScaler How To Guides show administrators how to get NetScaler up and configured by providing instructions for common scenarios and some not so common ones.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Many companies are looking to get out of the datacenter business and to services like Microsoft Azure to provide Infrastructure as a Service (IaaS) solutions for legacy client server workloads, rather than continuing to make capital investments in h…
These days, all we hear about hacktivists took down so and so websites and retrieved thousands of user’s data. One of the techniques to get unauthorized access to database is by performing SQL injection. This article is quite lengthy which gives bas…
Video by: Steve
Using examples as well as descriptions, step through each of the common simple join types, explaining differences in syntax, differences in expected outputs and showing how the queries run along with the actual outputs based upon a simple set of dem…
Polish reports in Access so they look terrific. Take yourself to another level. Equations, Back Color, Alternate Back Color. Write easy VBA Code. Tighten space to use less pages. Launch report from a menu, considering criteria only when it is filled…

867 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

23 Experts available now in Live!

Get 1:1 Help Now