Solved

Data warehouse design, fact to dimention row count ratio

Posted on 2007-11-16
7
732 Views
Last Modified: 2013-11-16
We are working on a data warehouse project for one of our customers. We are now towards the end of the project and have real data going into the warehouse.  The database has 13 dimensions and a fact table with 62 attributes. Currently the fact table has 7 million records in it and most of the dimensions are less than 10k rows in size. However, 2 of the dimensions have 400k rows which is about 5.5% of the rows in the fact table i have been monitoring both tables for a month now and conclude that they are always around 5 or 6% of the fact table.

Is this an acceptable growth for a dimension table? I am worried that the growth of these tables will become an issue later down the line as the data in the system now is only a snippet whats to be loaded before production. The only solution would be to merge the 2 big dimensions into the fact table, but this would mean having some non numeric attributes in the call fact table.

Please advise.
0
Comment
Question by:andy_routledge
  • 2
  • 2
7 Comments
 
LVL 18

Accepted Solution

by:
PFrog earned 113 total points
Comment Utility
I would say this are too few records to warrent inclusion into the fact table. About 5% for a slowly changing dimension doesn't seem too unreasonable to me.

The most important factor is to make sure it correctly maps your business structure. If you can explain the content of these dimensions and what is changing in them I could advise in more detail.
0
 
LVL 4

Author Comment

by:andy_routledge
Comment Utility
Well the system is for a telecoms company, the fact granularity is at call level e.g. 1 call 1 record. Each call has 4 telephone numebrs assosiated with it, dailer number called number and 2 which relate to routing information most of the time these are the same. One of the dimensions tracks these individual numbers and the info that goes with them national international etc. The other large dimention is a "junk" dimention it contains rarely used information but due to the nature of the buisness all call information has to be recorded for at least 4 years. The junk dimention bascilly stores some technical info for each call which is the same for alot of calls, trunk route, ciricute, bearer etc.
0
 
LVL 12

Assisted Solution

by:RWrigley
RWrigley earned 112 total points
Comment Utility
From a reporting standpoint, I'd be more more concerned about the ratio between the categories in the different levels in your dimensions, because you'll hit performance problems if you have to write reports that deal with large sets of data.  
0
 
LVL 4

Author Comment

by:andy_routledge
Comment Utility
I'm happy with the first answer seems thanks guys.
0
 
LVL 18

Expert Comment

by:PFrog
Comment Utility
It looks like your first large dimension is modelled correctly.
I'm not too sure about the junk dimension though. The way you described it, it sounds like it could/should be split up into a number of dimensions - is there a relationship beteeen trunk route, circuit bearer etc? If they are related then you've got it right already, if they're not related then they should be split up.
0

Featured Post

Why You Should Analyze Threat Actor TTPs

After years of analyzing threat actor behavior, it’s become clear that at any given time there are specific tactics, techniques, and procedures (TTPs) that are particularly prevalent. By analyzing and understanding these TTPs, you can dramatically enhance your security program.

Join & Write a Comment

Does the idea of dealing with bits scare or confuse you? Does it seem like a waste of time in an age where we all have terabytes of storage? If so, you're missing out on one of the core tools in every professional programmer's toolbox. Learn how to …
SQL Command Tool comes with APEX under SQL Workshop. It helps us to make changes on the database directly using a graphical user interface. This helps us writing any SQL/ PLSQL queries and execute it on the database and we can create any database ob…
Video by: Steve
Using examples as well as descriptions, step through each of the common simple join types, explaining differences in syntax, differences in expected outputs and showing how the queries run along with the actual outputs based upon a simple set of dem…
Polish reports in Access so they look terrific. Take yourself to another level. Equations, Back Color, Alternate Back Color. Write easy VBA Code. Tighten space to use less pages. Launch report from a menu, considering criteria only when it is filled…

743 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

15 Experts available now in Live!

Get 1:1 Help Now