Solved

Wide fact table vs. multiple fact tables

Posted on 2008-06-24
Last Modified: 2013-11-16
I am currently developing a data warehouse for a telecom company.
We collect several different measures (100+) from the network elements on an hourly basis.
I'm wondering what the optimal data warehouse design would be:
1) Since all measures bear the same granularity (1 hour), and originate from similar network elements, it would make sense to put them all in the same fact table. I would therefore end up with a fact table composed of about 5 dimensions and 100+ fact fields. Does a 100+ field table sound like an acceptable design?
2) I could create several fact tables, each with about 5 dimensions and only a subset of the 100+ facts, grouped in the way end users are most likely to query them. I would therefore have several fact tables with at most 25 fact fields each (which seems more reasonable?), but I would lose some flexibility, in the sense that users could no longer easily query the data in ways that were unplanned or unexpected.
I would greatly appreciate any advice or comment or field experience with the 2 options proposed (or a 3rd alternative...).
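To make the two options concrete, here is a minimal sketch using Python's built-in sqlite3 module. All table, key, and measure names are hypothetical placeholders (the thread names none of them), and only two dimension keys are used instead of about five, to keep it short. It only shows how the column counts compare; it says nothing about performance.

```python
import sqlite3

# Hypothetical names: the thread does not name any dimensions or measures.
DIM_KEYS = ["element_id", "hour_ts"]             # stand-ins for the ~5 dimension keys
MEASURES = [f"m{i:03d}" for i in range(1, 101)]  # stand-ins for the 100+ measures


def create_fact_table(conn, name, measures):
    """Create a fact table with the shared dimension keys plus the given measures."""
    cols = [f"{k} INTEGER NOT NULL" for k in DIM_KEYS]
    cols += [f"{m} REAL" for m in measures]
    pk = ", ".join(DIM_KEYS)
    conn.execute(f"CREATE TABLE {name} ({', '.join(cols)}, PRIMARY KEY ({pk}))")


def column_count(conn, table):
    """Number of columns SQLite reports for the table."""
    return len(conn.execute(f"PRAGMA table_info({table})").fetchall())


conn = sqlite3.connect(":memory:")

# Option 1: one wide fact table holding every measure.
create_fact_table(conn, "fact_wide", MEASURES)

# Option 2: several narrower fact tables with 25 measures each.
groups = [MEASURES[i:i + 25] for i in range(0, len(MEASURES), 25)]
for n, group in enumerate(groups, start=1):
    create_fact_table(conn, f"fact_group_{n}", group)

wide_cols = column_count(conn, "fact_wide")
group_cols = [column_count(conn, f"fact_group_{n}") for n in range(1, len(groups) + 1)]
print(wide_cols, group_cols)
```

Note that in option 2 every narrow table repeats the same dimension keys, which is what makes it possible to join the tables back together later.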
Question by:optima-sc
4 Comments
 
LVL 19

Expert Comment

by:frankytee
ID: 21862022
Would any user need to regularly access all 100 fields in a single query?
Fact tables should only contain related measures, so I think option 2) is the way to go: group your measures by category into several fact tables.
Just ensure that your fact tables can be joined together using a common id field or a "bridge"/mapping table.
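As a rough illustration of joining grouped fact tables on common key fields, the sketch below again uses Python's sqlite3 module. The tables fact_traffic and fact_quality, their columns, and the sample rows are all made up for the example; the thread does not describe the actual measures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two hypothetical fact tables at the same hourly grain, sharing the same keys.
conn.executescript("""
CREATE TABLE fact_traffic (element_id INTEGER, hour_ts TEXT, calls_attempted REAL,
                           PRIMARY KEY (element_id, hour_ts));
CREATE TABLE fact_quality (element_id INTEGER, hour_ts TEXT, calls_dropped REAL,
                           PRIMARY KEY (element_id, hour_ts));
INSERT INTO fact_traffic VALUES (1, '2008-06-24 10:00', 200.0),
                                (1, '2008-06-24 11:00', 100.0);
INSERT INTO fact_quality VALUES (1, '2008-06-24 10:00', 4.0),
                                (1, '2008-06-24 11:00', 5.0);
""")

# A query that needs measures from both tables joins them on the shared keys.
rows = conn.execute("""
SELECT t.element_id, t.hour_ts, q.calls_dropped / t.calls_attempted AS drop_ratio
FROM fact_traffic AS t
JOIN fact_quality AS q
  ON q.element_id = t.element_id AND q.hour_ts = t.hour_ts
ORDER BY t.hour_ts
""").fetchall()

for row in rows:
    print(row)
```

Because both tables have one row per element per hour, the join matches rows one-to-one and does not inflate the measures.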
 

Author Comment

by:optima-sc
ID: 21863022
Well, I would not say that any user would need to 'regularly' access any of the 100 fields.
However, it is quite possible that, at certain points in time, they will need this kind of flexibility to troubleshoot a particular network element or do root cause analysis.
Grouping the facts into likely categories sounds good. However, isn't that potentially going to lead to very slow (and heavy) cross-table queries whenever the end user needs more flexibility or an unexpected analysis?
 
LVL 19

Accepted Solution

by:frankytee (earned 1500 total points)
ID: 21863885
If you index your tables appropriately, you should get reasonable/good performance. The only time you need to join the tables is when the user needs measures from more than one fact table, which should be "infrequent" if the measures are not related. Most data warehouses that I have seen have multiple fact tables, not one huge one with 100 measures.
But don't take my opinion as gospel: create and test the 2 different models to see which one is better for you.
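One possible way to test the two models side by side is sketched below with Python's sqlite3 module. Everything here is an assumption made for illustration: the table and column names, the synthetic data, the hour-range filter, and the choice of indexes. Timings from an in-memory SQLite database will not predict how a real warehouse engine behaves, so treat this as the shape of a test, not a benchmark.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fact_wide (element_id INTEGER, hour_ts INTEGER, m_a REAL, m_b REAL,
                        PRIMARY KEY (element_id, hour_ts));
CREATE TABLE fact_a (element_id INTEGER, hour_ts INTEGER, m_a REAL,
                     PRIMARY KEY (element_id, hour_ts));
CREATE TABLE fact_b (element_id INTEGER, hour_ts INTEGER, m_b REAL,
                     PRIMARY KEY (element_id, hour_ts));
-- Secondary indexes for hour-range filters (an assumed access pattern).
CREATE INDEX ix_wide_hour ON fact_wide (hour_ts);
CREATE INDEX ix_a_hour ON fact_a (hour_ts);
CREATE INDEX ix_b_hour ON fact_b (hour_ts);
""")

# Load identical synthetic data into both models.
rows = [(e, h, float(e + h), float(e * h)) for e in range(50) for h in range(200)]
conn.executemany("INSERT INTO fact_wide VALUES (?, ?, ?, ?)", rows)
conn.executemany("INSERT INTO fact_a VALUES (?, ?, ?)", [(e, h, a) for e, h, a, _ in rows])
conn.executemany("INSERT INTO fact_b VALUES (?, ?, ?)", [(e, h, b) for e, h, _, b in rows])

# The same logical question asked of each model.
WIDE_SQL = """
SELECT element_id, SUM(m_a), SUM(m_b) FROM fact_wide
WHERE hour_ts BETWEEN 10 AND 20 GROUP BY element_id ORDER BY element_id
"""
SPLIT_SQL = """
SELECT a.element_id, SUM(a.m_a), SUM(b.m_b)
FROM fact_a AS a JOIN fact_b AS b
  ON b.element_id = a.element_id AND b.hour_ts = a.hour_ts
WHERE a.hour_ts BETWEEN 10 AND 20 GROUP BY a.element_id ORDER BY a.element_id
"""


def timed(sql):
    """Run a query and return its rows together with the elapsed seconds."""
    start = time.perf_counter()
    result = conn.execute(sql).fetchall()
    return result, time.perf_counter() - start


wide_result, wide_secs = timed(WIDE_SQL)
split_result, split_secs = timed(SPLIT_SQL)
print(f"wide: {wide_secs:.6f}s  split: {split_secs:.6f}s  "
      f"same rows: {wide_result == split_result}")
```

Checking that both models return the same rows before comparing timings helps make sure the comparison is like for like.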
 

Author Comment

by:optima-sc
ID: 21868082
Well, after re-discussing this topic thoroughly with my client, it appears that the 'unexpected' queries might not be so infrequent. Since the wide fact table also has the advantage of keeping the applications simpler no matter what the end user chooses to query, we decided to go for that solution.
Thank you for taking the time to give me your advice, anyway. I'll award you the points.
