Solved

Star schema -- modeling a company hierarchy

Posted on 2013-12-17
Last Modified: 2013-12-18
I am pretty rusty on star schemas and OLAP.  I haven't worked on something like this in about 5 years, so please excuse a newbie question.

My fact table will come from income statements from a company with a hierarchy of companies.  The piece giving me trouble is modeling that company hierarchy.

Parent Company has X Asset Groups:
Alpha Assets
Bravo Assets
Charlie Assets
Each Asset Group has 2 subsidiaries:
Alpha East
Alpha West
Bravo North
Bravo South
Charlie Legal
Charlie Financial

Naturally, my analysis is required to be able to roll the figures up at any level.

How should I model the company dimension?

The approach I've sketched is:

dim_Company

CompanyID
Name
TypeOfCompany (e.g. Asset Group, Subsidiary, Parent)
ParentID (references CompanyID)

I have the feeling that's very wrong ... but can someone steer me straight?

Thanks!
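
For reference, a self-referencing dim_Company like the one sketched in the question can be rolled up at any level with a recursive query. This is only a minimal sketch, assuming SQLite through Python's sqlite3 module; the fact_Income table, the Amount column, the IDs, and the figures are made up for illustration and are not from the post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_Company (
    CompanyID     INTEGER PRIMARY KEY,
    Name          TEXT NOT NULL,
    TypeOfCompany TEXT NOT NULL,
    ParentID      INTEGER REFERENCES dim_Company(CompanyID)
);
INSERT INTO dim_Company VALUES
    (1,  'Parent Company',    'Parent',      NULL),
    (2,  'Alpha Assets',      'Asset Group', 1),
    (3,  'Bravo Assets',      'Asset Group', 1),
    (4,  'Charlie Assets',    'Asset Group', 1),
    (5,  'Alpha East',        'Subsidiary',  2),
    (6,  'Alpha West',        'Subsidiary',  2),
    (7,  'Bravo North',       'Subsidiary',  3),
    (8,  'Bravo South',       'Subsidiary',  3),
    (9,  'Charlie Legal',     'Subsidiary',  4),
    (10, 'Charlie Financial', 'Subsidiary',  4);

-- Hypothetical fact table: one illustrative amount per subsidiary.
CREATE TABLE fact_Income (
    CompanyID INTEGER REFERENCES dim_Company(CompanyID),
    Amount    INTEGER
);
INSERT INTO fact_Income VALUES
    (5, 10), (6, 20), (7, 30), (8, 40), (9, 50), (10, 60);
""")


def rollup(company_name):
    """Sum Amount for the named company and all of its descendants."""
    row = conn.execute(
        """
        WITH RECURSIVE subtree(CompanyID) AS (
            -- Start at the requested node of the hierarchy.
            SELECT CompanyID FROM dim_Company WHERE Name = ?
            UNION ALL
            -- Walk down to children, grandchildren, and so on.
            SELECT c.CompanyID
            FROM dim_Company c
            JOIN subtree s ON c.ParentID = s.CompanyID
        )
        SELECT COALESCE(SUM(f.Amount), 0)
        FROM fact_Income f
        JOIN subtree s ON f.CompanyID = s.CompanyID
        """,
        (company_name,),
    ).fetchone()
    return row[0]
```

With these made-up figures, rollup('Bravo Assets') gives 70 and rollup('Parent Company') gives 210. The recursion is what makes this layout awkward for OLAP tools, which is one reason flattened or separate dimensions tend to be preferred.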
Question by:Daniel Wilson
10 Comments
 
LVL 29

Assisted Solution

by:fibo
fibo earned 400 total points
ID: 39725966
Not sure I understand everything...
I would not place at the same level a company and its parent.
So my hierarchy would be:
(asset) group / subsidiary / data items

This allows querying at the corp / group / company levels.

Although the group is itself technically a company, for management / strategy purposes it is not at the same level.
 
LVL 46

Expert Comment

by:Kent Olsen
ID: 39726491
Hi Daniel,

A Star Schema usually models "activity over time".  Its key component is a time dimension that allows you to report "status" at any point in the data's history.

Based just on what you've posted, you should probably have 3 dimensions.  (And more as other requirements are revealed.)

Your fact table will be the asset information.  Monthly statements, daily changes, etc.  Whatever granularity works for you.  Then you'll have the time dimension, incremented at the granularity of the fact table.  The last two dimensions are Company and Asset.  The Company dimension contains all of the searchable Company items, and the Asset dimension contains all of the searchable asset items.  The Asset dimension may be as simple as asset type and asset name.

You may be tempted to include Asset Owner or Asset Holder in the Asset table, but that's really just Company information, already contained in another table.


Good Luck,
Kent
 
LVL 32

Author Comment

by:Daniel Wilson
ID: 39726617
OK, I probably don't need to list the parent company. I think that's what you're telling me, Fibo, and it makes sense.

Kent, you're certainly right that I have a time dimension, with granularity down to the month. My dimension table for that looks like
FullDate -- the only entries being the 1st of every month as that's what I'm loading in my fact table
Month
Quarter
Year
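
A month-grain time dimension like this one can be generated rather than typed in. The following is only a sketch in Python; the function name build_month_dimension and the starting month are assumptions for illustration, and the columns follow the FullDate / Month / Quarter / Year layout listed above.

```python
from datetime import date


def build_month_dimension(start_year, start_month, n_months):
    """Return one (FullDate, Month, Quarter, Year) row per month.

    FullDate is always the 1st of the month, as an ISO string.
    """
    rows = []
    year, month = start_year, start_month
    for _ in range(n_months):
        full_date = date(year, month, 1)
        quarter = (month - 1) // 3 + 1  # Jan-Mar -> 1, ..., Oct-Dec -> 4
        rows.append((full_date.isoformat(), month, quarter, year))
        # Advance one month, rolling over into the next year after December.
        month += 1
        if month > 12:
            month = 1
            year += 1
    return rows


dim_date = build_month_dimension(2013, 11, 3)
```

Here dim_date holds the rows for 2013-11-01, 2013-12-01, and 2014-01-01, which could be bulk-inserted into the dimension table.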

On the company structure, I think change will be sufficiently infrequent that I can get by without a slowly changing dimension.  I worked with those in the past and found them kind of tricky -- probably due to inexperience.

Does this structure look better for the company dimension table?

CompanyID
Name
AssetGroupName

Thanks!
 
LVL 46

Expert Comment

by:Kent Olsen
ID: 39726638
Hi Daniel,

I don't know enough about your model to say "yes" and instinctively want to say "no".  The granularity just seems wrong.

Can you elaborate on the relationship between the company, asset group, and asset items?
 
LVL 32

Author Comment

by:Daniel Wilson
ID: 39726863
I tried to explain the relationship between the companies in the original post.  Maybe if I fill in my new proposed table, what I mean will be clearer.

ID    Name                 AssetGroupName
1     Alpha East           Alpha Asset Group
2     Alpha West           Alpha Asset Group
3     Bravo North          Bravo Asset Group
4     Bravo South          Bravo Asset Group
5     Charlie Legal        Charlie Asset Group
6     Charlie Financial    Charlie Asset Group

This, I think, allows me to roll up all the Bravo Asset Group expenses, roll up all expenses for the entire company, or drill down to a single subsidiary like Bravo North.
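
The three roll-ups described here can be sketched against the flattened table as follows. This assumes SQLite through Python's sqlite3 module; the fact_Expense table, its Amount column, and the figures are hypothetical and only there to make the queries runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_Company (
    CompanyID      INTEGER PRIMARY KEY,
    Name           TEXT NOT NULL,
    AssetGroupName TEXT NOT NULL
);
INSERT INTO dim_Company VALUES
    (1, 'Alpha East',        'Alpha Asset Group'),
    (2, 'Alpha West',        'Alpha Asset Group'),
    (3, 'Bravo North',       'Bravo Asset Group'),
    (4, 'Bravo South',       'Bravo Asset Group'),
    (5, 'Charlie Legal',     'Charlie Asset Group'),
    (6, 'Charlie Financial', 'Charlie Asset Group');

-- Hypothetical fact table with one illustrative expense per subsidiary.
CREATE TABLE fact_Expense (
    CompanyID INTEGER REFERENCES dim_Company(CompanyID),
    Amount    INTEGER
);
INSERT INTO fact_Expense VALUES
    (1, 10), (2, 20), (3, 30), (4, 40), (5, 50), (6, 60);
""")

# Roll up per asset group.
by_group = dict(conn.execute("""
    SELECT d.AssetGroupName, SUM(f.Amount)
    FROM fact_Expense f
    JOIN dim_Company d ON f.CompanyID = d.CompanyID
    GROUP BY d.AssetGroupName
"""))

# Roll up the entire company.
company_total = conn.execute(
    "SELECT SUM(Amount) FROM fact_Expense"
).fetchone()[0]

# Drill down to a single subsidiary.
bravo_north = conn.execute("""
    SELECT SUM(f.Amount)
    FROM fact_Expense f
    JOIN dim_Company d ON f.CompanyID = d.CompanyID
    WHERE d.Name = ?
""", ("Bravo North",)).fetchone()[0]
```

This works as long as each subsidiary belongs to exactly one asset group, which is the assumption built into the flattened table.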

Am I still missing it?
 
LVL 46

Expert Comment

by:Kent Olsen
ID: 39726929
Ok, thanks.  That's pretty much what I was expecting to see.


The finest granularity in the example is the asset group.  Each one is controlled by 1 or more companies.  In a star schema, that's two dimensions.  In a pseudo-star (please forgive the term) you almost have a dimension of a dimension.  (Yucko....)

You've got a granularity where an asset group is owned or controlled by multiple companies.  The database really should have a dimension on the asset group and another on the company.  It's much more flexible.  I've added three lines to your example.  If Company and Assets are separate dimensions, the additional 3 items are trivial to incorporate.  If Company and Assets are tightly coupled in the same table, this is a lot tougher and you actually lose some of the performance benefits of the star schema.

ID    Name                 AssetGroupName
1     Alpha East           Alpha Asset Group
2     Alpha West           Alpha Asset Group
3     Bravo North          Bravo Asset Group
4     Bravo South          Bravo Asset Group
5     Charlie Legal        Charlie Asset Group
6     Charlie Financial    Charlie Asset Group
7     Charlie Legal        Charlie Retirement Assets
8     Charlie Legal        Alpha Charlie joint Assets
9     Alpha East           Alpha Charlie joint Assets
 
LVL 32

Author Comment

by:Daniel Wilson
ID: 39727111
OK, so 2 tables?  A bit more like snowflake than star?

Subsidiaries
ID    Name                 AssetGroupID
1     Alpha East           1
...
6     Charlie Financial    3

AssetGroups
ID    Name
1     Alpha Asset Group
2     Bravo Asset Group
3     Charlie Asset Group
 
LVL 46

Accepted Solution

by:Kent Olsen
Kent Olsen earned 1600 total points
ID: 39727418
Not quite.  The Subsidiaries and Assets are in different tables.

Subsidiaries
ID    Name
1     Alpha East
...
6     Charlie Financial

AssetGroups
ID    Name
1     Alpha Asset Group
2     Bravo Asset Group
3     Charlie Asset Group

Facts
CompanyID    integer    (references Subsidiaries.ID)
AssetsID     integer    (references AssetGroups.ID)
...

Then the query joins both AssetGroups and Subsidiaries to the Facts and filters on the desired values:

SELECT *
FROM Facts t0
INNER JOIN Subsidiaries t1
  ON t0.CompanyID = t1.ID
INNER JOIN AssetGroups t2
  ON t0.AssetsID = t2.ID
WHERE t1.Name = 'Charlie Legal'
  AND t2.Name = 'Charlie Asset Group';

That allows you to easily report on all of the Charlie subsidiaries, all of the Charlie assets, or a specific subsidiary/asset.  It's also very efficient, as most modern DBMSs do a "star join" where the two dimension tables are filtered and the results inner joined before the data in the fact table is accessed.  If your query doesn't need the data in the fact table (e.g. count(*)), the fact table is never read.  DB2 has had that since version 7 and most other engines have since copied it.
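
A runnable sketch of this two-dimension layout, assuming SQLite through Python's sqlite3 module. The Amount measure, the figures, the IDs given to the two extra asset groups, and the helper name total are assumptions for illustration; the join keys CompanyID and AssetsID follow the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Subsidiaries (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE AssetGroups  (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Facts (
    CompanyID INTEGER REFERENCES Subsidiaries(ID),
    AssetsID  INTEGER REFERENCES AssetGroups(ID),
    Amount    INTEGER  -- hypothetical measure
);

INSERT INTO Subsidiaries VALUES
    (1, 'Alpha East'), (2, 'Alpha West'), (3, 'Bravo North'),
    (4, 'Bravo South'), (5, 'Charlie Legal'), (6, 'Charlie Financial');

INSERT INTO AssetGroups VALUES
    (1, 'Alpha Asset Group'), (2, 'Bravo Asset Group'),
    (3, 'Charlie Asset Group'), (4, 'Charlie Retirement Assets'),
    (5, 'Alpha Charlie joint Assets');

INSERT INTO Facts VALUES
    (1, 1, 100),  -- Alpha East        / Alpha Asset Group
    (5, 3, 200),  -- Charlie Legal     / Charlie Asset Group
    (5, 4, 50),   -- Charlie Legal     / Charlie Retirement Assets
    (5, 5, 25),   -- Charlie Legal     / Alpha Charlie joint Assets
    (1, 5, 75),   -- Alpha East        / Alpha Charlie joint Assets
    (6, 3, 300);  -- Charlie Financial / Charlie Asset Group
""")


def total(subsidiary=None, asset_group=None):
    """Sum Amount, optionally filtered on either dimension (None = no filter)."""
    sql = """
        SELECT COALESCE(SUM(f.Amount), 0)
        FROM Facts f
        INNER JOIN Subsidiaries s ON f.CompanyID = s.ID
        INNER JOIN AssetGroups  a ON f.AssetsID  = a.ID
        WHERE (:sub IS NULL OR s.Name = :sub)
          AND (:grp IS NULL OR a.Name = :grp)
    """
    params = {"sub": subsidiary, "grp": asset_group}
    return conn.execute(sql, params).fetchone()[0]
```

With these made-up rows, total('Charlie Legal', 'Charlie Asset Group') gives 200, total(subsidiary='Charlie Legal') gives 275, and total(asset_group='Alpha Charlie joint Assets') gives 100. The jointly held assets need no schema change, only extra fact rows, which is the flexibility argued for above.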


Kent
 
LVL 32

Author Closing Comment

by:Daniel Wilson
ID: 39727443
Thanks!
 
LVL 29

Expert Comment

by:fibo
ID: 39727515
B-) Glad we could help. Thx for the grade and points!