Solved

creating OLAP cube

Posted on 2010-08-29
24
1,140 Views
Last Modified: 2013-11-16
I have a table with a list of contract line items in the form of

CREATE TABLE contracts_lines (
  contract_id integer,
  customer_id integer,
  product_id integer,
  contract_line_start datetime,
  contract_line_end datetime,
  amount float
)
What I would like to produce is a CUBE (that allows me to determine how much revenue I can expect each month - we have a simple rule that each line is recognized evenly over the term of the line (i.e. daily revenue = amount / (contract_line_end - contract_line_start) )

Just newbie to Cube concepts, so far created

month_dim (from 2005 till 2015)
Customer_dim (Cust_id, cust_number, cust_name, cust_add, cust_city, cust_state, cust_country)
product_dim (prod_id, prod_code, prod_name, prod_desc, prod_category)
product_manfct_dim (prod_manfct_id,prod_manfct_name, prod_manfct_desc)
Location_dim (loc_id, loc_state,Loc_country)

Would like
montly_revenue forecast for 12 months=
  location_country
  month
   product
   amount

should there be another dim for contract

please advice, how I can now project this for cube or is it missing more dimensions
0
Comment
Question by:macentrap
  • 15
  • 9
24 Comments
 
LVL 7

Author Comment

by:macentrap
ID: 33555627
Guidance only on OLAP will be gr8
0
 
LVL 10

Expert Comment

by:itcouple
ID: 33557146
Hi

"how much revenue I can expect each month - we have a simple rule that each line is recognized evenly over the term of the line (i.e. daily revenue = amount / (contract_line_end - contract_line_start) )"

I'm fairly new to OLAP as well (3 months) but will try to make suggestions:
1) I suggest to add a field to your contracts_lines called DailyAmount (or create view) might be easier to use it.
2) I understand your create Month_Dim which stores months on the lowest level? If yes I suggest to do Date_Dim with date on lowest level and add month/year as attributes. You will probably have to add it to the cube and join with your date keys. (not sure yet)

With the rest.... I'm just going to create databases and see how this can be done....

Regards
Emil




0
 
LVL 7

Author Comment

by:macentrap
ID: 33557464
hi Itcouple,

just added these today as part of ETL, when I was reviewing it.

created
contract_dim (contract_id, contract_sla, daily_rev, contract_Start, contract_end)

Date_dim (dates starting from 2005 till 2015) including details of quarter and Financial year

Thanks
0
 
LVL 10

Expert Comment

by:itcouple
ID: 33558663
Hi

I played with MDX but no luck. It might be easier to create a table contracts_lines_monthly_Summary and use the same key fields with Month + Year fields and populate data during ETL process. I think that might work.

Regards
Emil
0
 
LVL 7

Author Comment

by:macentrap
ID: 33560409
Hi Emil

how to get monthly summary pls?
0
 
LVL 10

Expert Comment

by:itcouple
ID: 33565147
Hi

Maybe summary is not a good word....

I think the way it might work (not sure if this is the best way to do that... I'll ask my experienced colleague once he is back from holiday :) is to take contractline and if it is 30-Aug-2010 till 12-Dec-2010 with daily value of 10 then in the summary table I would have

ContractLineID, Month, Year, Value
1, 8, 2010, 20
1, 9, 2010, 300
1, 10, 2010, 310
1,11,2010, 300
1, 12, 2010, 120

for this one contract line. I think when you use Date or Month dimension and it is joined with This summary fact table it should sum values correctly for each month/year.

Regards
Emil
0
 
LVL 7

Author Comment

by:macentrap
ID: 33565509
Hi Emil

Thanks,
Then isn't this similar to date dim then....just confused

Will this still require month and date dim

I just started learning analytics ...just week into this.

Regards
0
 
LVL 10

Expert Comment

by:itcouple
ID: 33568147
Hi

Let me think about this one :)

The Date dim has granuality of one day and in most cases 99.999% exist in every OLAP cube. You should also bear in mind that this is 'role playing dimension' which means that it can be joined to different fields like Contract Start Date but also to another date fields in the same table or different tables.

With Month dim you might not need it if you go with dim date which gives month (Usually you have in dimDate, datekey as integer to save space, date as date, week, month, quarter, year, fiscal week,month, year and occasionally other fields)

So the new table let's call it "contract lines split" is first of all not a dimension but fact table and will be joined to Date (in dimension usage tab in cube depending on the field you chose for your role playing date dimension)  and fields will be available in new measure group.

I'm fairly new to cubes so probably my explanations might not be very clear  :) --- and not always correct :p

Regards
Emil
0
 
LVL 7

Author Comment

by:macentrap
ID: 33599141
So in contract_line split fact: I will have to increment the month, from start date to end date for every contract I'd

This will lead to much data handling

It's inventory management for customer
Contract will have many products and other contract will have same product under different startdate and enddate
0
 
LVL 10

Accepted Solution

by:
itcouple earned 500 total points
ID: 33602354
Hi

I have spoken with my colleague and he confirmed building a table with monthly (or daily) data is the way to do it in data warehouse. We also discussed using MDX and I suggested my approach which I couldn't resolve and he pointed me to very useful MDX function and the results look promising just need to work out final bits.

MDX might be easier to implement but might be more complicated then building a table with data that OLAP handles very well.

Here is the split table

CREATE TABLE contracts_lines_split (
  contract_id integer,
  customer_id integer,
  product_id integer,
contract_line_start_year INT,
contract_line_start_month INT,
contract_line_end_year INT,
contract_line_edn_month int
daily_amount decimal(9,3)
)
-- Just a small comment. I would avoid using float. It tends to round figures. If you don't need precision higher than 38 use decimal. In the above example I set it up as precision 9 (<=9 is 5 bytes) and scale of 3 I doubt dialy amount is > 999999 and 3 decimal places should be fine. You can adjust to your needs.

If your average contract length is 24 months then this table will be 24 times bigger than contract lines but it should avoid having to build complex MDX calculations. I wouldn't worry too much about performance OLAP is designed to handle large valumes of data. If you need to increase processing than you can use partitioning which is very common for tables larger than 20 millions rows. once the data is loaded MDX query approach should handle it very well.

Not sure if I answered preciselly your question :) but I hope that helps.

Regards
Emil
0
 
LVL 7

Author Comment

by:macentrap
ID: 33602607
Hi Emil,
 
build the contract_lines_split similar, thank you for pointing on the float values, i got it as (5,2)

on average the contracts are 5 yrs and have minimum of 300 products.

apologies, my knowledge on portioning is novice.

Thanks for taking extra efforts  :)

Regards
Mac
0
 
LVL 10

Expert Comment

by:itcouple
ID: 33603408
Hi Mac,

In that case it might be quite big. The good news is that it sounds like good opportunity to use several interesting features available with SQL Server. How many rows do you expect in contract_lines?

I will also try to write MDX to do that.

Regards
Emil
0
Threat Intelligence Starter Resources

Integrating threat intelligence can be challenging, and not all companies are ready. These resources can help you build awareness and prepare for defense.

 
LVL 7

Author Comment

by:macentrap
ID: 33604803
Hi Emil

In contract_lines there are 5000 rows.

Ps: using mysql and open source software pentaho mdx.

Regards
Mac
0
 
LVL 10

Expert Comment

by:itcouple
ID: 33605838
Hi

300,000 rows should be fine.... I've checked pentaho and it looks like a solid product although I cannot make any comments about it. It will probably be a case of try and see.

Regards
Emil
0
 
LVL 7

Author Comment

by:macentrap
ID: 33605950
Hi Emil

then contract_lines_split with increments till dateEnd is the way to go for now.

I am still not sure... though there must be other better way to avoid that much of data dumping as inventory management database itself is always growing.

ref pentaho: Im ok with the product, something new to try as happen to stumble upon it as part of knowledge enhancement.
0
 
LVL 10

Expert Comment

by:itcouple
ID: 33606022
Hi

Data warehouse and OLAP resolve queries in different way and usually having big tables simplifies the process and it shouldn't the performance a lot as MDX uses "axis coordinates" to locate data and doesn't scan it like T-SQL.

Regards
Emil
0
 
LVL 7

Author Comment

by:macentrap
ID: 33606087
Hi Emil

can you please help me out with the sql query on month increaments..
the procedure which I am using is not working out with desired result....

i get the month sequences happening but doesnot join with contract_lines
0
 
LVL 7

Author Comment

by:macentrap
ID: 33606444
hey got the month sequences going now
0
 
LVL 7

Author Comment

by:macentrap
ID: 33615160
Hi Emil


in contract_dim: does it need reference to customer customer_id, product_id  

or it will be just basic info as discussed before

as when fact: contract_lines_split is created, its reference these other id's

as currently,  every inventory item is termed as contract.
earlier I just picked some information
contract_dim (contract_id, contract_sla, daily_rev, contract_Start, contract_end)


production_contract table has
Contract(contract_id, contract_sla, revenue, contract_Start, contract_end, customer_id, product_id)

Regards
Mac
0
 
LVL 7

Author Comment

by:macentrap
ID: 33624004
Apologies Emil, I must be sleeping  and totally forgot abt ur earlier post.....

0
 
LVL 10

Expert Comment

by:itcouple
ID: 33641469
Hi

Let me know how it goes (BDW I gave up on the MDX I will try to do when I have more free time).

Regards
Emil
0
 
LVL 7

Author Comment

by:macentrap
ID: 33655601
sure will update, just finished doing XML schema
0
 
LVL 7

Author Comment

by:macentrap
ID: 33681594
Hi EMil

just got basic done by Month,

putting all month revenue in fact table,

MDX currently is like

select NON EMPTY Hierarchize(Union(Crossjoin({[Measures].[Revenue]}, {[Location].[All Location]}), Crossjoin({[Measures].[Revenue]}, [Location].[All Location].Children))) ON COLUMNS,
  NON EMPTY Hierarchize(Crossjoin({[Manufacturer.OEM].[All OEM]}, Crossjoin({[Model Type.Model].[All Model]}, Union(Crossjoin({[Account].[All Account]}, {[Date.YQMD].[All Month]}), Crossjoin({[Account].[All Account]}, [Date.YQMD].[All Month].Children))))) ON ROWS
from [Revenue_new]
0
 
LVL 7

Author Closing Comment

by:macentrap
ID: 33698373
Thank You Emil :)
0

Featured Post

Find Ransomware Secrets With All-Source Analysis

Ransomware has become a major concern for organizations; its prevalence has grown due to past successes achieved by threat actors. While each ransomware variant is different, we’ve seen some common tactics and trends used among the authors of the malware.

Join & Write a Comment

Entering time in Microsoft Access can be difficult. An input mask often bothers users more than helping them and won't catch all typing errors. This article shows how to create a textbox for 24-hour time input with full validation politely catching …
I annotated my article on ransomware somewhat extensively, but I keep adding new references and wanted to put a link to the reference library.  Despite all the reference tools I have on hand, it was not easy to find a way to do this easily. I finall…
Video by: Steve
Using examples as well as descriptions, step through each of the common simple join types, explaining differences in syntax, differences in expected outputs and showing how the queries run along with the actual outputs based upon a simple set of dem…
Polish reports in Access so they look terrific. Take yourself to another level. Equations, Back Color, Alternate Back Color. Write easy VBA Code. Tighten space to use less pages. Launch report from a menu, considering criteria only when it is filled…

760 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

23 Experts available now in Live!

Get 1:1 Help Now