[Webinar] Learn how to a build a cloud-first strategyRegister Now

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 1152
  • Last Modified:

creating OLAP cube

I have a table with a list of contract line items in the form of

CREATE TABLE contracts_lines (
  contract_id integer,
  customer_id integer,
  product_id integer,
  contract_line_start datetime,
  contract_line_end datetime,
  amount float
)
What I would like to produce is a CUBE (that allows me to determine how much revenue I can expect each month - we have a simple rule that each line is recognized evenly over the term of the line (i.e. daily revenue = amount / (contract_line_end - contract_line_start) )

Just newbie to Cube concepts, so far created

month_dim (from 2005 till 2015)
Customer_dim (Cust_id, cust_number, cust_name, cust_add, cust_city, cust_state, cust_country)
product_dim (prod_id, prod_code, prod_name, prod_desc, prod_category)
product_manfct_dim (prod_manfct_id,prod_manfct_name, prod_manfct_desc)
Location_dim (loc_id, loc_state,Loc_country)

Would like
montly_revenue forecast for 12 months=
  location_country
  month
   product
   amount

should there be another dim for contract

please advice, how I can now project this for cube or is it missing more dimensions
0
macentrap
Asked:
macentrap
  • 15
  • 9
1 Solution
 
macentrapAuthor Commented:
Guidance only on OLAP will be gr8
0
 
itcoupleCommented:
Hi

"how much revenue I can expect each month - we have a simple rule that each line is recognized evenly over the term of the line (i.e. daily revenue = amount / (contract_line_end - contract_line_start) )"

I'm fairly new to OLAP as well (3 months) but will try to make suggestions:
1) I suggest to add a field to your contracts_lines called DailyAmount (or create view) might be easier to use it.
2) I understand your create Month_Dim which stores months on the lowest level? If yes I suggest to do Date_Dim with date on lowest level and add month/year as attributes. You will probably have to add it to the cube and join with your date keys. (not sure yet)

With the rest.... I'm just going to create databases and see how this can be done....

Regards
Emil




0
 
macentrapAuthor Commented:
hi Itcouple,

just added these today as part of ETL, when I was reviewing it.

created
contract_dim (contract_id, contract_sla, daily_rev, contract_Start, contract_end)

Date_dim (dates starting from 2005 till 2015) including details of quarter and Financial year

Thanks
0
NEW Veeam Agent for Microsoft Windows

Backup and recover physical and cloud-based servers and workstations, as well as endpoint devices that belong to remote users. Avoid downtime and data loss quickly and easily for Windows-based physical or public cloud-based workloads!

 
itcoupleCommented:
Hi

I played with MDX but no luck. It might be easier to create a table contracts_lines_monthly_Summary and use the same key fields with Month + Year fields and populate data during ETL process. I think that might work.

Regards
Emil
0
 
macentrapAuthor Commented:
Hi Emil

how to get monthly summary pls?
0
 
itcoupleCommented:
Hi

Maybe summary is not a good word....

I think the way it might work (not sure if this is the best way to do that... I'll ask my experienced colleague once he is back from holiday :) is to take contractline and if it is 30-Aug-2010 till 12-Dec-2010 with daily value of 10 then in the summary table I would have

ContractLineID, Month, Year, Value
1, 8, 2010, 20
1, 9, 2010, 300
1, 10, 2010, 310
1,11,2010, 300
1, 12, 2010, 120

for this one contract line. I think when you use Date or Month dimension and it is joined with This summary fact table it should sum values correctly for each month/year.

Regards
Emil
0
 
macentrapAuthor Commented:
Hi Emil

Thanks,
Then isn't this similar to date dim then....just confused

Will this still require month and date dim

I just started learning analytics ...just week into this.

Regards
0
 
itcoupleCommented:
Hi

Let me think about this one :)

The Date dim has granuality of one day and in most cases 99.999% exist in every OLAP cube. You should also bear in mind that this is 'role playing dimension' which means that it can be joined to different fields like Contract Start Date but also to another date fields in the same table or different tables.

With Month dim you might not need it if you go with dim date which gives month (Usually you have in dimDate, datekey as integer to save space, date as date, week, month, quarter, year, fiscal week,month, year and occasionally other fields)

So the new table let's call it "contract lines split" is first of all not a dimension but fact table and will be joined to Date (in dimension usage tab in cube depending on the field you chose for your role playing date dimension)  and fields will be available in new measure group.

I'm fairly new to cubes so probably my explanations might not be very clear  :) --- and not always correct :p

Regards
Emil
0
 
macentrapAuthor Commented:
So in contract_line split fact: I will have to increment the month, from start date to end date for every contract I'd

This will lead to much data handling

It's inventory management for customer
Contract will have many products and other contract will have same product under different startdate and enddate
0
 
itcoupleCommented:
Hi

I have spoken with my colleague and he confirmed building a table with monthly (or daily) data is the way to do it in data warehouse. We also discussed using MDX and I suggested my approach which I couldn't resolve and he pointed me to very useful MDX function and the results look promising just need to work out final bits.

MDX might be easier to implement but might be more complicated then building a table with data that OLAP handles very well.

Here is the split table

CREATE TABLE contracts_lines_split (
  contract_id integer,
  customer_id integer,
  product_id integer,
contract_line_start_year INT,
contract_line_start_month INT,
contract_line_end_year INT,
contract_line_edn_month int
daily_amount decimal(9,3)
)
-- Just a small comment. I would avoid using float. It tends to round figures. If you don't need precision higher than 38 use decimal. In the above example I set it up as precision 9 (<=9 is 5 bytes) and scale of 3 I doubt dialy amount is > 999999 and 3 decimal places should be fine. You can adjust to your needs.

If your average contract length is 24 months then this table will be 24 times bigger than contract lines but it should avoid having to build complex MDX calculations. I wouldn't worry too much about performance OLAP is designed to handle large valumes of data. If you need to increase processing than you can use partitioning which is very common for tables larger than 20 millions rows. once the data is loaded MDX query approach should handle it very well.

Not sure if I answered preciselly your question :) but I hope that helps.

Regards
Emil
0
 
macentrapAuthor Commented:
Hi Emil,
 
build the contract_lines_split similar, thank you for pointing on the float values, i got it as (5,2)

on average the contracts are 5 yrs and have minimum of 300 products.

apologies, my knowledge on portioning is novice.

Thanks for taking extra efforts  :)

Regards
Mac
0
 
itcoupleCommented:
Hi Mac,

In that case it might be quite big. The good news is that it sounds like good opportunity to use several interesting features available with SQL Server. How many rows do you expect in contract_lines?

I will also try to write MDX to do that.

Regards
Emil
0
 
macentrapAuthor Commented:
Hi Emil

In contract_lines there are 5000 rows.

Ps: using mysql and open source software pentaho mdx.

Regards
Mac
0
 
itcoupleCommented:
Hi

300,000 rows should be fine.... I've checked pentaho and it looks like a solid product although I cannot make any comments about it. It will probably be a case of try and see.

Regards
Emil
0
 
macentrapAuthor Commented:
Hi Emil

then contract_lines_split with increments till dateEnd is the way to go for now.

I am still not sure... though there must be other better way to avoid that much of data dumping as inventory management database itself is always growing.

ref pentaho: Im ok with the product, something new to try as happen to stumble upon it as part of knowledge enhancement.
0
 
itcoupleCommented:
Hi

Data warehouse and OLAP resolve queries in different way and usually having big tables simplifies the process and it shouldn't the performance a lot as MDX uses "axis coordinates" to locate data and doesn't scan it like T-SQL.

Regards
Emil
0
 
macentrapAuthor Commented:
Hi Emil

can you please help me out with the sql query on month increaments..
the procedure which I am using is not working out with desired result....

i get the month sequences happening but doesnot join with contract_lines
0
 
macentrapAuthor Commented:
hey got the month sequences going now
0
 
macentrapAuthor Commented:
Hi Emil


in contract_dim: does it need reference to customer customer_id, product_id  

or it will be just basic info as discussed before

as when fact: contract_lines_split is created, its reference these other id's

as currently,  every inventory item is termed as contract.
earlier I just picked some information
contract_dim (contract_id, contract_sla, daily_rev, contract_Start, contract_end)


production_contract table has
Contract(contract_id, contract_sla, revenue, contract_Start, contract_end, customer_id, product_id)

Regards
Mac
0
 
macentrapAuthor Commented:
Apologies Emil, I must be sleeping  and totally forgot abt ur earlier post.....

0
 
itcoupleCommented:
Hi

Let me know how it goes (BDW I gave up on the MDX I will try to do when I have more free time).

Regards
Emil
0
 
macentrapAuthor Commented:
sure will update, just finished doing XML schema
0
 
macentrapAuthor Commented:
Hi EMil

just got basic done by Month,

putting all month revenue in fact table,

MDX currently is like

select NON EMPTY Hierarchize(Union(Crossjoin({[Measures].[Revenue]}, {[Location].[All Location]}), Crossjoin({[Measures].[Revenue]}, [Location].[All Location].Children))) ON COLUMNS,
  NON EMPTY Hierarchize(Crossjoin({[Manufacturer.OEM].[All OEM]}, Crossjoin({[Model Type.Model].[All Model]}, Union(Crossjoin({[Account].[All Account]}, {[Date.YQMD].[All Month]}), Crossjoin({[Account].[All Account]}, [Date.YQMD].[All Month].Children))))) ON ROWS
from [Revenue_new]
0
 
macentrapAuthor Commented:
Thank You Emil :)
0

Featured Post

Prep for the ITIL® Foundation Certification Exam

December’s Course of the Month is now available! Enroll to learn ITIL® Foundation best practices for delivering IT services effectively and efficiently.

  • 15
  • 9
Tackle projects and never again get stuck behind a technical roadblock.
Join Now