Cube size; why is mine so small?

Posted on 2007-11-29
Last Modified: 2016-02-13
I have an Analysis Server Database that contains 5 cubes.  Its 8gb and we're having a few speed problems with it.  However, i've seen reports all over the net that much larger cubes (100gb+) can work without a problem so I have been asked to increase the size of the cube before working out why its slow, as there is no point fixing the current speed issues if we find performance becomes unreasonabler again when we add more data, even though it should be able to handle it (make sense?  If not post & i'll try and explain again).

However, its not that simple is it?  What makes someones Cube 100gb?  Is it just pure data?  Aggregations?  Dimensions?  

I'm after some comparisons, so if anyone can take a look at one of their analysis server databases and post an analysis of where the disk space goes that would be very helpful!

The fact table in the biggest cube has 1.4 million rows in it and comes to about 1gb.  3 of the other cubes are about 500mb each.  1 is a whole 84mb.
There are 60 dimensions.  Most are only a few mb, the biggest is 1.6gb
I don't think there are any aggregations (too many dimensions I think & the wizard didn't bother - not 100% sure as I didnt make it).

From your answers I hope to be able to bloat my cube to a larger size that we need it to be so I can prove that it won't fail 6 months down the line.
Note: I tried increasing the main fact table to 30 million rows and it didn't bloat it that much.
Question by:kinton
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 5
  • 3

Author Comment

ID: 20372496
p.s. i'll split the points accordingly :)
LVL 18

Accepted Solution

PFrog earned 500 total points
ID: 20372659
The key here is amost certainly aggregations.

The primary speed benefit of a cube comes from aggregations. if you want to find the sum of x for 1.4M rows, SQL has to source those 1.4M rows then add up all the data - slow. SSAS (if the aggregations are designed properly) will already have precalculated the result, so when you query it it just returns it.
If you have no aggregations then you will see no performance benefit.

Aggregations also account for the majority of a cube's size. Only 1000 rows in each of 3 dimensions, creates a possible 1,000,000,000 combinations of members. Most of these are not going to be used, so SSAS doesn't store them all, but you can see how a cube can quickly grow.

I would suggest two courses of action
1) Design your aggregations properly, and as many of them as you can.
2) Cubes tend to grow mostly due to the time dimension, usually other dimensions (customer, city, product etc.) grow at a much smaller rate. You can therefore relatively easily take last year's data, change the date to next year and thus create valid business data for future years. This will enable you to generate data for as many years as you want in order to test the speed. However, this will be of no use if you do ot have appropriate aggregations.

Author Comment

ID: 20374579
Thanks PFrog, I don't suppose you could elaborate on design my aggregations properly?  I can only seem to get to aggregations in the Wizard, however, I have started looking at the query logging and usage based optimisation and tailoring the Aggregations wizard around that.  

I'm sure theres a art to getting the best out of the wizard, but am I also missing a manual bit for building my own aggregations or something?
Edgartown IT Case Study

Learn about Edgartown's quest to ensure the safety and security of the entire town's employee and citizen data. Read the case study!


Author Comment

ID: 20374586
p.s. I don't often travel to your side of the country, but next time I do, I sooooooooo need to buy you a beer...or few!
LVL 18

Expert Comment

ID: 20374737
In SQL 2005 you can customise the aggregations, but it's a nasty back end hack to do it, you pretty much just have to use the wizard. In SQL 2008 we'll have muh greater control over which aggregations are used, but that's still a few months off...

Does the wizard not work for you? I've not had any problems with it before now... What's it doing?

Author Comment

ID: 20374945
If you still have 2 fact tables and 20 dimensions should it still find a load of aggregations if you're only using a sample of the data?
While i've been learning how to use the various bits i've been using a demo version thats only about 500mb instead of 8gb.  Its the same design, just with less data.  Could that cause it to have less aggregations?  I set the wizard to run to 65%.  It found 225 (about 4mb worth), didn't even register on the graph, then stopped.

I plan on running it on a copy of the main database tomorrow, but the server was busy running something else so I didn't want to kill it by restoring a backup today.
LVL 18

Expert Comment

ID: 20377241
The number of aggregations is not really based on the facts, but the combination of attributes, dimension members and hierarchies.
If your sample data set still has the same (or similar) number of dimension members then it shoud be able to create the necessary aggregations.

If the number of used combinations of dimension attributes are very small, then it could well be that 225 is a reasonable number, it does very much depend on the specific structure of the cube. What happens if you re-run the aggregation wizard but just let it run (select 'until you click stop') - does that find more?

It is difficult to comment in more detail without seeing the actual cube and how it has been designed.

Author Comment

ID: 20381037
No, it gets to 225 and then stops, so from that I take it that there is only actually 225 aggregations.

I have restored the main database to a backup and i'm currently making a table to build the fact table from.  It usually builds it from a view that has a large number of case & convert statements and a number of joins which probably explains the memory errors I often get so I thought i'd build a single table and using table binding instead - anyway, thats a whole different story.

When thats finished i'll have a go at building some aggregations on that and see what it returns.

I should point out that my 500mb cube/225 aggregations run was done on version 2 (rather that the one in my original post) of my ssas database which isn't being used by customers yet, however instead of 5 cubes, theres only 1.  
It has 2 fact tables (17 measures & 3 measures respectively)
Dimensions have been reduced to 25, with between 2 and 14 attributes in each.
The Dimensions are used in 46 hierarchies - basically there are a load of Dimensions that link to different places in the fact table i.e. The Issue dimension links on last issue sent, start issue and lapse issue.

I'll post back when i've run the aggregations on cube when i have the main data in it, but thats likely to take a few hours to populate!

Featured Post

Free Tool: SSL Checker

Scans your site and returns information about your SSL implementation and certificate. Helpful for debugging and validating your SSL configuration.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

As technology users and professionals, we’re always learning. Our universal interest in advancing our knowledge of the trade is unmatched by most industries. It’s a curiosity that makes sense, given the climate of change. Within that, there lies a…
Recently, Microsoft released a best-practice guide for securing Active Directory. It's a whopping 300+ pages long. Those of us tasked with securing our company’s databases and systems would, ideally, have time to devote to learning the ins and outs…
Via a live example, show how to extract information from SQL Server on Database, Connection and Server properties
Using examples as well as descriptions, and references to Books Online, show the different Recovery Models available in SQL Server and explain, as well as show how full, differential and transaction log backups are performed

695 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question