Cube size; why is mine so small?

I have an Analysis Server Database that contains 5 cubes.  Its 8gb and we're having a few speed problems with it.  However, i've seen reports all over the net that much larger cubes (100gb+) can work without a problem so I have been asked to increase the size of the cube before working out why its slow, as there is no point fixing the current speed issues if we find performance becomes unreasonabler again when we add more data, even though it should be able to handle it (make sense?  If not post & i'll try and explain again).

However, its not that simple is it?  What makes someones Cube 100gb?  Is it just pure data?  Aggregations?  Dimensions?  

I'm after some comparisons, so if anyone can take a look at one of their analysis server databases and post an analysis of where the disk space goes that would be very helpful!

The fact table in the biggest cube has 1.4 million rows in it and comes to about 1gb.  3 of the other cubes are about 500mb each.  1 is a whole 84mb.
There are 60 dimensions.  Most are only a few mb, the biggest is 1.6gb
I don't think there are any aggregations (too many dimensions I think & the wizard didn't bother - not 100% sure as I didnt make it).

From your answers I hope to be able to bloat my cube to a larger size that we need it to be so I can prove that it won't fail 6 months down the line.
Note: I tried increasing the main fact table to 30 million rows and it didn't bloat it that much.
Who is Participating?
PFrogConnect With a Mentor Commented:
The key here is amost certainly aggregations.

The primary speed benefit of a cube comes from aggregations. if you want to find the sum of x for 1.4M rows, SQL has to source those 1.4M rows then add up all the data - slow. SSAS (if the aggregations are designed properly) will already have precalculated the result, so when you query it it just returns it.
If you have no aggregations then you will see no performance benefit.

Aggregations also account for the majority of a cube's size. Only 1000 rows in each of 3 dimensions, creates a possible 1,000,000,000 combinations of members. Most of these are not going to be used, so SSAS doesn't store them all, but you can see how a cube can quickly grow.

I would suggest two courses of action
1) Design your aggregations properly, and as many of them as you can.
2) Cubes tend to grow mostly due to the time dimension, usually other dimensions (customer, city, product etc.) grow at a much smaller rate. You can therefore relatively easily take last year's data, change the date to next year and thus create valid business data for future years. This will enable you to generate data for as many years as you want in order to test the speed. However, this will be of no use if you do ot have appropriate aggregations.
kintonAuthor Commented:
p.s. i'll split the points accordingly :)
kintonAuthor Commented:
Thanks PFrog, I don't suppose you could elaborate on design my aggregations properly?  I can only seem to get to aggregations in the Wizard, however, I have started looking at the query logging and usage based optimisation and tailoring the Aggregations wizard around that.  

I'm sure theres a art to getting the best out of the wizard, but am I also missing a manual bit for building my own aggregations or something?
What Kind of Coding Program is Right for You?

There are many ways to learn to code these days. From coding bootcamps like Flatiron School to online courses to totally free beginner resources. The best way to learn to code depends on many factors, but the most important one is you. See what course is best for you.

kintonAuthor Commented:
p.s. I don't often travel to your side of the country, but next time I do, I sooooooooo need to buy you a beer...or few!
In SQL 2005 you can customise the aggregations, but it's a nasty back end hack to do it, you pretty much just have to use the wizard. In SQL 2008 we'll have muh greater control over which aggregations are used, but that's still a few months off...

Does the wizard not work for you? I've not had any problems with it before now... What's it doing?
kintonAuthor Commented:
If you still have 2 fact tables and 20 dimensions should it still find a load of aggregations if you're only using a sample of the data?
While i've been learning how to use the various bits i've been using a demo version thats only about 500mb instead of 8gb.  Its the same design, just with less data.  Could that cause it to have less aggregations?  I set the wizard to run to 65%.  It found 225 (about 4mb worth), didn't even register on the graph, then stopped.

I plan on running it on a copy of the main database tomorrow, but the server was busy running something else so I didn't want to kill it by restoring a backup today.
The number of aggregations is not really based on the facts, but the combination of attributes, dimension members and hierarchies.
If your sample data set still has the same (or similar) number of dimension members then it shoud be able to create the necessary aggregations.

If the number of used combinations of dimension attributes are very small, then it could well be that 225 is a reasonable number, it does very much depend on the specific structure of the cube. What happens if you re-run the aggregation wizard but just let it run (select 'until you click stop') - does that find more?

It is difficult to comment in more detail without seeing the actual cube and how it has been designed.
kintonAuthor Commented:
No, it gets to 225 and then stops, so from that I take it that there is only actually 225 aggregations.

I have restored the main database to a backup and i'm currently making a table to build the fact table from.  It usually builds it from a view that has a large number of case & convert statements and a number of joins which probably explains the memory errors I often get so I thought i'd build a single table and using table binding instead - anyway, thats a whole different story.

When thats finished i'll have a go at building some aggregations on that and see what it returns.

I should point out that my 500mb cube/225 aggregations run was done on version 2 (rather that the one in my original post) of my ssas database which isn't being used by customers yet, however instead of 5 cubes, theres only 1.  
It has 2 fact tables (17 measures & 3 measures respectively)
Dimensions have been reduced to 25, with between 2 and 14 attributes in each.
The Dimensions are used in 46 hierarchies - basically there are a load of Dimensions that link to different places in the fact table i.e. The Issue dimension links on last issue sent, start issue and lapse issue.

I'll post back when i've run the aggregations on cube when i have the main data in it, but thats likely to take a few hours to populate!
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.