Solved

Cube size; why is mine so small?

Posted on 2007-11-29
8
103 Views
Last Modified: 2016-02-13
I have an Analysis Services database that contains 5 cubes.  It's 8GB and we're having a few speed problems with it.  However, I've seen reports all over the net that much larger cubes (100GB+) can work without a problem, so I have been asked to increase the size of the cube before working out why it's slow - there's no point fixing the current speed issues if we find performance becomes unreasonable again when we add more data, even though it should be able to handle it (make sense?  If not, post and I'll try to explain again).

However, it's not that simple, is it?  What makes someone's cube 100GB?  Is it just pure data?  Aggregations?  Dimensions?

I'm after some comparisons, so if anyone can take a look at one of their Analysis Services databases and post a breakdown of where the disk space goes, that would be very helpful!

The fact table in the biggest cube has 1.4 million rows in it and comes to about 1GB.  3 of the other cubes are about 500MB each.  1 is a whole 84MB.
There are 60 dimensions.  Most are only a few MB; the biggest is 1.6GB.
I don't think there are any aggregations (too many dimensions, I think, and the wizard didn't bother - not 100% sure as I didn't build it).

From your answers I hope to be able to bloat my cube to the larger size we'll eventually need, so I can prove it won't fail 6 months down the line.
Note: I tried increasing the main fact table to 30 million rows and it didn't bloat the cube that much.
Question by:kinton
8 Comments
 
LVL 2

Author Comment

by:kinton
ID: 20372496
p.s. I'll split the points accordingly :)
 
LVL 18

Accepted Solution

by:
PFrog earned 500 total points
ID: 20372659
The key here is almost certainly aggregations.

The primary speed benefit of a cube comes from aggregations. If you want to find the sum of x for 1.4M rows, SQL has to read those 1.4M rows and then add up all the data - slow. SSAS (if the aggregations are designed properly) will already have precalculated the result, so when you query it, the answer is simply returned.
If you have no aggregations then you will see no performance benefit.
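To illustrate, this is roughly the relational work SSAS saves you (table and column names here are made up for the example):

SELECT d.CalendarYear, SUM(f.SalesAmount) AS TotalSales
FROM   FactSales f
       INNER JOIN DimDate d ON d.DateKey = f.DateKey
GROUP BY d.CalendarYear;
-- Without aggregations, SSAS (like SQL) has to touch all 1.4M fact rows
-- for every query. With an aggregation at the Year grain the answer is
-- already stored at processing time and the query becomes a lookup.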

Aggregations also account for the majority of a cube's size. Just 1,000 members in each of 3 dimensions creates a possible 1,000,000,000 combinations. Most of these will never be used, so SSAS doesn't store them all, but you can see how a cube can quickly grow.

I would suggest two courses of action:
1) Design your aggregations properly, and build as many of them as you can.
2) Cubes tend to grow mostly due to the time dimension; other dimensions (customer, city, product etc.) usually grow at a much smaller rate. You can therefore fairly easily take last year's data, change the dates to next year and thus create valid business data for future years. This will let you generate data for as many years as you want in order to test the speed - see the sketch below. However, it will be of no use if you do not have appropriate aggregations.
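
A rough sketch of that date-shift (FactSales, OrderDate etc. are hypothetical names - adjust to your schema; this assumes the fact table carries a real datetime column):

-- Clone one year of facts a year forward to create plausible future data.
INSERT INTO FactSales (OrderDate, CustomerKey, ProductKey, SalesAmount)
SELECT DATEADD(year, 1, OrderDate), CustomerKey, ProductKey, SalesAmount
FROM   FactSales
WHERE  YEAR(OrderDate) = 2006;
-- Repeat (adjusting the year) for each extra year you want, make sure the
-- time dimension covers the new dates, then reprocess the cube.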
 
LVL 2

Author Comment

by:kinton
ID: 20374579
Thanks PFrog. I don't suppose you could elaborate on designing my aggregations properly?  I can only seem to get at aggregations through the wizard; however, I have started looking at query logging and usage-based optimisation, and tailoring the Aggregation wizard around that.

I'm sure there's an art to getting the best out of the wizard, but am I also missing a manual way of building my own aggregations or something?
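
Side note in case it helps anyone else: once query logging is switched on, you can check what's being captured with something like the following - this assumes the default OlapQueryLog table name, and the column names should be verified against your instance:

SELECT TOP 20 MSOLAP_ObjectPath, Dataset, Duration, StartTime
FROM   OlapQueryLog
ORDER BY Duration DESC;
-- Each Dataset value is a bitmap of the attributes a query touched;
-- Usage Based Optimisation feeds these rows into the aggregation design.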

 
LVL 2

Author Comment

by:kinton
ID: 20374586
p.s. I don't often travel to your side of the country, but next time I do, I sooooooooo need to buy you a beer...or few!
 
LVL 18

Expert Comment

by:PFrog
ID: 20374737
In SQL 2005 you can customise the aggregations, but it's a nasty back-end hack to do it - you pretty much just have to use the wizard. In SQL 2008 we'll have much greater control over which aggregations are used, but that's still a few months off...

Does the wizard not work for you? I've not had any problems with it before now... What's it doing?
 
LVL 2

Author Comment

by:kinton
ID: 20374945
If you still have 2 fact tables and 20 dimensions, should it still find a load of aggregations if you're only using a sample of the data?
While I've been learning how to use the various bits I've been using a demo version that's only about 500MB instead of 8GB.  It's the same design, just with less data.  Could that cause it to have fewer aggregations?  I set the wizard to run to 65%.  It found 225 (about 4MB worth), didn't even register on the graph, then stopped.

I plan on running it on a copy of the main database tomorrow, but the server was busy running something else so I didn't want to kill it by restoring a backup today.
 
LVL 18

Expert Comment

by:PFrog
ID: 20377241
The number of aggregations is not really based on the facts, but on the combination of attributes, dimension members and hierarchies.
If your sample data set still has the same (or similar) number of dimension members then it should be able to create the necessary aggregations.

If the number of used combinations of dimension attributes is very small, then 225 could well be a reasonable number; it depends very much on the specific structure of the cube. What happens if you re-run the aggregation wizard and just let it run (select 'until you click stop') - does that find more?

It is difficult to comment in more detail without seeing the actual cube and how it has been designed.
 
LVL 2

Author Comment

by:kinton
ID: 20381037
No, it gets to 225 and then stops, so from that I take it there really are only 225 aggregations.

I have restored a backup of the main database and I'm currently making a table to build the fact table from.  It is usually built from a view that has a large number of CASE & CONVERT statements and a number of joins, which probably explains the memory errors I often get, so I thought I'd build a single table and use table binding instead - anyway, that's a whole different story.  Roughly what's sketched below.
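
(Table and view names here are made up:)

-- Materialise the heavy view into a plain table once, so cube processing
-- reads a simple table instead of re-running the CASE & CONVERT logic.
SELECT *
INTO   FactMain_Flat           -- hypothetical staging table
FROM   vwFactMain;             -- the existing view with the CASE/CONVERT joins

-- An index on the main join column helps the processing reads.
CREATE CLUSTERED INDEX IX_FactMain_Flat_Date ON FactMain_Flat (DateKey);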

When that's finished I'll have a go at building some aggregations on it and see what it returns.

I should point out that my 500MB cube/225 aggregations run was done on version 2 of my SSAS database (rather than the one in my original post), which isn't being used by customers yet; however, instead of 5 cubes there's only 1.
It has 2 fact tables (17 measures & 3 measures respectively)
Dimensions have been reduced to 25, with between 2 and 14 attributes in each.
The dimensions are used in 46 hierarchies - basically there are a load of dimensions that link to different places in the fact table, e.g. the Issue dimension links on last issue sent, start issue and lapse issue (a sketch of that pattern is below).
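
(Hypothetical names, just to show the shape of it:)

-- The same Issue dimension joined three times through different keys:
SELECT f.SubscriptionKey,
       li.IssueName AS LastIssueSent,
       si.IssueName AS StartIssue,
       xi.IssueName AS LapseIssue
FROM   FactSubscription f
       JOIN DimIssue li ON li.IssueKey = f.LastIssueSentKey
       JOIN DimIssue si ON si.IssueKey = f.StartIssueKey
       JOIN DimIssue xi ON xi.IssueKey = f.LapseIssueKey;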

I'll post back when I've run the aggregations on the cube with the main data in it, but that's likely to take a few hours to populate!