Link to home
Start Free TrialLog in
Avatar of Nasir Razzaq
Nasir RazzaqFlag for United Kingdom of Great Britain and Northern Ireland

asked on

Exposing an EAV database model as a mart for visualisation

We are using an EAV based data model to store data from files that we read in. All the data is related to a master record. We need to visualise this master record (charts, aggregations, reporting etc) and using EAV based model directly for this would be very slow. We need to pivot EAV into a flat structure where each record is on a single row.

The problem that we have is that the attributes/properties (which are then used as columns in mart) need to be flexible. Is there a way to design a mart which is pivoted as much as possible so is performant but is also flexible in that adding/removing attributes/properties does not require changing mart (and associated ETL).
ASKER CERTIFIED SOLUTION
Avatar of ste5an
ste5an
Flag of Germany image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of Nasir Razzaq

ASKER

Thanks for your response. Number of attributes is a problem as we have more than 2000!
You should evaluate sparse columns (wide tables).
Otherwise you need to group the attributes by some meaning full category and split your dimension into appropriate chunks.
Yes we are using sparse columns already. We do have a mart in the form that you suggested but flexibility/extensibility is a problem. We cannot recreate it using dynamic SQL as we have millions of rows of data and only want to sync latest changes to mart.
Millions of rows in the dimensions?
Fact
Well, you need to group the attributes by the grain of the value. Then you can create junk dimensions with attributes of the same value grain and place the values in one fact row. So you'll get also multiple fact tables. It's basically a manual partitioning of your data.
Another grouping criteria is which attributes are queried together.

btw, just for curiosity: How many distinct entities, distinct attributes and distinct values do you have in your model.

p.s. EAV models should only be used for spares attribute distribution. Maybe this is the problem..
Entities: Millions, attributes=1000s, values=millions (mostly floats)
Thanks