Dimensional Data Model Help

I am trying to break a large data set that contains purchase information into a dimensional model. I have various attributes related to the Item purchased.  I do have an Item Number but it is not unique across different manufacturers. The Item is unique by MFR_ID and Item_ID.   I have a Manufacturer dimension that contains a MFR_ID and MFR_NAME.    

In my Item dimension, would it be best to include both these fields as separate attributes in this dimension or should I concatenate them to create one unique item identifier?
d1cjm1exAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Tim HumphriesDirectorCommented:
You should create a field that forms a unique identifier for each member of the Item dimension. It sounds as though this should comprise both the manufacturer id and item id, although you could use an existing unique ID field from your source database if it has one.
Manfacturer could form a level in one of your hierarchies within the Item dimension, as well as product type / classification etc. - this could remove the need the need to have separate dimensions for Maufacturer and product, unless you have a specific need for them.
grzegorzsCommented:
As an alternative you can select MFR_ID and Item_ID as composite key containing two columns (you can do this in the data source view definition).
IMHO if you want to have two separate dimensions (Manufacturer and Product) you shouldn't use MFR_ID as an attribute in the Product dimension - users will have two MFR_ID attributes - one in the Manufacturer dimension and one in the Product diemnsion. It could be confusing.
Of course if you plan to have a hierarchy Manufacturer --> Product you can create one dimension containing manufacturer attributes and product attributes. Then you have to set proper attribute relationships.

HTH
Best regards
Grzegorz

nmcdermaidCommented:
If you want to do this 'properly' (as per all the literature) then you use a surrogate key. A surrogate key has no bearing on any real world attributes - it's there purely to enable joins between the fact and the dimension, enforce uniqueness, enable SCD's etc. its generally an incrementing integer (but don't rely on in being incremental in any logic)
MFR_ID and ITEM_ID are merely attributes of your dimension. I suggest you usea consistent prefix to indicate that a given  field is some kind of a key in a source table, i.e. SRC_
This would give you a table somehting like this:
SK_Item      - surrogate key which joins back to the fact
SRC_MFR_ID   - MFR_ID from the source system
SRC_ITEM_ID  -  ITEM_ID from the source system
SRC_SYSTEM - Some kind of descriptor for the actual source system
.... other attributes like name etc.
Can a manufacturer be included on a transaction regardless of the item purchased? If so then you need a seperate manufacturer dimension. If not then you don't need a seperate manufacturer dimension and you can include it on the item dimension. But it means if you are doing reports on manufacturers it is tied to products purchased.
If you add the manufacturer to the items dimension then by all means add manufacturer name to it.
SK_Item      - surrogate key which joins back to the fact
SRC_MFR_ID   - MFR_ID from the source system
SRC_ITEM_ID  -  ITEM_ID from the source system
SRC_SYSTEM - Some kind of descriptor for the actual source system
SRC_MFR_ID
Itemname
ItemGroup1
ItemGroup2
ItemGroup3
ItemGroup4
ManufacturerName
ManufacturerAddress
.... other attributes (including manufacturer and item grouping names)

Dimensional modelling is a lot about read performance. So you prejoin the manufacturer into the dimension but then you create an index on the manufacturer name so that if someone wants a list of manufacturers (SELECT DISTINCT ManufacturerNames FROM dimItem), it's nice and fast.
 

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
Tim HumphriesDirectorCommented:
nmcdermaid: I agree entirely.
nmcdermaidCommented:
Reassuring to know :) Of course the final solution is dictated by the problem and the constraints, of which we don't know all, but its good to start with a 'theoretical' approach and work backwards as constraints are encountered.
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Project Management

From novice to tech pro — start learning today.