Solved

Solving a Cartesian scenario in the Fact table

Posted on 2013-11-20
12
506 Views
Last Modified: 2016-02-11
Hi,

  We have a DB2 OLTP environment which, we are planning to remodel to an OLAP in MS SQL Server 2008.

 In this, we have a scenario.
 Our fact table has Reference_ID, Contact_ID and Assignment_ID.
 Now, the relation is such that : Reference_ID to Contact_ID is 1--> N
                                                   Reference_ID to Assignment_ID is 1--> M
 and there is no direct relation between Contact_ID and Assignment_ID. So, we are getting a Cartesian result of M * N in our fact table.

How to solve this scenario and still keep the structure as a Star?

Please help.
0
Comment
Question by:pvsbandi
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 6
  • 6
12 Comments
 
LVL 45

Expert Comment

by:Kent Olsen
ID: 39662738
Hi pvsbandi,

How are you generating the rows that will go in the fact table?  Build the fact table correctly and any Cartesian should work itself out in the dimension table(s).


Kent
0
 

Author Comment

by:pvsbandi
ID: 39662931
This query is building the Fact table only.
 When doing so, we see this Cartesian happening.
 
Example:

[b][u]Contact Dimension[/u][/b]

Ref ID          Contact_ID
123               AAA
123               BBB

[b][u]Assignment Dimension[/u][/b]

Ref ID        Assignment_ID
123               XXX
123               YYY
123               ZZZ

[b][u]Reference Dimension[/u][/b]

Ref ID        
123

[b][u]Fact Table[/u][/b]

Ref ID              Contact ID               Assignment ID
123                     AAA                         XXX
123                     BBB                         XXX
123                     AAA                         YYY
123                     BBB                         YYY
123                     AAA                         ZZZ
123                     BBB                         ZZZ                        

Open in new window

0
 
LVL 45

Expert Comment

by:Kent Olsen
ID: 39662957
Still looks like a bad sub-query when you're building the fact table.

It appears that you're not just copying the OLTP generated transactions into the OLAP, you're modifying them by joining other data and selecting the desired columns.  The Cartesian is happening in the join of the other data.
0
MS Dynamics Made Instantly Simpler

Make Your Microsoft Dynamics Investment Count  & Drastically Decrease Training Time by Providing Intuitive Step-By-Step WalkThru Tutorials.

 

Author Comment

by:pvsbandi
ID: 39662962
Hi Kent,

   Yes. Because, we wanted the Fact table to host all this information of this granularity and thus, maintain the Star schema.

Is there a way to break this Cartesian?
0
 
LVL 45

Expert Comment

by:Kent Olsen
ID: 39662980
I'm sure that there is.  But without seeing the SQL there's not much help that I can offer other than to isolate where the Cartesian is occurring.
0
 

Author Comment

by:pvsbandi
ID: 39662991
Here is the SQL populating the fact. Problem is with the Assignment_ID and Contact_ID, which are not linked directly..so produce N * M cartesian.

SELECT          R.REFERRAL_ID,
                ASN.ASSIGNMENT_ID,
                ASN.ASSIGN_TO_STAFF_ID,
                ASN.START_DT AS ASSIGNMENT_START_DT, 
                ASN.END_DT AS ASSIGNMENT_END_DT, 
                RC.CLIENT_ID,
                CT.CONTACT_ID
FROM TB_REFERRAL R
INNER JOIN TB_ASSIGNMENT  ASN ON (R.REFERRAL_ID = ASN.ENTITY_KEY_ID)
LEFT OUTER JOIN TB_REFERRAL_CLIENTS RC ON (R.REFERRAL_ID = RC.REFERRAL_ID)
LEFT OUTER JOIN TB_CLIENT_ROLES CR ON (RC.REFERRAL_CLIENT_ID = CR.REFERRAL_CLIENT_ID)
LEFT OUTER JOIN 
FROM TB_CONTACT CT ON (R.REFERRAL_ID = CT.ENTITY_KEY_ID)
WHERE 
                R.DELETE_SW = 'N' 
                AND RC.DELETE_SW = 'N' 
                AND CR.DELETE_SW = 'N' 
                AND CR.ROLE_CD = '2085'

Open in new window

0
 
LVL 45

Expert Comment

by:Kent Olsen
ID: 39663123
I doubt that the assignment_join by ID is causing the Cartesian.  Intuitively, each assignment will have a unique ID, so unless the ID column isn't really and ID, there should be a duplicated ID value in tb_assignment.

Contact, however, looks like the real culprit.  We're walking through several other tables to join tb_contact.

tb_referral -> tb_referral_clients
tb_referral_clients -> tb_client_roles
tb_client_roles -> tb_contact

My best guess from the information at hand is that the first join works fine (1 to 1) but that clients can have multiple roles, so a Cartesian is generated when joining tb_client_roles.

Does that sound right to you?
0
 

Author Comment

by:pvsbandi
ID: 39663413
We have restricted the role to '2085'..so, no issue there. So is with other tables.


 But becasue, we can have multiple assignments for each referral.
Also can have multiple contacts for each referral.
So, for each referral, we get a cartesian between Assignment and Contact.
0
 
LVL 45

Expert Comment

by:Kent Olsen
ID: 39663525
Then the Cartesian is essentially correct.

It looks like your definition has the Referral/Assignment as the basic unit within the fact table.  

Because you have multiple rows making up the row that you're trying to store in the fact table, you need to aggregate those multiple rows, and lose the detail that makes up the aggregation, or modify your definition of the fact table to include all the rows of the Cartesian product.

My choice would probably be to avoid the aggregation and store all of the rows.  Leave the aggregation up to the Dimension tables.


Kent
0
 

Author Comment

by:pvsbandi
ID: 39663612
Thanks.. that's what we are ending up doing.. keeping the fact as is..
and adding indicators in the dimensions to identify the latest and earliest.


But i still feel that the fact table shouldn't have cartesian.. we can may still be able to accommodate a M X M join, but a join-less result would not be desirable.
0
 
LVL 45

Accepted Solution

by:
Kent Olsen earned 500 total points
ID: 39663640
Keep in mind that the Cartesian should produce only valid rows.  

There are currently multiple assignments per referral, so the fact table should be based on assignments and referrals identified by a referral dimension table.  If there are other items with a 1 to many relationship, they should probably be in a separate fact table.  It perfectly OK to have dimension tables that are appropriate for multiple fact tables.
0
 

Author Closing Comment

by:pvsbandi
ID: 39663881
Thanks, Kent!
0

Featured Post

On Demand Webinar - Networking for the Cloud Era

This webinar discusses:
-Common barriers companies experience when moving to the cloud
-How SD-WAN changes the way we look at networks
-Best practices customers should employ moving forward with cloud migration
-What happens behind the scenes of SteelConnect’s one-click button

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
MS SQL Sever Import/export problem 7 52
SQL Query Task 11 44
Return Rows as per Quantity of Columns Value In SQL 6 29
SQL question - help with insert for missing info 5 21
My client sends a request to me that they want me to load data, which will be returned by Web Service APIs, and do some transformation before importing to database. In this article, I will provide an approach to load data with Web Service Task and X…
Ever needed a SQL 2008 Database replicated/mirrored/log shipped on another server but you can't take the downtime inflicted by initial snapshot or disconnect while T-logs are restored or mirror applied? You can use SQL Server Initialize from Backup…
Via a live example, show how to setup several different housekeeping processes for a SQL Server.
Viewers will learn how to use the INSERT statement to insert data into their tables. It will also introduce the NULL statement, to show them what happens when no value is giving for any given column.

730 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question