Solved

Load fact table SSIS

Posted on 2015-02-09
4
85 Views
Last Modified: 2016-02-15
Im trying to load a Fact table from a Stage table in SQL Server. This is an existing package that is using the Slowly Changing Dimension which is incredibly slow. The process starts off with a Query to an ODBC table and then does many Lookups and then finally a SCD to Insert or Update the fact table. Now, I dont see how any of these records will be updated but just in case I want to check. I am not going to use the SCD to check though, it takes hours to complete. I was thinking about using a Conditional Split? My only problem is that there are no Primary keys in the source table so I have to use many fields to see if its an actual duplicate? I do see there is Line numbers that seem to be unique to the document number so maybe I can go off of that. Can someone help me out with the Conditional Split to check for a duplicate and to go to one OleDB destination for Insert and another for Update. This way I can evaluate if I need this. FYI, the Fact table has millions of rows so I dont think I should be doing any kind of Truncate and then reload the table. I just need a way to check the Destination before I do anything. Can anyone help me with this real quick???   Thank you!!!
0
Comment
Question by:jknj72
  • 2
  • 2
4 Comments
 
LVL 12

Expert Comment

by:Habib Pourfard
ID: 40598987
Slowly Changing Dimension transformation should only be used for small dimensions (less than 10,000 rows). comparing all the columns is not optimal. a hash function is usually used to store information about all the needed attributes, so that the process can quickly check whether there has been a change.

To implement a hash function Write a SELECT statement and add a calculated column using the T-SQL HASHBYTES
function when reading the data.

Faster Slowly Changing Dimension with Hash Function

and consider using T-SQL MERGE statement.
0
 

Author Comment

by:jknj72
ID: 40599026
So are you saying I should be using the SCD with a hash function to do what I need to do or should I be using a T-SQL Merge statement for what I need to do? I need help either way ;)

Thanks
JK
0
 
LVL 12

Accepted Solution

by:
Habib Pourfard earned 500 total points
ID: 40599334
Using Hash function is what will improve the performance. Use hash function to find out which records should be updated and which records need to be inserted.

then you can use MERGE statement if you like to insert and update in one statement.
0
 

Author Closing Comment

by:jknj72
ID: 40757388
Thanks
0

Featured Post

What Security Threats Are You Missing?

Enhance your security with threat intelligence from the web. Get trending threat insights on hackers, exploits, and suspicious IP addresses delivered to your inbox with our free Cyber Daily.

Join & Write a Comment

Suggested Solutions

My client has a dictionary table. They're defining a list of standard naming convention. Now, they are requiring my team to provide us a mechanism how to match new incoming data with existing data in their system.
For both online and offline retail, the cross-channel business is the most recent pattern in the B2C trade space.
Viewers will learn how to use the SELECT statement in SQL to return specific rows and columns, with various degrees of sorting and limits in place.
Viewers will learn how to use the SELECT statement in SQL and will be exposed to the many uses the SELECT statement has.

747 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

13 Experts available now in Live!

Get 1:1 Help Now