Solved

SSIS Best Practices

Posted on 2016-09-12
3
91 Views
Last Modified: 2016-09-13
Hi guys,

I am required to develop a data warehouse. As this would be my first project so i would need a bit of help from you guys.
Right! I have been reading about the good data warehousing practices. I have got few questions

i will be loading the data from three data sources so according to what i have read am i right in assuming that i will be pulling everything in my staging tables first?

Where would i handle the slowly changing dimensions in both staging and the data warehouse dimension tables?
How would i handle the incremental load? is it again same enabling CDC on OLTP and then in data warehouse?  

Kindest regards
0
Comment
Question by:shah36
3 Comments
 
LVL 65

Accepted Solution

by:
Jim Horn earned 250 total points
ID: 41794370
Hi shah

I really recommend breaking this up into multiple EE questions, as this one starts with 'I've never done this before' (not a criticism, we all started somewhere) and then goes to four pretty big design questions.

For some extra perspective check out Top 10 Ways to Ask Better Questions.

Thanks in advance.
Jim
0
 
LVL 43

Assisted Solution

by:zephyr_hex (Megan)
zephyr_hex (Megan) earned 250 total points
ID: 41794423
Jim is right -- if you want in-dept answers to your questions, it would be best to break them up into multiple EE questions.

To get you going in the right direction:

Staging Tables

Staging tables are optional.  It depends on your data and what you need to do to it before it's ready to be loaded in the final table.  I occasionally need to use a staging table in order to ease the process of getting the data into the right format.  But most often, SSIS tools are sufficient and I don't need to use a staging table.

One example of when I use a staging table is when I load data from a manufacturing monitoring system database.  That system records sensors every second, so there is a lot of data.  I don't need data down to the second in the data warehouse.  Moreover, I don't want to bring that huge data load into SSIS for processing.  So, I created a staging table.  I use a stored procedure to give me an average value of the sensor per minute, and then I point SSIS at that staging table, and do my processing on it before it's loaded to the table in the data warehouse.

Slowly Changing Dimensions
These should only be used on your final table in the data warehouse.  SCD's keep track of your data over time, so you have a record of when values changed.
SSIS has a built in SCD that is slow and difficult to maintain.  I've switched over to an add-in (that's faster and easier to maintain) : https://dimensionmergescd.codeplex.com/

There is a video for that tool (linked on the page above) that shows you how to use it.  The first several times I used the tool, I had to watch the video and go step by step through it.  Now I understand how it works and can get a SCD up and running in just a few minutes.

Incremental Load
In all honesty, I use the SCD tool to handle the incremental load (using the New Records detection), OR, I use a datetimestamp to track the last load.  However, neither of these procedures is the formal way to handle it.  So I can't give you any specific assistance in that regard.
0
 

Author Closing Comment

by:shah36
ID: 41796036
Thanks guys i will post separate EE questions.

regards
0

Featured Post

Free learning courses: Active Directory Deep Dive

Get a firm grasp on your IT environment when you learn Active Directory best practices with Veeam! Watch all, or choose any amount, of this three-part webinar series to improve your skills. From the basics to virtualization and backup, we got you covered.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Why is this different from all of the other step by step guides?  Because I make a living as a DBA and not as a writer and I lived through this experience. Defining the name: When I talk to people they say different names on this subject stuff l…
My client sends a request to me that they want me to load data, which will be returned by Web Service APIs, and do some transformation before importing to database. In this article, I will provide an approach to load data with Web Service Task and X…
Viewers will learn how the fundamental information of how to create a table.
Viewers will learn how to use the SELECT statement in SQL to return specific rows and columns, with various degrees of sorting and limits in place.

821 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question