Solved

SSIS - Best way to accumulate data from several related tables before insertion into another database

Posted on 2010-11-08
8
578 Views
Last Modified: 2012-05-10
I'm a newbie to SSIS.  I have to move data from a vendor supplied database to one that we will use for reporting.  The vendor database does not store all information at the transaction level that we will need to query.  For example: We have customers assigned to certain sales reps as their default sales rep.  The transaction data contains the actual sales rep that handled the transaction, but not the default sales rep for that customer.  Our sales managers need to run reports that show sales grouped by default sales rep as well as the person who handled the transaction.  They need these reports so that they can reassign the default sales rep periodically to balance work loads.

So, my problem is that I need to offload all transactional data from the vendor's database to a reporting database and capture various other pieces of information that would be a snapshot of the related tables' data at the time of the transaction. There are several pieces of data about both the item being sold and the customer that are not stored by the vendor's database as part of the transactional record that may change over time.  I cannot alter the vendor's database.  I must capture the value of these data elements on the day of the transaction.

I'm working in SSIS, and this is my first attempt at creating an SSIS package.  We are running SQL Server 2008.  Any guidance from the experts is appreciated.

Thanks
0
Comment
Question by:Lynn Huff
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 4
  • 4
8 Comments
 
LVL 16

Expert Comment

by:vdr1620
ID: 34087025
Using staging tables might be a good option

1. Execute SQL task -- create Staging tables
2. Data Flow Task - Load data into Staging Tables
3. Data Flow Task - Using SQL Server Destination write a SQL to join data from Staging Tables and load into destination
4. you can later schedule package to run daily using a Job


check the links below on how to use Temp tables/staging Tables in SSIS

http://techyfreak.blogspot.com/2007/03/using-temp-tables-in-ssis-package.html
http://consultingblogs.emc.com/jamiethomson/archive/2006/11/19/SSIS_3A00_-Using-temporary-tables.aspx
0
 
LVL 3

Author Comment

by:Lynn Huff
ID: 34087197
I read through the articles you referenced, but the terminolgoy is throwing me a bit.  Let me try to express what I'm gathering and ask if this is right.

Step 1:  Create a staging table in the destination database with all of the appropriate columns to hold my accumulated data from the related tables in the source database.

Step 2:  Insert records into the staging table from the main source table.

Step 3:  Update the records in the staging table with all of their related table data from the source database.

Step 4:  Insert all of the records from the staging table into the destination table

Step 5:  Delete the records from the staging table

Is that an accurate description of the process?  I don't really see the advantage of using "temporary" tables as described in the articles, but I don't have any knowledge about temporary tables.  It seems to me that if the staging table is permanent but only has rows during the execution of the package, then this might be an acceptable solution.
0
 
LVL 16

Expert Comment

by:vdr1620
ID: 34087661
This is what i understood from your Actual Question..Correct me if i am wrong

you basically need to join data from multiple sources table on the fly before inserting data into destination tables..Is that right? If yes, you can do that using Temp tables.so that you can use SQL server to write your join statements and then insert them into destination

On the other hand if your source is a SQL server then you will not need staging tables as you can do the join directly in your Source

Please explain what you are trying to do in simple steps/words, i will try to help you do that in SSIS
0
Enterprise Mobility and BYOD For Dummies

Like “For Dummies” books, you can read this in whatever order you choose and learn about mobility and BYOD; and how to put a competitive mobile infrastructure in place. Developed for SMBs and large enterprises alike, you will find helpful use cases, planning, and implementation.

 
LVL 3

Author Comment

by:Lynn Huff
ID: 34087779
I tried to explain that in my question.  I'll try again.

There is a database which I cannot alter that contains data that I need to move to another database on a daily basis as new rows appear.  For the most part, the source tables in DB1 and the destination tables in DB2 will have the same columns and data types.  BUT, there are some pieces of data related to each transaction that are only stored in related tables in the source database, DB1.  These data elements in the source database may change over time.  I need to capture their values at the time of the transaction and store those values in another database for reporting.

Example1:  A transaction header table in the source database has the customer number in it.  I also need to capture the customer class from the customer table and store that customer class as part of the transaction record in the destination database.

Example 2:  A transaction detail table in the source database has the item number in it.  I also need to capture the item category from the item table and store that item category as part of the transaction record in the destination database.

Does that make it clearer?
0
 
LVL 3

Author Comment

by:Lynn Huff
ID: 34087980
After re-reading your last post again, you said "you can do the join directly in your Source"

Are you suggesting creating a view in the source database to use as the source for the rows i need to insert into the destination database?
0
 
LVL 16

Accepted Solution

by:
vdr1620 earned 500 total points
ID: 34088400
You don't necessarily need to create a view.. instead write the SQL directly in the source to get all the necessary data using joins.. then load the data into destination and I will believe you don't want to load all the data daily but just the new rows and changed data, which you can do by using Incremental Loads..

source--> lookup --> Conditional Split (if changed --update, if new - Insert) --> destination

hope you understand
0
 
LVL 3

Author Comment

by:Lynn Huff
ID: 34092733
I do understand.  I have one other question if you don't mind.  The lookup transformation will do what I need for identifying new rows.  However, I'm uncertain if the performance will become a problem since it actually compares every row in two tables.  In your experience, how big can a table get before the lookup transformation becomes a performance problem?
0
 
LVL 16

Expert Comment

by:vdr1620
ID: 34093395
Yes, performance will be a problem if its done in the right way... like have proper indexes on the search column and use the best mode suitable for you... Its a vast Topic, i would suggest you to read articles online to optimize performance while using lookups..below are few

http://technet.microsoft.com/en-us/library/cc966529.aspx
http://www.simple-talk.com/sql/ssis/sql-server-2005-ssis-tuning-the-dataflow-task/
http://www.networkworld.com/community/node/57997
http://social.msdn.microsoft.com/forums/en-US/sqlintegrationservices/thread/19fa2eed-119d-45d2-9a89-6abe618fff3b

There is also a 3rd party component.. I believe its called checksum transformation which calculates and stores checksum value and uses that for comparison..
0

Featured Post

What is SQL Server and how does it work?

The purpose of this paper is to provide you background on SQL Server. It’s your self-study guide for learning fundamentals. It includes both the history of SQL and its technical basics. Concepts and definitions will form the solid foundation of your future DBA expertise.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Why is this different from all of the other step by step guides?  Because I make a living as a DBA and not as a writer and I lived through this experience. Defining the name: When I talk to people they say different names on this subject stuff l…
In this article we will learn how to fix  “Cannot install SQL Server 2014 Service Pack 2: Unable to install windows installer msi file” error ?
Using examples as well as descriptions, and references to Books Online, show the documentation available for date manipulation functions and by using a select few of these functions, show how date based data can be manipulated with these functions.
Using examples as well as descriptions, and references to Books Online, show the documentation available for datatypes, explain the available data types and show how data can be passed into and out of variables.

756 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question