
SSIS - Best way to accumulate data from several related tables before insertion into another database

Posted on 2010-11-08
Last Modified: 2012-05-10
I'm a newbie to SSIS.  I have to move data from a vendor supplied database to one that we will use for reporting.  The vendor database does not store all information at the transaction level that we will need to query.  For example: We have customers assigned to certain sales reps as their default sales rep.  The transaction data contains the actual sales rep that handled the transaction, but not the default sales rep for that customer.  Our sales managers need to run reports that show sales grouped by default sales rep as well as the person who handled the transaction.  They need these reports so that they can reassign the default sales rep periodically to balance work loads.

So, my problem is that I need to offload all transactional data from the vendor's database to a reporting database and capture various other pieces of information that would be a snapshot of the related tables' data at the time of the transaction. There are several pieces of data about both the item being sold and the customer that are not stored by the vendor's database as part of the transactional record that may change over time.  I cannot alter the vendor's database.  I must capture the value of these data elements on the day of the transaction.

I'm working in SSIS, and this is my first attempt at creating an SSIS package.  We are running SQL Server 2008.  Any guidance from the experts is appreciated.

Question by:Lynn Huff
LVL 16

Expert Comment

ID: 34087025
Using staging tables might be a good option:

1. Execute SQL Task: create the staging tables.
2. Data Flow Task: load the data into the staging tables.
3. Data Flow Task: using a SQL Server Destination, write a SQL query that joins the data from the staging tables and loads it into the destination.
4. You can later schedule the package to run daily using a SQL Agent job.

Check the links below on how to use temp tables/staging tables in SSIS.
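The staging flow in the steps above can be sketched end to end. Below is a minimal Python/sqlite3 sketch standing in for SQL Server; the table and column names (`trans`, `customer`, `stg_trans`, `fact_sales`, etc.) are made up for illustration, not taken from the actual databases:

```python
import sqlite3

# One in-memory connection stands in for both databases; in SSIS these
# would be separate connection managers. All names are hypothetical.
db = sqlite3.connect(":memory:")

# Source tables (the vendor database)
db.executescript("""
    CREATE TABLE trans (trans_id INTEGER, cust_id INTEGER, amount REAL);
    CREATE TABLE customer (cust_id INTEGER, cust_class TEXT);
    INSERT INTO trans VALUES (1, 10, 99.50), (2, 11, 12.00);
    INSERT INTO customer VALUES (10, 'Retail'), (11, 'Wholesale');
""")

# 1. "Execute SQL Task": create empty staging tables plus the destination
db.executescript("""
    CREATE TABLE stg_trans AS SELECT * FROM trans WHERE 0;
    CREATE TABLE stg_customer AS SELECT * FROM customer WHERE 0;
    CREATE TABLE fact_sales (trans_id INTEGER, cust_id INTEGER,
                             amount REAL, cust_class TEXT);
""")

# 2. "Data Flow Task": copy source rows into the staging tables
db.execute("INSERT INTO stg_trans SELECT * FROM trans")
db.execute("INSERT INTO stg_customer SELECT * FROM customer")

# 3. Join the staging tables and load the combined rows into the destination
db.execute("""
    INSERT INTO fact_sales
    SELECT t.trans_id, t.cust_id, t.amount, c.cust_class
    FROM stg_trans t JOIN stg_customer c ON c.cust_id = t.cust_id
""")

rows = db.execute(
    "SELECT trans_id, cust_class FROM fact_sales ORDER BY trans_id"
).fetchall()
```

The join in step 3 is where the related-table snapshot gets baked into each destination row.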


Author Comment

by:Lynn Huff
ID: 34087197
I read through the articles you referenced, but the terminology is throwing me a bit.  Let me try to express what I'm gathering and ask whether this is right.

Step 1:  Create a staging table in the destination database with all of the appropriate columns to hold my accumulated data from the related tables in the source database.

Step 2:  Insert records into the staging table from the main source table.

Step 3:  Update the records in the staging table with all of their related table data from the source database.

Step 4:  Insert all of the records from the staging table into the destination table.

Step 5:  Delete the records from the staging table.

Is that an accurate description of the process?  I don't really see the advantage of using "temporary" tables as described in the articles, but I don't have any knowledge about temporary tables.  It seems to me that if the staging table is permanent but only has rows during the execution of the package, then this might be an acceptable solution.
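The five steps described above can be walked through concretely. This is a hedged Python/sqlite3 sketch (sqlite standing in for SQL Server, and the `trans`, `item`, `staging`, and `fact_sales` names being invented for the example) showing steps 2 through 5 against a permanent staging table that is empty between runs:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- hypothetical source tables
    CREATE TABLE trans (trans_id INTEGER, item_no INTEGER, qty INTEGER);
    CREATE TABLE item (item_no INTEGER, category TEXT);
    INSERT INTO trans VALUES (1, 100, 3), (2, 101, 5);
    INSERT INTO item VALUES (100, 'Hardware'), (101, 'Software');

    -- Step 1: permanent staging table with columns for the related data
    CREATE TABLE staging (trans_id INTEGER, item_no INTEGER,
                          qty INTEGER, category TEXT);
    CREATE TABLE fact_sales (trans_id INTEGER, item_no INTEGER,
                             qty INTEGER, category TEXT);
""")

# Step 2: insert records from the main source table
db.execute("INSERT INTO staging (trans_id, item_no, qty) "
           "SELECT trans_id, item_no, qty FROM trans")

# Step 3: update the staged rows with their related-table data
db.execute("""
    UPDATE staging
    SET category = (SELECT category FROM item
                    WHERE item.item_no = staging.item_no)
""")

# Step 4: move the completed rows into the destination table
db.execute("INSERT INTO fact_sales SELECT * FROM staging")

# Step 5: empty the staging table so it only holds rows during a run
db.execute("DELETE FROM staging")
```

In practice steps 2 and 3 can often be collapsed into a single insert-with-join, but the two-phase form mirrors the steps as listed.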

Expert Comment

ID: 34087661
This is what I understood from your actual question; correct me if I am wrong.

You basically need to join data from multiple source tables on the fly before inserting it into the destination tables. Is that right? If yes, you can do that using temp tables, so that you can use SQL Server to write your join statements and then insert the results into the destination.

On the other hand, if your source is a SQL Server, then you will not need staging tables, because you can do the join directly in your source.

Please explain what you are trying to do in simple steps/words, and I will try to help you do it in SSIS.

Author Comment

by:Lynn Huff
ID: 34087779
I tried to explain that in my question.  I'll try again.

There is a database which I cannot alter that contains data that I need to move to another database on a daily basis as new rows appear.  For the most part, the source tables in DB1 and the destination tables in DB2 will have the same columns and data types.  BUT, there are some pieces of data related to each transaction that are only stored in related tables in the source database, DB1.  These data elements in the source database may change over time.  I need to capture their values at the time of the transaction and store those values in another database for reporting.

Example 1:  A transaction header table in the source database has the customer number in it.  I also need to capture the customer class from the customer table and store that customer class as part of the transaction record in the destination database.

Example 2:  A transaction detail table in the source database has the item number in it.  I also need to capture the item category from the item table and store that item category as part of the transaction record in the destination database.

Does that make it clearer?
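The key point in the examples above is that the related value is captured *as of the day of the transaction*, even if it changes later. A small Python/sqlite3 sketch can illustrate this (all table and column names here are hypothetical stand-ins for the real schemas):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE header (trans_id INTEGER, cust_no INTEGER);
    CREATE TABLE customer (cust_no INTEGER, cust_class TEXT);
    CREATE TABLE fact (trans_id INTEGER, cust_no INTEGER, cust_class TEXT);
    INSERT INTO customer VALUES (7, 'Retail');
""")

def load_new(db):
    # Copy any header rows not yet in fact, joining in the customer
    # class as it stands today -- this freezes the snapshot value.
    db.execute("""
        INSERT INTO fact
        SELECT h.trans_id, h.cust_no, c.cust_class
        FROM header h JOIN customer c ON c.cust_no = h.cust_no
        WHERE h.trans_id NOT IN (SELECT trans_id FROM fact)
    """)

db.execute("INSERT INTO header VALUES (1, 7)")
load_new(db)   # transaction 1 is captured while the class is 'Retail'

# The vendor data changes later; the old fact row must not change with it.
db.execute("UPDATE customer SET cust_class = 'Wholesale' WHERE cust_no = 7")
db.execute("INSERT INTO header VALUES (2, 7)")
load_new(db)   # transaction 2 is captured with the new class
```

After both runs, transaction 1 still carries the class that was current when it loaded, while transaction 2 carries the updated one.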

Author Comment

by:Lynn Huff
ID: 34087980
After re-reading your last post, I noticed you said "you can do the join directly in your Source."

Are you suggesting creating a view in the source database to use as the source for the rows I need to insert into the destination database?

Accepted Solution

vdr1620 earned 500 total points
ID: 34088400
You don't necessarily need to create a view; instead, write the SQL directly in the source to get all the necessary data using joins, then load the data into the destination. I believe you don't want to load all the data daily, just the new rows and changed data, which you can do by using incremental loads:

source--> lookup --> Conditional Split (if changed --update, if new - Insert) --> destination

Hope that helps.
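The source → lookup → conditional split → destination pattern above can be approximated in plain code. This Python/sqlite3 sketch (with an invented `id`/`amount` schema) mimics what the Lookup and Conditional Split components do: a lookup miss becomes an insert, a lookup hit with a changed value becomes an update:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE source (id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE dest   (id INTEGER PRIMARY KEY, amount REAL);
    INSERT INTO source VALUES (1, 10.0), (2, 25.0), (3, 40.0);
    INSERT INTO dest   VALUES (1, 10.0), (2, 20.0);  -- 2 changed, 3 is new
""")

for sid, amount in db.execute("SELECT id, amount FROM source").fetchall():
    # "Lookup": find the matching destination row by key
    match = db.execute("SELECT amount FROM dest WHERE id = ?",
                       (sid,)).fetchone()
    # "Conditional Split": route each row to insert, update, or ignore
    if match is None:                       # lookup miss -> new row: insert
        db.execute("INSERT INTO dest VALUES (?, ?)", (sid, amount))
    elif match[0] != amount:                # hit but value changed: update
        db.execute("UPDATE dest SET amount = ? WHERE id = ?", (amount, sid))

dest_rows = db.execute("SELECT id, amount FROM dest ORDER BY id").fetchall()
```

In SSIS the update branch usually goes through an OLE DB Command or a staged update set, but the routing logic is the same.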

Author Comment

by:Lynn Huff
ID: 34092733
I do understand.  I have one other question, if you don't mind.  The Lookup transformation will do what I need for identifying new rows.  However, I'm uncertain whether performance will become a problem, since it effectively compares every row in two tables.  In your experience, how big can a table get before the Lookup transformation becomes a performance problem?

Expert Comment

ID: 34093395
Yes, performance will be a problem if it's not done in the right way. Have proper indexes on the search columns and use the cache mode best suited to your data. It's a vast topic; I would suggest you read articles online on optimizing performance when using lookups. Below are a few.


There is also a 3rd-party component, I believe called the Checksum transformation, which calculates and stores a checksum value per row and uses that for comparison.
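The checksum idea mentioned above reduces a changed-row check from comparing every column to comparing one stored value per row. A hedged Python/sqlite3 sketch of the technique (not the actual 3rd-party component; the schema and the MD5 choice are illustrative assumptions):

```python
import hashlib
import sqlite3

def row_checksum(*cols):
    # Hash all comparable columns into a single value; any change to any
    # column changes the checksum. MD5 is fine for change detection.
    return hashlib.md5("|".join(map(str, cols)).encode()).hexdigest()

db = sqlite3.connect(":memory:")
db.create_function("checksum", -1, row_checksum)  # -1 = any arg count
db.executescript("""
    CREATE TABLE source (id INTEGER PRIMARY KEY, cust TEXT, amount REAL);
    CREATE TABLE dest (id INTEGER PRIMARY KEY, cust TEXT,
                       amount REAL, cks TEXT);
    INSERT INTO source VALUES (1, 'A', 5.0), (2, 'B', 9.0);
""")

# Initial load: store the checksum alongside each destination row
db.execute("""
    INSERT INTO dest
    SELECT id, cust, amount, checksum(cust, amount) FROM source
""")

# A source value changes between loads
db.execute("UPDATE source SET amount = 7.5 WHERE id = 2")

# Changed rows = checksum mismatch: one comparison per row, not per column
changed = db.execute("""
    SELECT s.id FROM source s JOIN dest d ON d.id = s.id
    WHERE checksum(s.cust, s.amount) <> d.cks
""").fetchall()
```

Only row 2 shows up as changed, without comparing `cust` and `amount` individually in the detection query.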

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

In this article I will describe the Copy Database Wizard method as one possible migration process and I will add the extra tasks needed for an upgrade when and where is applied so it will cover all.
Use this article to create a batch file to backup a Microsoft SQL Server database to a Windows folder.  The folder can be on the local hard drive or on a network share.  This batch file will query the SQL server to get the current date & time and wi…
This video shows, step by step, how to configure Oracle Heterogeneous Services via the Generic Gateway Agent in order to make a connection from an Oracle session and access a remote SQL Server database table.
Viewers will learn how to use the INSERT statement to insert data into their tables. It will also introduce the NULL statement, to show them what happens when no value is giving for any given column.

790 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question