Solved

SSIS - Best way to accumulate data from several related tables before insertion into another database

Posted on 2010-11-08
8
567 Views
Last Modified: 2012-05-10
I'm a newbie to SSIS.  I have to move data from a vendor supplied database to one that we will use for reporting.  The vendor database does not store all information at the transaction level that we will need to query.  For example: We have customers assigned to certain sales reps as their default sales rep.  The transaction data contains the actual sales rep that handled the transaction, but not the default sales rep for that customer.  Our sales managers need to run reports that show sales grouped by default sales rep as well as the person who handled the transaction.  They need these reports so that they can reassign the default sales rep periodically to balance work loads.

So, my problem is that I need to offload all transactional data from the vendor's database to a reporting database and capture various other pieces of information that would be a snapshot of the related tables' data at the time of the transaction. There are several pieces of data about both the item being sold and the customer that are not stored by the vendor's database as part of the transactional record that may change over time.  I cannot alter the vendor's database.  I must capture the value of these data elements on the day of the transaction.

I'm working in SSIS, and this is my first attempt at creating an SSIS package.  We are running SQL Server 2008.  Any guidance from the experts is appreciated.

Thanks
0
Comment
Question by:Lynn Huff
  • 4
  • 4
8 Comments
 
LVL 16

Expert Comment

by:vdr1620
Comment Utility
Using staging tables might be a good option

1. Execute SQL task -- create Staging tables
2. Data Flow Task - Load data into Staging Tables
3. Data Flow Task - Using SQL Server Destination write a SQL to join data from Staging Tables and load into destination
4. you can later schedule package to run daily using a Job


check the links below on how to use Temp tables/staging Tables in SSIS

http://techyfreak.blogspot.com/2007/03/using-temp-tables-in-ssis-package.html
http://consultingblogs.emc.com/jamiethomson/archive/2006/11/19/SSIS_3A00_-Using-temporary-tables.aspx
0
 
LVL 3

Author Comment

by:Lynn Huff
Comment Utility
I read through the articles you referenced, but the terminolgoy is throwing me a bit.  Let me try to express what I'm gathering and ask if this is right.

Step 1:  Create a staging table in the destination database with all of the appropriate columns to hold my accumulated data from the related tables in the source database.

Step 2:  Insert records into the staging table from the main source table.

Step 3:  Update the records in the staging table with all of their related table data from the source database.

Step 4:  Insert all of the records from the staging table into the destination table

Step 5:  Delete the records from the staging table

Is that an accurate description of the process?  I don't really see the advantage of using "temporary" tables as described in the articles, but I don't have any knowledge about temporary tables.  It seems to me that if the staging table is permanent but only has rows during the execution of the package, then this might be an acceptable solution.
0
 
LVL 16

Expert Comment

by:vdr1620
Comment Utility
This is what i understood from your Actual Question..Correct me if i am wrong

you basically need to join data from multiple sources table on the fly before inserting data into destination tables..Is that right? If yes, you can do that using Temp tables.so that you can use SQL server to write your join statements and then insert them into destination

On the other hand if your source is a SQL server then you will not need staging tables as you can do the join directly in your Source

Please explain what you are trying to do in simple steps/words, i will try to help you do that in SSIS
0
 
LVL 3

Author Comment

by:Lynn Huff
Comment Utility
I tried to explain that in my question.  I'll try again.

There is a database which I cannot alter that contains data that I need to move to another database on a daily basis as new rows appear.  For the most part, the source tables in DB1 and the destination tables in DB2 will have the same columns and data types.  BUT, there are some pieces of data related to each transaction that are only stored in related tables in the source database, DB1.  These data elements in the source database may change over time.  I need to capture their values at the time of the transaction and store those values in another database for reporting.

Example1:  A transaction header table in the source database has the customer number in it.  I also need to capture the customer class from the customer table and store that customer class as part of the transaction record in the destination database.

Example 2:  A transaction detail table in the source database has the item number in it.  I also need to capture the item category from the item table and store that item category as part of the transaction record in the destination database.

Does that make it clearer?
0
Free Trending Threat Insights Every Day

Enhance your security with threat intelligence from the web. Get trending threat insights on hackers, exploits, and suspicious IP addresses delivered to your inbox with our free Cyber Daily.

 
LVL 3

Author Comment

by:Lynn Huff
Comment Utility
After re-reading your last post again, you said "you can do the join directly in your Source"

Are you suggesting creating a view in the source database to use as the source for the rows i need to insert into the destination database?
0
 
LVL 16

Accepted Solution

by:
vdr1620 earned 500 total points
Comment Utility
You don't necessarily need to create a view.. instead write the SQL directly in the source to get all the necessary data using joins.. then load the data into destination and I will believe you don't want to load all the data daily but just the new rows and changed data, which you can do by using Incremental Loads..

source--> lookup --> Conditional Split (if changed --update, if new - Insert) --> destination

hope you understand
0
 
LVL 3

Author Comment

by:Lynn Huff
Comment Utility
I do understand.  I have one other question if you don't mind.  The lookup transformation will do what I need for identifying new rows.  However, I'm uncertain if the performance will become a problem since it actually compares every row in two tables.  In your experience, how big can a table get before the lookup transformation becomes a performance problem?
0
 
LVL 16

Expert Comment

by:vdr1620
Comment Utility
Yes, performance will be a problem if its done in the right way... like have proper indexes on the search column and use the best mode suitable for you... Its a vast Topic, i would suggest you to read articles online to optimize performance while using lookups..below are few

http://technet.microsoft.com/en-us/library/cc966529.aspx
http://www.simple-talk.com/sql/ssis/sql-server-2005-ssis-tuning-the-dataflow-task/
http://www.networkworld.com/community/node/57997
http://social.msdn.microsoft.com/forums/en-US/sqlintegrationservices/thread/19fa2eed-119d-45d2-9a89-6abe618fff3b

There is also a 3rd party component.. I believe its called checksum transformation which calculates and stores checksum value and uses that for comparison..
0

Featured Post

Better Security Awareness With Threat Intelligence

See how one of the leading financial services organizations uses Recorded Future as part of a holistic threat intelligence program to promote security awareness and proactively and efficiently identify threats.

Join & Write a Comment

Suggested Solutions

In this article I will describe the Detach & Attach method as one possible migration process and I will add the extra tasks needed for an upgrade when and where is applied so it will cover all.
Ever wondered why sometimes your SQL Server is slow or unresponsive with connections spiking up but by the time you go in, all is well? The following article will show you how to install and configure a SQL job that will send you email alerts includ…
Familiarize people with the process of retrieving data from SQL Server using an Access pass-thru query. Microsoft Access is a very powerful client/server development tool. One of the ways that you can retrieve data from a SQL Server is by using a pa…
Viewers will learn how to use the SELECT statement in SQL and will be exposed to the many uses the SELECT statement has.

762 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

6 Experts available now in Live!

Get 1:1 Help Now