SSIS Design Guidelines and Suggestions

So I am designated at putting together an SSIS package to convert data between an Oracle View to a Sql Server Table.

I was looking for some design tips to put this thing together really well.  I will explain the high level of the tasks.  


So first this is an Oracle view ...I access it on my machine through an ODBC connection.  I am creating an OLD DB Source in the SSIS package using the Oracle 11g Drivers.  

I need to create a table in SQL Server so that nightly a job can run that basically takes the data from the Oracle view....and dumps it into a sql table....Each night I need this job to run and basically update the information.  

Once this table is in SQL ..stored procs will run against that table which will update tables in my SQL database.

However I would think I would want to do some type of comparison between the Oracle View and the table its being dumped into ...like a merge....so that only certain data that has changed would change my master import table.


Any suggestions on the best way to set this up.   I am thinking I will Need an OLe DB Source for the Oracle View...and SQL Server Destination ...but I would think I need something else in between so its not a complete dump.....

How would I manage this?


Thanks
LVL 11
Robb HillSenior .Net DeveloperAsked:
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

x
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Jim HornMicrosoft SQL Server Data DudeCommented:
Couple of questions..

>that basically takes the data from the Oracle view....and dumps it into a sql table
How many rows are we talking here?  Does this data need to be validated / scrubbed before writing to SQL tables?

For starters, create the table in SQL with a prefix like 'ssis_', 'staging_', or whatever floats your boat, then the SSIS will do a SQL Task to DELETE all from that table, then a data flow to INSERT.

>I would think I would want to do some type of comparison between the Oracle View and the table its being dumped into
Explain in greater detail what this means.  Do you only need a subset of the Oracle view, i.e. the data that changed?   If yes, a better idea would be to create a new Oracle view to pick off only the changed data, that way we're not pumping data that is ultimately not used.
Robb HillSenior .Net DeveloperAuthor Commented:
So first time ever the data is dumped ...it will just be a dump of all the data from the view....so that its in a SQL Server format.

I would think rather than each day ...truncating the sql table and redumping...that it would make sense to just check for changes in records or new or removed records.....if this was sql only you would do this with merge....not sure how to do this in SSIS.   Ultimately I can truncate and insert each day...that just seemed like that would be less effecient.
Jim HornMicrosoft SQL Server Data DudeCommented:
>that it would make sense to just check for changes in records or new or removed records
Okay.  For starters, define how you would know if a row changed.  Are there auditing columns such as created_dt, last_updated_dt?  If yes that would be easy, just filter the data pump based on a WHERE {that column} > {the last time the SSIS package ran}.

Either way, before we write a book here, tell us how..
Determine the Perfect Price for Your IT Services

Do you wonder if your IT business is truly profitable or if you should raise your prices? Learn how to calculate your overhead burden with our free interactive tool and use it to determine the right price for your IT services. Download your free eBook now!

Vitor MontalvãoMSSQL Senior EngineerCommented:
Do you have any date column in the Oracle side? So you can work with a subset of data instead of full View,
Something like:
SELECT *
FROM YourViewName
WHERE DateField > SYSDATE - 1

Open in new window

Robb HillSenior .Net DeveloperAuthor Commented:
Ok first...I only have the data in the Oracle datasource.  I keep creating a sql server destination and the data seems mapped..but nothing is moving to the other source.  

If we can work that issue first...just on how I get the data moved to sql from Oracle..that would help alot.

Then I will create a phony data file so you can see the data that I am trying to manipulate each day.


Thanks in advance.
Vitor MontalvãoMSSQL Senior EngineerCommented:
The easiest way for you to do this is using SQL Server Management Studio, connect to your destination database and right-click on the database name, chose Tasks / Import Wizard and follow the wizard and in the end save it as a SSIS package so you can use it in SSIS.

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
Robb HillSenior .Net DeveloperAuthor Commented:
Ok Vitor..let me try that...I did not think that could be done when doing Oracle to Sql Server
Vitor MontalvãoMSSQL Senior EngineerCommented:
If the Oracle client it's installed on the SQL Server machine you can do it.
Robb HillSenior .Net DeveloperAuthor Commented:
Ok I have the table created.  I am putting together a mock table to discuss how to transform this on the nightly job.

I will post back in the next hour.
Robb HillSenior .Net DeveloperAuthor Commented:
Ok so lets imagine the following table.

Lets say its a 6 column table.

The first column is the Primary Key ...lets call it Account Number.   So if there was a new record I would insert...if it was missing ...then rather than delete I would want to update a column in the table ...we can call it expire as no record will ever be deleted.

Lets say we also have a column in the table called department.  If this column changes then I would need to flag another column as true for department change. and update the olddepartment with the current value and update the department column with the new value.

Then if any column has changed the record would need to be updated but this updating cannot mess up the process with the potential for the department  value to be changed.


Any ideas how to do this
slightwv (䄆 Netminder) Commented:
You 'might' be able to write your own using a lot of code and triggers.

My suggestion is buy a product that does it for you.

Oracle GoldenGate for example:
http://www.oracle.com/technetwork/articles/datawarehouse/oracle-sqlserver-goldengate-460262.html

There are likely other products out there as well.
Vitor MontalvãoMSSQL Senior EngineerCommented:
Wouldn't be better to create a Transactional Replication between those databases?
Here's the MSDN article to show how to configure it.
Robb HillSenior .Net DeveloperAuthor Commented:
wouldnt this be better served to run inside my ssis package...some type of transfomation..or stored procs that run as the data is being loaded?  Even if I had two sql tables...one master import table and a temp table that essentially were sql server tables ....
Vitor MontalvãoMSSQL Senior EngineerCommented:
If there's already this functionality why should you bother in creating one?
Robb HillSenior .Net DeveloperAuthor Commented:
I cannot do the oracle publication
Robb HillSenior .Net DeveloperAuthor Commented:
I only have access to an Oracle view......I have to pull this data nightly into sql server so it can be maintained for another application that runs on sql server backend.
Vitor MontalvãoMSSQL Senior EngineerCommented:
Ok. Then I understand why you can't create a replication but the process you want to build would be somehow slow because it needs to check row by row and column by column.
slightwv (䄆 Netminder) Commented:
I'm an Oracle person and really don't even know how to spell SQL Server but I think the theory is still the same.

Pros: should be faster to locate 'differences'
Cons: Twice the space for the staging table

Given your restrictions how about this:
Have two staging tables.
LAST_PULL
CURRENT_PULL

-Create CURRENT_PULL from the Oracle view using CTAS.
-Do a MINUS query between LAST_PULL and CURRENT_PULL (then CURRENT_PULL minus LAST_PULL to get deleted values) to get the 'differences'.
-Process only the differences.
-Drop LAST_PULL
-Rename CURRENT_PULL to LAST_PULL
Robb HillSenior .Net DeveloperAuthor Commented:
This worked the easiest for me:)
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft SQL Server 2008

From novice to tech pro — start learning today.