SolvedPrivate

SSIS Design Guidelines and Suggestions

Posted on 2014-10-22
19
23 Views
Last Modified: 2016-02-11
So I am designated at putting together an SSIS package to convert data between an Oracle View to a Sql Server Table.

I was looking for some design tips to put this thing together really well.  I will explain the high level of the tasks.  


So first this is an Oracle view ...I access it on my machine through an ODBC connection.  I am creating an OLD DB Source in the SSIS package using the Oracle 11g Drivers.  

I need to create a table in SQL Server so that nightly a job can run that basically takes the data from the Oracle view....and dumps it into a sql table....Each night I need this job to run and basically update the information.  

Once this table is in SQL ..stored procs will run against that table which will update tables in my SQL database.

However I would think I would want to do some type of comparison between the Oracle View and the table its being dumped into ...like a merge....so that only certain data that has changed would change my master import table.


Any suggestions on the best way to set this up.   I am thinking I will Need an OLe DB Source for the Oracle View...and SQL Server Destination ...but I would think I need something else in between so its not a complete dump.....

How would I manage this?


Thanks
0
Comment
Question by:Robb Hill
  • 9
  • 6
  • 2
  • +1
19 Comments
 
LVL 65

Expert Comment

by:Jim Horn
ID: 40396925
Couple of questions..

>that basically takes the data from the Oracle view....and dumps it into a sql table
How many rows are we talking here?  Does this data need to be validated / scrubbed before writing to SQL tables?

For starters, create the table in SQL with a prefix like 'ssis_', 'staging_', or whatever floats your boat, then the SSIS will do a SQL Task to DELETE all from that table, then a data flow to INSERT.

>I would think I would want to do some type of comparison between the Oracle View and the table its being dumped into
Explain in greater detail what this means.  Do you only need a subset of the Oracle view, i.e. the data that changed?   If yes, a better idea would be to create a new Oracle view to pick off only the changed data, that way we're not pumping data that is ultimately not used.
0
 
LVL 11

Author Comment

by:Robb Hill
ID: 40397248
So first time ever the data is dumped ...it will just be a dump of all the data from the view....so that its in a SQL Server format.

I would think rather than each day ...truncating the sql table and redumping...that it would make sense to just check for changes in records or new or removed records.....if this was sql only you would do this with merge....not sure how to do this in SSIS.   Ultimately I can truncate and insert each day...that just seemed like that would be less effecient.
0
 
LVL 65

Expert Comment

by:Jim Horn
ID: 40399404
>that it would make sense to just check for changes in records or new or removed records
Okay.  For starters, define how you would know if a row changed.  Are there auditing columns such as created_dt, last_updated_dt?  If yes that would be easy, just filter the data pump based on a WHERE {that column} > {the last time the SSIS package ran}.

Either way, before we write a book here, tell us how..
0
Optimizing Cloud Backup for Low Bandwidth

With cloud storage prices going down a growing number of SMBs start to use it for backup storage. Unfortunately, business data volume rarely fits the average Internet speed. This article provides an overview of main Internet speed challenges and reveals backup best practices.

 
LVL 47

Expert Comment

by:Vitor Montalvão
ID: 40399411
Do you have any date column in the Oracle side? So you can work with a subset of data instead of full View,
Something like:
SELECT *
FROM YourViewName
WHERE DateField > SYSDATE - 1

Open in new window

0
 
LVL 11

Author Comment

by:Robb Hill
ID: 40399412
Ok first...I only have the data in the Oracle datasource.  I keep creating a sql server destination and the data seems mapped..but nothing is moving to the other source.  

If we can work that issue first...just on how I get the data moved to sql from Oracle..that would help alot.

Then I will create a phony data file so you can see the data that I am trying to manipulate each day.


Thanks in advance.
0
 
LVL 47

Accepted Solution

by:
Vitor Montalvão earned 500 total points
ID: 40399422
The easiest way for you to do this is using SQL Server Management Studio, connect to your destination database and right-click on the database name, chose Tasks / Import Wizard and follow the wizard and in the end save it as a SSIS package so you can use it in SSIS.
0
 
LVL 11

Author Comment

by:Robb Hill
ID: 40399426
Ok Vitor..let me try that...I did not think that could be done when doing Oracle to Sql Server
0
 
LVL 47

Expert Comment

by:Vitor Montalvão
ID: 40399429
If the Oracle client it's installed on the SQL Server machine you can do it.
0
 
LVL 11

Author Comment

by:Robb Hill
ID: 40399718
Ok I have the table created.  I am putting together a mock table to discuss how to transform this on the nightly job.

I will post back in the next hour.
0
 
LVL 11

Author Comment

by:Robb Hill
ID: 40401935
Ok so lets imagine the following table.

Lets say its a 6 column table.

The first column is the Primary Key ...lets call it Account Number.   So if there was a new record I would insert...if it was missing ...then rather than delete I would want to update a column in the table ...we can call it expire as no record will ever be deleted.

Lets say we also have a column in the table called department.  If this column changes then I would need to flag another column as true for department change. and update the olddepartment with the current value and update the department column with the new value.

Then if any column has changed the record would need to be updated but this updating cannot mess up the process with the potential for the department  value to be changed.


Any ideas how to do this
0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 40401959
You 'might' be able to write your own using a lot of code and triggers.

My suggestion is buy a product that does it for you.

Oracle GoldenGate for example:
http://www.oracle.com/technetwork/articles/datawarehouse/oracle-sqlserver-goldengate-460262.html

There are likely other products out there as well.
0
 
LVL 47

Expert Comment

by:Vitor Montalvão
ID: 40401982
Wouldn't be better to create a Transactional Replication between those databases?
Here's the MSDN article to show how to configure it.
0
 
LVL 11

Author Comment

by:Robb Hill
ID: 40402017
wouldnt this be better served to run inside my ssis package...some type of transfomation..or stored procs that run as the data is being loaded?  Even if I had two sql tables...one master import table and a temp table that essentially were sql server tables ....
0
 
LVL 47

Expert Comment

by:Vitor Montalvão
ID: 40402031
If there's already this functionality why should you bother in creating one?
0
 
LVL 11

Author Comment

by:Robb Hill
ID: 40402070
I cannot do the oracle publication
0
 
LVL 11

Author Comment

by:Robb Hill
ID: 40402071
I only have access to an Oracle view......I have to pull this data nightly into sql server so it can be maintained for another application that runs on sql server backend.
0
 
LVL 47

Expert Comment

by:Vitor Montalvão
ID: 40402286
Ok. Then I understand why you can't create a replication but the process you want to build would be somehow slow because it needs to check row by row and column by column.
0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 40402414
I'm an Oracle person and really don't even know how to spell SQL Server but I think the theory is still the same.

Pros: should be faster to locate 'differences'
Cons: Twice the space for the staging table

Given your restrictions how about this:
Have two staging tables.
LAST_PULL
CURRENT_PULL

-Create CURRENT_PULL from the Oracle view using CTAS.
-Do a MINUS query between LAST_PULL and CURRENT_PULL (then CURRENT_PULL minus LAST_PULL to get deleted values) to get the 'differences'.
-Process only the differences.
-Drop LAST_PULL
-Rename CURRENT_PULL to LAST_PULL
0
 
LVL 11

Author Closing Comment

by:Robb Hill
ID: 40411272
This worked the easiest for me:)
0

Featured Post

Master Your Team's Linux and Cloud Stack!

The average business loses $13.5M per year to ineffective training (per 1,000 employees). Keep ahead of the competition and combine in-person quality with online cost and flexibility by training with Linux Academy.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

My client sends data in an Excel file to me to load them into Staging database. The file contains many sheets that they have same structure. In this article, I would like to share the simple way to load data of multiple sheets by using SSIS.
In this article we will get to know that how can we recover deleted data if it happens accidently. We really can recover deleted rows if we know the time when data is deleted by using the transaction log.
Via a live example, show how to take different types of Oracle backups using RMAN.
This video shows how to Export data from an Oracle database using the Datapump Export Utility.  The corresponding Datapump Import utility is also discussed and demonstrated.

803 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question