SSIS - Loading 20 tables

Posted on 2014-07-13
Last Modified: 2016-02-11
I was asked a solution for the following question. I'm not very good at SSIS.  Need your advice

Source data from  Oracle contains 20 tables which are directly fed into SQL Server Tables using OLEDB Fast load option.
The issue is that the batch configured fails due to  referentrial integrity issues (no primary key ) for child tables and duplicate issues.

I suggested them to disable the constraints in the sql server tables. But once the data is loaded  into the destination and enabling constraints again, it would fail anyways due to constraint violation and duplicates.

What is the best way to implement these kind of scenarios. Should we create  a Staging Database . Firstly  load all the 20 tables into staging and then use set based commands. How about error handling in duplicates and constraint violations . Please  keep in mind these tables source contain huge amount of data ie 20 million I suppose. So using Sort etc transformations would be very  slow.

Can a detailed example be given for me to easily understand.
Question by:BI 0824
    LVL 13

    Accepted Solution

    If you need to enforce referential integrity, i.e., you can delete the records that do not have a corresponding entry in the main table, then you can remove the constraints, import all data, delete records that will cause it to fail, and then reinstate the constraints.

    If you need to keep the unrelated items, you can create "dummy" master records instead of deleting them.
    LVL 65

    Assisted Solution

    by:Jim Horn
    There are a number of things fundamentally wrong with this approach.


    Data is being loaded directly into production tables.  If that data is bad, then you've now introduced quality issues into your production tables.


    >I suggested them to disable the constraints in the sql server tables.
    That would work, but what is missing is how to handle any rows that have foreign key violations.  Delete those rows?  Assign the FK to a 'dummy' value in another table so the FK works?  Insert them into a 'temp' table for further action, then delete?

    Otherwise, when the constraints are re-enabled, that code will fail because of the violations.  Now your production data has quality issues AND queries run very slow because there is no foreign keys.  
    >Should we create  a Staging Database .
    Staging tables, YES.  Staging separate database, not sure that matters.

    >How about error handling in duplicates and constraint violations.
    Depends on what the business wants.  Not for us to give one answer and state that is is correct in all situations.  At this stage in the game, it would be an excellent idea to get the full scope of problems with this source data, and then come up with a solution.

    Typically I pump data into a staging table with two extra columns:  validation_errors (tinyint) and validation_message (varchar).  

    Then I have an SP for each table that executes code that validates the staging tables so that that dates are dates, numbers are numbers, FK's are valid, whatever the needs are.  If any of these validations fail, it sets validation_errors = validation_erros + 1, and writes to validation_message an English-sounding error message.

    Then you pump ONLY the rows where validation_errors = 0 into the production table.

    Then you handle the rows where validation_errors > 1 in whatever way your business wants them handled.

    >Please  keep in mind these tables source contain huge amount of data ie 20 million I suppose.
    Excellent ideas here would be to either impliment Change Data Capture in the SSIS package, or contact the source of this data and ask if they would entertain sending you only change files, instead of full load files.  Gotta question the wisdom of sending 20 million rows every day (month, whatever) if only a small subset of them change.

    Author Comment

    by:BI 0824
    Thanks guys. Once staging environment is created. How do I load all these PK and FK tables.

    What I'm looking for is the best scenario of using SSIS tasks. For eg :  - I use OLEDB source (Oracle) and OLEDB destination (staging table). Lets say  there are 4 tables  . A has relationship with B. B has with C and C has with D
    When loading A,B,C,D (staging tables) what is the flow. What are the ssis tasks I need to use considering the volume of data.
    As per what you had mentioned, 2 columns are created and when FK or Duplicate error you insert -1 .

    I just want to know the flow of the data. What components we got to use in SSIS and how to log if errors in those 2 columns?

    Author Closing Comment

    by:BI 0824
    Sorry about the late reply. Thanks

    Write Comment

    Please enter a first name

    Please enter a last name

    We will never share this with anyone.

    Featured Post

    What Should I Do With This Threat Intelligence?

    Are you wondering if you actually need threat intelligence? The answer is yes. We explain the basics for creating useful threat intelligence.

    Suggested Solutions

    SQL Server  2012 Release with lots of Enhancements in Database Engine functions, SSIS, SSRS and some of new services like Data Quality Server and Master Data Service. Of particular interest, and the focus of this Article is SSIS. So, time to elab…
    Slowly Changing Dimension Transformation component in data task flow is very useful for us to manage and control how data changes in SSIS.
    Migrating to Microsoft Office 365 is becoming increasingly popular for organizations both large and small. If you have made the leap to Microsoft’s cloud platform, you know that you will need to create a corporate email signature for your Office 365…
    In this sixth video of the Xpdf series, we discuss and demonstrate the PDFtoPNG utility, which converts a multi-page PDF file to separate color, grayscale, or monochrome PNG files, creating one PNG file for each page in the PDF. It does this via a c…

    779 members asked questions and received personalized solutions in the past 7 days.

    Join the community of 500,000 technology professionals and ask your questions.

    Join & Ask a Question

    Need Help in Real-Time?

    Connect with top rated Experts

    18 Experts available now in Live!

    Get 1:1 Help Now