database replication vs. DTS / SSIS vs. ETL

Posted on 2006-03-29
Last Modified: 2008-01-16
We have significant infrastructure standards being changed.  We are standardizing on MS SQL Server 2005 and Oracle and for data movement, we are strongly favoring Informatica and eventually migrating to MS SSIS.  The questions that I have are specifically related to the data movement aspects of our network rebuild.  At this time, database replication is not even mentioned in our standards.  It is obvious that database replication is a viable and most appropriate solution for many systems.  How would this group recommend that I characterize the standard to identify when database replication would be used in lieu of DTS/SSIS or Informatica?  There are hundreds of servers and applications and data reliability and latency are frequently critical considerations.  The staff that supports the Informatica product indicates that they can maintain sub 3 second latency and full transactional consistency with very high data volumes.  I am concerned about the ability to maintain the low latency while data volumes increase significantly.  I also have concerns about the ability of the ETL solutions to guarantee delivery.  

Even in a distributed solution, we always want asynchronous data movement between systems due to the fact that a single server or network issue cannot take other systems offline (no two phase commits).

Consolidation and virtualization are key guidelines for the new infrastructure with no discussion on isolation and distribution across systems.  The solutions I am asking about would not include warehousing considerations.

Thank you!
Question by:BPAdba1
    LVL 30

    Assisted Solution

    ETL products allow you to transform data.

    Replication allows you to copy chunks of data in their original format.

    If you have no requirement to transform data (i.e. you have mentioned that this does not include data warehousing) then there is no reason to use ETL tools to replciate your data.

    To implement a replication solution using replication tools is simple.

    To implement a erplication system using ETL tools is complicated.

    Author Comment

    Thank you for the response.  

    I understand the basic concept of the 'T' in ETL meaning transformation.  ETL / DTS (SSIS) and other aftermarket tools provide some rather robust capabilities these days.  This is a politically motivated issue within our organization at this time and not necessarily driven by any basic conceptual guidelines.  I am hoping to get input from this group regarding performance differences / reliability issues (guaranteed or certified delivery) / network overhead considerations or any valid technical reason why we would want to strongly consider one way vs. the other options.  I am trying to show them that just because we can do it this way, that doesn't mean that we should chose that method of data movement.  I consider database replication a very reliable and manageable solution.  I am hoping to try to include some basic information about a distributed database environment for the sake of reliability, performance and scalability and want to find the words to encourage a change to our business standards to include database replication as the desired method to establish this model.

    Thanks again,
    LVL 30

    Accepted Solution

    Well, if you don't use the simplest solution for a problem (ie replication) then you are introducing unneeded complication and therefore cost.

    $ always speaks volumes!

    To me the issue is that as soon as you use a 'programming' type method (as in ETL), costs increase because

    -you need to find a programmer to do the work
    -its mor complicated to maintain

    When you use a 'scripting' type method (as in replication) then you have to find a DBA, rather than a programmer. There's less complication, ambiguity, scope for error etc.

    Author Comment

    This is assuming that everything else is equal.... when performing near realtime 'replication', is it realistic to assume that a DTS/SSIS or ETL (Informatica) solution can provide results equivalent to actual database replication?  I have quite a bit of replication experience, but no SSIS or ETL (Informatica) experience.  I am aware that Informatica does have a realtime replication product that they sell.... it is called PowerExchange.  That product actually does read the log.  

    It seems like an ETL solution is reasonable if we are not talking about a significant number of tables and business requirements of latency are very lax.  It seems like an HA environment with distributed databases for performance and availability enhancement would most likely be most effectively implemented via database replication.  

    As pointed out in the last statement above, having to work with developers for minor changes complicates the implementation of any changes to the system..... and in the event of problems, when a DBA is called for trouble analysis, it may very well require calling out a developer or ETL support staff for a single problem.

    Thank you for the response!

    Author Comment

    The responses are appreciated.  All answers are obviously correct, although they are not exactly what I was looking for.  Please close this question.  Thank you!

    Write Comment

    Please enter a first name

    Please enter a last name

    We will never share this with anyone.

    Featured Post

    Top 6 Sources for Identifying Threat Actor TTPs

    Understanding your enemy is essential. These six sources will help you identify the most popular threat actor tactics, techniques, and procedures (TTPs).

    Suggested Solutions

    Load balancing is the method of dividing the total amount of work performed by one computer between two or more computers. Its aim is to get more work done in the same amount of time, ensuring that all the users get served faster.
    International Data Corporation (IDC) prognosticates that before the current the year gets over disbursing on IT framework products to be sent in cloud environs will be $37.1B.
    Familiarize people with the process of retrieving data from SQL Server using an Access pass-thru query. Microsoft Access is a very powerful client/server development tool. One of the ways that you can retrieve data from a SQL Server is by using a pa…
    Via a live example, show how to extract insert data into a SQL Server database table using the Import/Export option and Bulk Insert.

    737 members asked questions and received personalized solutions in the past 7 days.

    Join the community of 500,000 technology professionals and ask your questions.

    Join & Ask a Question

    Need Help in Real-Time?

    Connect with top rated Experts

    15 Experts available now in Live!

    Get 1:1 Help Now