• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 2834
  • Last Modified:

database replication vs. DTS / SSIS vs. ETL

We have significant infrastructure standards being changed.  We are standardizing on MS SQL Server 2005 and Oracle and for data movement, we are strongly favoring Informatica and eventually migrating to MS SSIS.  The questions that I have are specifically related to the data movement aspects of our network rebuild.  At this time, database replication is not even mentioned in our standards.  It is obvious that database replication is a viable and most appropriate solution for many systems.  How would this group recommend that I characterize the standard to identify when database replication would be used in lieu of DTS/SSIS or Informatica?  There are hundreds of servers and applications and data reliability and latency are frequently critical considerations.  The staff that supports the Informatica product indicates that they can maintain sub 3 second latency and full transactional consistency with very high data volumes.  I am concerned about the ability to maintain the low latency while data volumes increase significantly.  I also have concerns about the ability of the ETL solutions to guarantee delivery.  

Even in a distributed solution, we always want asynchronous data movement between systems due to the fact that a single server or network issue cannot take other systems offline (no two phase commits).

Consolidation and virtualization are key guidelines for the new infrastructure with no discussion on isolation and distribution across systems.  The solutions I am asking about would not include warehousing considerations.

Thank you!
0
BPAdba1
Asked:
BPAdba1
  • 3
  • 2
2 Solutions
 
nmcdermaidCommented:
ETL products allow you to transform data.

Replication allows you to copy chunks of data in their original format.


If you have no requirement to transform data (i.e. you have mentioned that this does not include data warehousing) then there is no reason to use ETL tools to replciate your data.


To implement a replication solution using replication tools is simple.

To implement a erplication system using ETL tools is complicated.
0
 
BPAdba1Author Commented:
Thank you for the response.  

I understand the basic concept of the 'T' in ETL meaning transformation.  ETL / DTS (SSIS) and other aftermarket tools provide some rather robust capabilities these days.  This is a politically motivated issue within our organization at this time and not necessarily driven by any basic conceptual guidelines.  I am hoping to get input from this group regarding performance differences / reliability issues (guaranteed or certified delivery) / network overhead considerations or any valid technical reason why we would want to strongly consider one way vs. the other options.  I am trying to show them that just because we can do it this way, that doesn't mean that we should chose that method of data movement.  I consider database replication a very reliable and manageable solution.  I am hoping to try to include some basic information about a distributed database environment for the sake of reliability, performance and scalability and want to find the words to encourage a change to our business standards to include database replication as the desired method to establish this model.

Thanks again,
Bill
0
 
nmcdermaidCommented:
Well, if you don't use the simplest solution for a problem (ie replication) then you are introducing unneeded complication and therefore cost.

$ always speaks volumes!

To me the issue is that as soon as you use a 'programming' type method (as in ETL), costs increase because

-you need to find a programmer to do the work
-its mor complicated to maintain

When you use a 'scripting' type method (as in replication) then you have to find a DBA, rather than a programmer. There's less complication, ambiguity, scope for error etc.
0
 
BPAdba1Author Commented:
This is assuming that everything else is equal.... when performing near realtime 'replication', is it realistic to assume that a DTS/SSIS or ETL (Informatica) solution can provide results equivalent to actual database replication?  I have quite a bit of replication experience, but no SSIS or ETL (Informatica) experience.  I am aware that Informatica does have a realtime replication product that they sell.... it is called PowerExchange.  That product actually does read the log.  

It seems like an ETL solution is reasonable if we are not talking about a significant number of tables and business requirements of latency are very lax.  It seems like an HA environment with distributed databases for performance and availability enhancement would most likely be most effectively implemented via database replication.  

As pointed out in the last statement above, having to work with developers for minor changes complicates the implementation of any changes to the system..... and in the event of problems, when a DBA is called for trouble analysis, it may very well require calling out a developer or ETL support staff for a single problem.

Thank you for the response!
Bill
0
 
BPAdba1Author Commented:
The responses are appreciated.  All answers are obviously correct, although they are not exactly what I was looking for.  Please close this question.  Thank you!
0

Featured Post

Free Tool: Path Explorer

An intuitive utility to help find the CSS path to UI elements on a webpage. These paths are used frequently in a variety of front-end development and QA automation tasks.

One of a set of tools we're offering as a way of saying thank you for being a part of the community.

  • 3
  • 2
Tackle projects and never again get stuck behind a technical roadblock.
Join Now