Redundant SQL Server

Posted on 2006-04-08
Last Modified: 2008-02-01
Dear All,

I have an application (developed using that use MS SQL Server 2005 as the DBMS. One of the requirement is we have to ensure high availability of SQL Server. When one of the node is down, the secondary node should take over the role automatically. In addition, both of the database reside in different node should sychronize in real-time basics so that the data that i retrieve from the active node is the same. I think my application just have to connect to the virtual server whereby i do not care which node is active. How can i take advantage from feature of MS SQL Server 2005 to satisfy this requirement? Can anyone provide me a most cost effective solution to implement it? which edition of MS SQL Server 2005 and OS?

Any help is appreciated. Thanks a lot.
Question by:ahwind
    LVL 75

    Accepted Solution

    Hi ahwind,

    I think you can go for SQL server Clustering. The following link will help you
    LVL 2

    Assisted Solution

    If you look in the feature comparison under the High Availability section, you'll see the options you have by SQL Server edition.

    Clustering is generally the most expensive. It uses shared disks so exactly the same information is available in Node 2 if Node 1 fails. Because of this, the hardware basically needs to be exactly the same. You may be able to get away with slight variations but that can cause you problems troubleshooting issues. It can take 5 minutes or longer to failover from Node to Node. When a failover occurs the new node takes control of the shared disk and starts SQL Server. It starts in a typical way so you run the risk of a long recovery period if the failure happened when a lot of transactions were open.

    You can use log shipping to simulate what clustering does for the most part. The trade off is lag time though. You also have to build in your own failover mechanism.  Kind of hacky but you can get it to work. One of the benefits here is that nothing is shared so your hardware can be completely different. You could have an Intel 64-bit box on one node and a 32-bit AMD box on the other side. Since the disks aren't shared you have 100% redundancy. Both systems are live so you can't run into a long recovery mode scenario.

    If you can wait, Database Mirroring looks to be a great compromise between these two. It sort of works like logging shipping but MS takes care of all the hacky stuff including redirection. It also works in real time with only seconds to failover from node to node instead of minutes similar to Log Shipping. Here's a link that talks about it some. This is NOT production ready in SQL Server 2005 yet.

    LVL 3

    Assisted Solution

    Service Pack 1, and thus Database Mirroring, should be available (RTM) by the end of the month.  Given that anything you set up is going to take longer than that to test (and you can test it now with the SP1 CTP) you should fully consider it as one of your options.

    Here are the key tradeoffs in the three (really four) options:

    1) Completeness of product failover

    Both Database Mirroring and Log Shipping are DATABASE-level features, and more specifically Relational database features.  When they failover you don't bring along SQL Agent jobs, anything stored in a system database such as Master, etc.  And they don't help you with Analysis Services.  They have some enormous advantages, but to REALLY use either feature requires much more in the way of specific application design and possibly implementation (e.g., to work around the lack of SQL Agent failover).  Clustering is a more system-level failover that is completely transparent to the application.  Think of it as a very fast reboot where the entire node comes back online in just a few minutes.   So if you want to gain failover with just some infrastructure work and no application work, or you need features of SQL Server other than the relational engine to failover, then clustering is the best solution.

    2) Speed of failover

    There is a pretty huge difference in the failover speeds of the three options.  Database Mirroring is near instantaneous (i.e., a few seconds).  Clustering is a few minutes.  Log Shipping is indeterminate since it usually involves a manual failover decision.  Applications REALLY requiring near instantaneous failover are amongst the most likely to be willing to do significant application design and live with implementation restrictions in order to achieve it.  So the limitations of Database Mirroring are not a big issue for those apps.  But those who REALLY need near instantaneous failover are few and far between.  Of course, we all like it.  For most apps though, Clustering is more than sufficient in terms of taking a few minutes before service availability returns.

    3) Hardware/OS complexity and cost

    Database Mirroring and Log Shipping require no special hardware or OS editions.  You do need to be concious of the quality of your network.  Clustering requires very specific configurations, is very sensitive to the quality of things like disk controllers, and requires an Enterprise Edition of Windows Server.  People who do Clusters on the cheap (e.g., assemble them out of existing systems they have lying around) are rarely happy with the outcome and have done a lot to give Clusters a so-so reputation.  Basically, you need to buy a configuration that is well known to work well in this environment.

    4) Failure modes

    Clustering is primarily a solution for protecting against processor, and to a lesser degree, software failures.  It doesn't protect against data center outages, network outages, software or disk controller outages that corrupt disks, etc.  It relies on storage-level replication (e.g., RAID).  Database Mirroring and Log Shipping provide secondary copies of the data, potentially at geographically separate sites and thus can be used to protect against more failure modes than Clustering.  Indeed, the primary design point for Database Mirroring was to provide a completely synchronous (i.e., up-to-date with the last transaction) geographic disaster tolerance solution.  For example, a bank will have a datacenter in Manhattan and another over in New Jersey and if something happens to the Manhattan datacenter they want the New Jersey datacenter to pick up the work without losing any transactions (which can each be multi-Billion dollar) at all.  The most complete solution you could do would be to use Clustering to protect against local outages combined with Database Mirroring to provide geographic disaster tolerance.  You can also use Database Mirroring locally with Log Shipping for a lower level (i.e., not up to the last transaction) of geographic disaster tolerance.  With that said you can see that even used locally Database Mirroring and Log Shipping provide some additional failure mode protection.  For example, if a bad disk controller scrambles on a RAID set it doesn't impact a mirrored copy of the database.

    5) How many transactions can you lose?

    Clustering and Database Mirroring (set to be synchronous) are always up to date after a failure.  Log Shipping has windows in which transactions are not yet replicated to the backup node and thus you will lose transactions.

    6) Client Transparency

    Clustering is totally transparent to the client since it just looks like a fast reboot.  Database Mirroring is somewhat transparent IF you use SQL Native Client.  Log Shipping is not transparent at all since the client needs to know an alternate node to connect to.

    7) Resource utilization

    Database Mirroring allows the mirror copy to be used for reporting while.  The other two don't allow access to the database while a primary node is active.

    I think that is about it.  Now you just need to match these characteristics to your application, availability requirements, and budget and make the appropriate tradeoffs.

    LVL 3

    Expert Comment

    I forgot to mention why there are 4 options.  With Log Shipping you can either use what Microsoft provides or roll your own(RYO).  In SQL Server 2000 you would use the RYO option for Standard Edition.  Since Microsoft now includes Log Shipping in Standard Edition, RYO is now primarily something you would do with Express Edition.  Now personally I think the people costs of RYO log shipping would exceed the cost of WorkGroup.  Unless this is an ISV where the people costs are leveraged over a large number of installations.  
    LVL 37

    Assisted Solution

    by:Bing CISM / CISSP
    excellent job, hberenson. :)

    a few more words additionally, with MS SQL clustering and DB Mirroring solutions, you also need to double your cost on purchasing software licenses, including those for SQL and OS.

    so considering all the hberenson's comments in technical aspects and the related cost issues, if you must transparently and instantly switch the failed node, go SQL's HA solutions. otherwise, go SQL's sync solutions with manually switching the node to be accessed from the client side.

    hope it helps,
    LVL 3

    Expert Comment

    Thanks bbao.  But you do not need to double your software licensing costs with a SQL cluster.  SQL Server does NOT require a separate license for the passive node of a cluster.
    LVL 37

    Expert Comment

    by:Bing CISM / CISSP
    hi hberenson,

    you are right, NO need to buy a separate license for the PASSIVE node of an active/passive cluster. MS has introduced a processor-based licensing model for its .NET enterprise server products (including SQL2K/2K5):

    "Licensing for a Failover Cluster Configuration

    If your organization uses SQL Server 2000 in a failover cluster configuration, this means servers are clustered together to pick up each others' processing if one computer fails, in this situation, you have special licensing considerations. This option is only available in SQL Server 2000 Enterprise Edition.

    Failover clustering support can be configured two ways:

    * Active/Active. This option allows all servers in the failover cluster to regularly process information. When a server fails, one server or more takes on the additional workload of the failed server.

    * Active/Passive. This option is characterized by at least one server in the cluster that do not regularly process information, but waits to pick up the workload when an Active server fails.

    All Active servers in a cluster must be fully licensed, with either Processor Licenses or Server Licenses. However, if a server is strictly Passive, and works only when an Active server has failed, no additional licenses are needed for that Passive server. The exception to this is if the failover cluster is licensed under Processor License, and the number of processors on the Passive server exceeds the number of processors on the Active server. In this case, additional Processor licenses must be purchased for the additional processors on the Passive computer."
    LVL 3

    Expert Comment

    Yup.  I'm the one who came up with the "no need to license a passive cluster node" scheme in SQL Server 2000.

    Featured Post

    Highfive Gives IT Their Time Back

    Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

    Join & Write a Comment

    Introduction In my previous article ( I showed you how the XML Source component can be used to load XML files into a SQL Server database, us…
    International Data Corporation (IDC) prognosticates that before the current the year gets over disbursing on IT framework products to be sent in cloud environs will be $37.1B.
    This video shows, step by step, how to configure Oracle Heterogeneous Services via the Generic Gateway Agent in order to make a connection from an Oracle session and access a remote SQL Server database table.
    Via a live example, show how to extract insert data into a SQL Server database table using the Import/Export option and Bulk Insert.

    734 members asked questions and received personalized solutions in the past 7 days.

    Join the community of 500,000 technology professionals and ask your questions.

    Join & Ask a Question

    Need Help in Real-Time?

    Connect with top rated Experts

    23 Experts available now in Live!

    Get 1:1 Help Now