Linux WAN-Based Failover

Posted on 2011-10-05
Last Modified: 2012-05-12

I've been running into a problem. I need to set up a failover system across multiple public IP addresses. I know this requires DNS changes. Since I knew that from the beginning of this project, I ensured the domain we use was registered with a Dynamic DNS provider. Our service needs to provide very reliable uptime, so the servers themselves are sitting in different data centers on opposite sides of the country.

Last I knew, Linux-HA did not support WAN, and I need a heartbeat monitor for Apache and MySQL that functions over a WAN. I would greatly appreciate any advice or insight into this problem.
Question by: Pyromanci
    LVL 39

    Expert Comment

    Linux-HA assumes that the heartbeat interface is directly connected to the other system, so "interface down" actually means the other node is down.
    As soon as you insert a switch in between, there is a problem: if the heartbeat switch goes down, both systems continue to work, each thinking the other is down. This is called a split-brain issue.

    This can be solved, but you have to look at a different avenue. You need a third system that contributes a vote to your cluster (RHEL/CentOS based cluster, using DLM...); then you can use Pacemaker to manage the load when needed.
    Here there is no assumption that a lost connection means the other system is actually down.
    If you want a really bulletproof solution, check out OpenVMS.
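
The role of the third vote can be sketched as a simple majority check. This is only an illustration in Python of the idea, not how Pacemaker or any other cluster stack implements quorum; the function names are hypothetical.

```python
def has_quorum(reachable_voters: int, total_voters: int) -> bool:
    """True only when a strict majority of all voters is reachable (self included)."""
    return reachable_voters > total_voters // 2


def may_stay_active(sees_peer: bool, sees_witness: bool) -> bool:
    """Decide whether this node may keep running services.

    Assumes three voters: this node, the peer node, and a third "witness" system.
    """
    reachable = 1 + int(sees_peer) + int(sees_witness)
    return has_quorum(reachable, total_voters=3)
```

With only two voters, a broken link leaves each side with one vote out of two, so neither side can tell whether it should be active. With a witness, the side that still reaches the witness holds two of three votes and the isolated side stands down.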
    LVL 43

    Expert Comment

    You definitely will not be able to do that without a third system.
    And BTW: how do you synchronize the databases?
    LVL 39

    Expert Comment

    Like I said, OpenVMS has been doing this trick for more than 25 years already, also over long distances, so there is nothing really new there.

    I know Linux is fresh to this kind of cluster business.
    A GFS shared disk might be needed to share that database, on mirrored devices across all locations.
    LVL 5

    Author Comment

    Sorry for the late reply, became really busy here.
    The databases are synced with master-to-master replication. The sync itself is checked via a script I wrote that runs at the end of the day to validate information between the two.
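
The validation script itself is not shown in the thread. As a rough sketch of what an end-of-day comparison between two masters could look like, the Python below digests each table's rows and reports the tables that differ. The function names and the in-memory row format are assumptions for illustration; a real script would fetch the rows from each MySQL server.

```python
import hashlib
from typing import Dict, Iterable, List, Tuple

Row = Tuple[object, ...]


def table_digest(rows: Iterable[Row]) -> str:
    """Return a digest of a table's rows that does not depend on row order."""
    h = hashlib.sha256()
    for line in sorted(repr(row) for row in rows):
        h.update(line.encode("utf-8"))
        h.update(b"\n")
    return h.hexdigest()


def mismatched_tables(a: Dict[str, List[Row]], b: Dict[str, List[Row]]) -> List[str]:
    """Return the names of tables whose contents differ between the two servers.

    A table missing on one side is treated as empty.
    """
    names = sorted(set(a) | set(b))
    return [n for n in names if table_digest(a.get(n, [])) != table_digest(b.get(n, []))]
```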

    The split-brain issue actually is not a concern for me. When I have used HA in the past, I've used it in conjunction with DRBD, and I had it set to never switch back over if the primary node came back online. The reason was that I had to let DRBD get caught up on the master and do some validation checking on it. Then I would manually tell it to take over outside of business hours.
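
The "never switch back automatically" policy described above can be sketched as a small state holder. This is a Python illustration of the policy only, not a Linux-HA or DRBD configuration; the class and method names are hypothetical.

```python
class StickyFailover:
    """Tracks which node is active; fails over automatically, fails back only by hand."""

    def __init__(self, primary: str, secondary: str) -> None:
        self.primary = primary
        self.secondary = secondary
        self.active = primary

    def report(self, primary_healthy: bool) -> str:
        """Record a health result for the primary and return the active node.

        Once the secondary has taken over, a healthy primary does not change anything.
        """
        if self.active == self.primary and not primary_healthy:
            self.active = self.secondary
        return self.active

    def manual_failback(self) -> str:
        """Operator-triggered switch back, e.g. after the data has caught up and been validated."""
        self.active = self.primary
        return self.active
```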

    See, the problem we have right now is that every now and then one of two things will happen. A) The iptables on the machine become overloaded and lock up (this is due to heavy hacking-attempt traffic that just overloads the NIC). B) Our current hosting provider has an issue with their network at the data center (this is not their fault; it's an issue with their backbone provider, and they are working on resolving it, though the cause is unknown at the moment).

    So when one of those things happens, I typically go through and do the DNS change to point to the secondary server, though this could be anywhere from 10 minutes to an hour after the problem has occurred.
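
To shorten that delay, the manual step could in principle be driven by a small monitor. The Python sketch below is one possible shape for it, under several assumptions: the checks are plain TCP connects (which only show that a port accepts connections, not that Apache or MySQL answer correctly), the names `tcp_alive` and `FailureCounter` are hypothetical, and `on_failover` stands in for whatever update call the Dynamic DNS provider offers, which the thread does not name.

```python
import socket
from typing import Callable


def tcp_alive(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class FailureCounter:
    """Calls on_failover once after `threshold` consecutive failed checks."""

    def __init__(self, threshold: int, on_failover: Callable[[], None]) -> None:
        self.threshold = threshold
        self.on_failover = on_failover
        self.failures = 0
        self.fired = False

    def record(self, healthy: bool) -> None:
        # A single healthy result resets the count, so brief blips do not trigger failover.
        self.failures = 0 if healthy else self.failures + 1
        if self.failures >= self.threshold and not self.fired:
            self.fired = True
            self.on_failover()
```

A loop that calls `counter.record(tcp_alive(host, 80) and tcp_alive(host, 3306))` every 30 seconds with a threshold of 3 would react after roughly 90 seconds, plus however long the DNS record's TTL keeps clients on the old address. To avoid the split-brain problem raised earlier in the thread, such a monitor would need to run from a third location rather than on either server.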
    LVL 39

    Accepted Solution

    You can look into something called the wondershaper from the

    You can try to prioritize your inter-site traffic in the hope that it still passes.
    Even better, have a separate link to use as the inter-site connection.
    It's not the HA-Linux approach... but it is the closest match...

    Those extra links should not be used for ANYTHING else.
    LVL 5

    Author Closing Comment

    It wasn't a complete solution, but it pointed me in the direction I needed to go.
