Linux WAN-Based Failover


I've been running into a problem. I need to set up a failover system across multiple public IP addresses. I know this requires DNS changes. Since I knew that from the beginning of this project, I made sure the domain we use was registered with a dynamic DNS provider. Our service needs to provide very reliable uptime, so the servers themselves are sitting in different data centers on opposite sides of the country.

Last I knew, Linux-HA did not support WAN, and I need a heartbeat monitor for Apache and MySQL that works over a WAN. I would greatly appreciate any advice or insight into this problem.
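
To make the requirement concrete, here is a minimal Python sketch of what a WAN-capable check for Apache and MySQL could look like when run from a monitoring box outside both data centers. The function names and the default port are illustrative assumptions, and the MySQL check only proves that the port accepts TCP connections, not that queries succeed.

```python
import socket
import urllib.error
import urllib.request


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False


def http_ok(url, timeout=5.0):
    """Return True if an HTTP GET on url answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # HTTP errors (4xx/5xx), network errors, and malformed URLs.
        return False


def site_healthy(host, http_url, mysql_port=3306):
    """Assumed policy: a site is healthy only if Apache and MySQL both respond."""
    return http_ok(http_url) and tcp_port_open(host, mysql_port)
```

This assumes the MySQL port is reachable from wherever the check runs; if it is firewalled off, the check would have to run through a tunnel or be replaced by a status page served by Apache.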
noci (Software Engineer) commented:
You can look into something called the wondershaper from the

You can try to prioritize your inter-site traffic in the hope that it still gets through.
Even better, have a separate link to use as the inter-site connection.
It's not the Linux-HA approach, but it is the closest match.

Those extra links should not be used for ANYTHING else.
noci (Software Engineer) commented:
Linux-HA assumes that the heartbeat interface is directly connected to the other system, so an interface going down actually means the other node is down.
As soon as you insert a switch in between, there is a problem: if the heartbeat switch goes down, both systems continue to work, each thinking the other is down. This is called a split-brain issue.

This can be solved, but you have to look at a different avenue. You need a third system that contributes a vote to your cluster (a RHEL/CentOS-based cluster, using DLM...); then you can use Pacemaker to manage the load when needed.
With that setup there is no assumption of a connection that tells you the other system is actually down.
If you want a really bulletproof solution, check out OpenVMS.
You definitely will not manage that without a third system.
And BTW: how do you synchronize the databases?
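
To show why the third vote matters, here is a small Python sketch of a majority rule. It is not how Pacemaker or any real cluster stack implements quorum; the function name and the vote-counting model are assumptions made only for illustration.

```python
def has_quorum(reachable_voters, total_voters):
    """Return True if this node plus the voters it can reach form a strict majority.

    reachable_voters: number of OTHER voters this node can currently contact.
    total_voters: number of voters in the cluster, including this node.
    """
    if total_voters < 1:
        raise ValueError("total_voters must be at least 1")
    votes = 1 + reachable_voters  # this node always counts its own vote
    return 2 * votes > total_voters


# Two nodes, link between them lost: neither side has a majority,
# so neither can safely decide that the other one is dead.
print(has_quorum(0, 2))  # False

# Two nodes plus a third voter: the side that still reaches the
# third system holds 2 of 3 votes and may keep running.
print(has_quorum(1, 3))  # True
print(has_quorum(0, 3))  # False
```

With only two voters, an isolated node can never reach a majority on its own, which is the split-brain situation described above.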
noci (Software Engineer) commented:
Like I said, OpenVMS has been doing this trick for more than 25 years already, over long distances too, so there is nothing really new there.

I know Linux is fairly new to this kind of cluster business.
A GFS shared disk might be needed to share that database, on mirrored devices across all locations.
Pyromanci (Author) commented:
Sorry for the late reply, things became really busy here.
The databases are synced with master-to-master replication. The sync itself is checked by a script I wrote that runs at the end of the day to validate the data between the two.
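
For anyone wondering what such an end-of-day check could look like, here is a rough Python sketch. It is not the script mentioned above; it assumes the rows of a table have already been fetched from each master as tuples and that one column is a unique key. All names are hypothetical.

```python
import hashlib


def row_digest(row):
    """Hash one row (a tuple of column values) into a hex digest.

    None is rendered as the text NULL, so a literal 'NULL' string
    would hash the same way; good enough for a sketch.
    """
    joined = "\x1f".join("NULL" if v is None else str(v) for v in row)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def compare_tables(rows_a, rows_b, key_index=0):
    """Compare two row sets keyed by the column at key_index.

    Returns (only_in_a, only_in_b, differing) as sorted lists of keys.
    """
    a = {row[key_index]: row_digest(row) for row in rows_a}
    b = {row[key_index]: row_digest(row) for row in rows_b}
    only_in_a = sorted(a.keys() - b.keys())
    only_in_b = sorted(b.keys() - a.keys())
    differing = sorted(k for k in a.keys() & b.keys() if a[k] != b[k])
    return only_in_a, only_in_b, differing


# Made-up sample rows standing in for query results from each master.
master_1 = [(1, "alice"), (2, "bob"), (3, "carol")]
master_2 = [(1, "alice"), (2, "bobby"), (4, "dave")]
print(compare_tables(master_1, master_2))  # ([3], [4], [2])
```

On large tables it would make more sense to compute the digests on each server and only ship the digests across the WAN, but the comparison logic stays the same.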

The split-brain issue is actually not a concern for me. When I have used HA in the past, I've used it in conjunction with DRBD, and I had it set to never switch back if the primary node came back online. The reason was that I had to let DRBD catch up on the master and do some validation checks on it. Then I would manually tell it to take over outside of business hours.

See, the problem we have right now is that every now and then one of two things will happen: A) The iptables firewall on the machine becomes overloaded and locks up (this is due to heavy hacking-attempt traffic that just overloads the NIC). B) Our current hosting provider has an issue with their network at the data center (this is not their fault; it's an issue with their backbone provider and they are working on resolving it, though the cause is unknown at the moment).

So when one of those things happens, I typically go through and do the DNS change to point to the secondary server, though this could be anywhere from 10 minutes to an hour after the problem has occurred.
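
The manual step could in principle be automated along these lines. This is only a sketch under assumptions: `probe` is any callable that returns True while the primary is healthy, and `on_failover` is a hypothetical callback that would call whatever update API the dynamic DNS provider offers (that part differs per provider and is not shown). It fails over once and then stays put until reset by hand, matching the "never switch back automatically" policy.

```python
class FailoverMonitor:
    """Trigger a one-way failover after several consecutive failed checks."""

    def __init__(self, probe, on_failover, max_failures=3):
        self.probe = probe              # callable returning True when primary is healthy
        self.on_failover = on_failover  # hypothetical hook, e.g. a DNS update call
        self.max_failures = max_failures
        self.failures = 0
        self.failed_over = False

    def tick(self):
        """Run one check; return True if this tick triggered the failover."""
        if self.failed_over:
            return False  # stay on the secondary until reset() is called by hand
        if self.probe():
            self.failures = 0  # a single success clears the failure streak
            return False
        self.failures += 1
        if self.failures >= self.max_failures:
            self.failed_over = True
            self.on_failover()
            return True
        return False

    def reset(self):
        """Manual switch-back, e.g. outside business hours after validation."""
        self.failures = 0
        self.failed_over = False


# Simulated run: one blip, then a real outage.
results = iter([True, False, True, False, False, False, False])
events = []
monitor = FailoverMonitor(lambda: next(results), lambda: events.append("dns-updated"))
print([monitor.tick() for _ in range(7)])
# [False, False, False, False, False, True, False]
print(events)  # ['dns-updated']
```

Requiring several consecutive failures avoids flipping DNS on a single dropped packet. Even with this in place, clients keep using the old address until the record's TTL expires, so the TTL on the record needs to be short for the switch to take effect quickly.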
Pyromanci (Author) commented:
It wasn't a complete solution, but it pointed me in the direction I needed to go.
Question has a verified solution.