Load balancing / Failover solutions

I am looking for some general advice on implementing a failover solution for a website.  We currently have three Windows 2003 servers, as follows:

Web server: running the website (IIS 6)
Database server: runs a MySQL database
Backup server: redundant server running both IIS & MySQL.  Web files mirrored using ViceVersa and database is replicated to this server.  Database replication is set up on a dual master basis - so both databases can keep in sync with each other.

The database and web servers are in the same data centre, the backup server is located with a different host in a separate data centre.

At the moment our only failover solution is to switch DNS.  The TTL for our DNS is set to one hour.  In the event of either of the primary servers failing we would update DNS.  My concern is that this method is too slow.  I'm told that some ISPs (notably AOL) can take 48 hours or more to pick up DNS changes.  It also relies on us manually changing the DNS settings.
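To put rough numbers on that concern, here is a minimal sketch of the worst-case window during which a resolver that honours the TTL may keep sending visitors to the dead address. The function name, the 300-second TTL, and the 60-second check interval are illustrative assumptions, not settings from this setup, and resolvers that ignore the TTL are not covered by the estimate.

```python
def worst_case_failover_seconds(ttl_seconds, check_interval_seconds=0,
                                failures_before_switch=1):
    """Upper bound for a TTL-respecting client: time to detect the outage
    plus the time a cached DNS record can remain valid."""
    detection = check_interval_seconds * failures_before_switch
    return detection + ttl_seconds


# Current setup: one-hour TTL. Time for a person to notice the outage and
# edit DNS by hand comes on top of this figure.
manual = worst_case_failover_seconds(3600)

# Hypothetical automated service: 5-minute TTL, a check every 60 seconds,
# and a switch after 3 consecutive failed checks.
automated = worst_case_failover_seconds(300, check_interval_seconds=60,
                                        failures_before_switch=3)

print(manual, automated)  # 3600 480
```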

I have been looking at three alternatives.  I would appreciate any comments or experiences:

1. Improved DNS failover.  There are several services available that can automatically update DNS if the primary server goes offline.

2. Load balancing.  We add a load balancer in front of the web servers.  Traffic can then be easily redirected if one of the servers fails.  I'm not sure whether a hardware or software solution is best here.

3. Web farm. I have read quite a bit about setting up a web farm using IIS.  It looks like it might be quite complicated to implement, but is perhaps the best solution because it makes use of the currently redundant backup server.  However, although this improves reliability, surely it still doesn't help if the main server is unavailable for some reason?

Perhaps I need a combination of these things, or perhaps there is a better solution available?
Ted Bouskill (Senior Software Developer) commented:
DNS should only be used for routing. It's not a good choice for high availability or load balancing because records can be cached on the client and, as you know, changes can take hours to propagate.

Load balancing and failover are two different issues, and your biggest single point of failure is MySQL, which does not support any clustering unless you purchase the commercial version.  Replication will not provide automatic failover.  To be honest, it also adds overhead (to sync data) and can be dangerous: if the primary database becomes corrupt, the corrupt data will be replicated into the copy.

Windows Network Load Balancing is free and is an optional install.  You set up a virtual IP with a DNS entry pointing to it.  Traffic is routed automatically and can be managed.  If you use in-memory sessions, you can use Affinity to keep each session on the server where it started.  If you choose to use the optional Windows SQL or State Services for session storage (both have pros/cons), you do not need to set affinity.  The downside to Windows NLB is that automatic failover only occurs if the OS or server goes down.  If the web application fails, NLB will still send page requests to the failed site.
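To illustrate the gap described above, here is a minimal sketch of an application-level health check: instead of only asking whether the server answers, it requests a page and checks the response. The URL, the "OK" marker text, and the function names are hypothetical, and this is not something NLB itself provides; it would run as a separate monitor.

```python
import http.client
import urllib.error
import urllib.request


def response_is_healthy(status, body, marker="OK"):
    """The application counts as healthy only if the page returned HTTP 200
    and contains the expected marker text."""
    return status == 200 and marker in body


def probe(url, timeout=5.0, marker="OK"):
    """Fetch a health page and judge the response. Any connection error,
    timeout, or HTTP error status counts as unhealthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return response_is_healthy(resp.status, body, marker)
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False


if __name__ == "__main__":
    # Hypothetical health page that the application would have to expose,
    # ideally one that also touches the database.
    print(probe("http://web1.example.com/health"))
```

A server whose OS is up but whose site returns an error page would pass a simple "is the machine alive" test and fail this one, which is the case NLB misses.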

Hardware load balancing is very expensive.  If you want high availability you need two appliances, and it has the same flaw with regard to automatic failover unless you purchase the more expensive models.

Web farms in IIS are actually easy to implement if the web application is well designed, doesn't hard-code configuration settings (like server names), and doesn't overuse sessions.

Designing for high availability and load balancing is a complex topic.  It requires thinking about every aspect of the web application design and has to include even the hardware because there are so many pros/cons and choices.
Orroland (Author) commented:
One more thing which I forgot to mention.  The user needs to maintain the same session so any load balancing solution would need to address this.
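As a rough illustration of how a balancer can keep a user on the same server, here is a minimal sketch of affinity by hashing the client IP address. The server names and the function are hypothetical, and this is not how any particular product implements it.

```python
import hashlib


def pick_server(client_ip, servers):
    """Map a client IP to one server deterministically, so repeat requests
    from the same address land on the same machine."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["web1", "web2"]  # hypothetical server names
first = pick_server("203.0.113.7", servers)
second = pick_server("203.0.113.7", servers)
print(first == second)  # True
```

The mapping changes when the server list changes, so a user whose server fails is moved to another machine and loses any session held only in that server's memory. Storing session state outside the web servers avoids that.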
cj_1969 commented:
I agree with tedbilly that DNS is not a good choice for load balancing.  As for failover or DR, it is implemented in many places for this purpose.  Most places that use this method use it for a complete data center failover in a DR situation.  Typically they have implemented clustering of some sort for high availability of individual systems.  If load balancing is going to be implemented, it typically means high availability for a single system: you would have both machines in the same data center and then possibly one system in an alternate data center as a kind of "limp along" backup in case the main data center is lost.  Unless the apps are mission critical, you typically do not beef up the DR systems to the same level as production.

To go back to your question ...
I don't think a hardware solution is what you want.  To make this work, in general, you would either have a single point of failure or buy two devices and essentially cluster them so that if the main one fails the other takes over.  tedbilly has given options for the same thing for free using software.

Again, you need to look at the entire data center, the applications, the servers, and the time to recover both the individual apps/servers and the entire data center.  You might need a combination of things to cover everything, depending on your requirements.