Opinion Question

Best Fault Tolerant Technology in VMware 6.7 with Windows and Linux.

cgeorgeisaac asked
Last Modified: 2020-09-14
We have a few mission-critical servers in our VMware 6.7 infrastructure, and I need to make them fully fault tolerant. I tried VMware's FT technology; however, it only takes effect when the ESXi host goes down. I need a fault tolerance technology that keeps two VMs in active/standby mode, so that when one VM goes off, the other takes over without impacting production. The operating systems are Windows Server 2019 and SLES. I would appreciate your advice and guidance on the various fault tolerance technologies I can use. Thank you, experts.

Alex
A lack of information provides a lack of a decent solution.
CERTIFIED EXPERT

Commented:
Couldn't you use something like Windows failover clustering, then use DRS rules to ensure both nodes run on different hosts? That would be my best recommendation.

Cheers
Alex
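
To illustrate the idea of keeping both cluster nodes on different hosts, here is a minimal Python sketch that checks a placement for violations. It does not talk to vCenter or create a DRS rule; the VM names, host names, and the `placement` dictionary are hypothetical inventory data made up for the example.

```python
from collections import defaultdict


def anti_affinity_violations(placement, cluster_nodes):
    """Return hosts that run more than one node of the same guest cluster.

    placement: dict mapping VM name -> host name (hypothetical inventory data)
    cluster_nodes: VM names that should be kept on separate hosts
    """
    by_host = defaultdict(list)
    for vm in cluster_nodes:
        by_host[placement[vm]].append(vm)
    # A host holding two or more nodes breaks the "separate hosts" goal.
    return {host: vms for host, vms in by_host.items() if len(vms) > 1}


# Hypothetical example: two file-server cluster nodes on different hosts.
placement = {"fs-node1": "esxi-a", "fs-node2": "esxi-b", "db-node1": "esxi-a"}
print(anti_affinity_violations(placement, ["fs-node1", "fs-node2"]))
```

An empty result means the nodes are spread out; in a real environment an anti-affinity DRS rule would enforce this rather than just report it.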

cgeorgeisaac
Senior Engineer

Author

Commented:
Thank you for your prompt response, Alex. I have never tried that, to be honest, but I will for sure. Is there a good guide that you would recommend? Also, any suggestions for Linux?
CERTIFIED EXPERT

Commented:
DRS rules are a good choice, but for Windows, clustering or load balancing is probably the best option. If the servers are domain joined, you can't just turn them off for extended periods of time and expect them to just work. I could not comment on SLES.
cgeorgeisaac
Senior Engineer

Author

Commented:
Many thanks, Alex. We are rather old school. We use a database called InterSystems Cache, plus tons of homegrown applications. The setup will mainly be used for file servers, storage, and DFS, since we have recently moved to a domain environment.

Many thanks, Bryant. When you mentioned DRS, I guess you meant the VMware Distributed Resource Scheduler. I am not sure we could do full FT with that, beyond resource sharing from the cluster pool. Yes, we will be trying out Windows clusters. Any information on load balancing will be appreciated.
CERTIFIED EXPERT

Commented:
File servers/storage using DFS is a good bet; the database servers would be another story, and you may have to talk to the vendor. A down-and-dirty thought is to use a DNS CNAME that everyone connects to, then change the server it points to if the current one fails. This would be manual, or maybe scripted based on an event. It is not a clean solution like a cluster.
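
A rough sketch of the scripted variant, in Python, might look like the following. It only decides which server the alias should point to; the actual DNS change is left to an `update_dns` callback, which is a placeholder for whatever your DNS server supports and not a real API. The alias, host names, and port are likewise made up.

```python
import socket


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def choose_target(candidates, port, probe=is_reachable):
    """Return the first candidate that answers the probe, or None if all fail."""
    for host in candidates:
        if probe(host, port):
            return host
    return None


def failover_cname(alias, candidates, port, update_dns, probe=is_reachable):
    """Repoint alias at the first healthy candidate via the supplied callback.

    update_dns(alias, target) is a hypothetical hook for the real DNS change.
    """
    target = choose_target(candidates, port, probe)
    if target is not None:
        update_dns(alias, target)
    return target


# Offline demonstration with a fake probe: pretend only "nas-b" is up.
changes = []
failover_cname(
    "files.example.internal",
    ["nas-a", "nas-b"],
    445,
    update_dns=lambda alias, target: changes.append((alias, target)),
    probe=lambda host, port: host == "nas-b",
)
print(changes)
```

Note that clients cache the CNAME until its TTL expires, so the switch is not instant, which is part of why this is less clean than a cluster.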
kevinhsieh
Network Engineer
CERTIFIED EXPERT

Commented:
You're looking for HA of the applications, right? FT, as typically understood, protects the VMs against failure of the host, but will not protect against failure of your application or the VM itself.
Causes of host failure could include a bad CPU, loss of motherboard, bad RAM, or power loss.
Causes of VM failure could include operator error, bad patch, ransomware, OS crash, configuration error, etc.
You could have further issues up the stack such as deletion of data in the database, or application logic change.
The easiest thing to do is to start at the bottom of the stack with VMware clustering (vSphere HA), where if a host dies the VM gets restarted on another host. This can be improved upon with Fault Tolerance, where if a host dies the VM continues to run on another host. Neither of these does anything to protect against issues inside the VM itself.
To protect against higher level issues, it totally depends on the applications you are trying to protect. Domain controllers have their own application level redundancy. You may or may not have any options with your database aside from a good backup and recovery plan.
Know that DFS Replication has all sorts of limitations.
Protecting availability at the OS level and above really has nothing to do with VMware, if that helps. You have to work the problem as if they were physical machines. (Okay, Windows clusters are easier with VMs than physical, but the concept is the same; it's just that the incremental hardware for a VM and the shared storage costs way less because you probably already have it.)
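
The layering described above can be written down as a small lookup, sketched here in Python. The failure causes come from this comment; the mechanism names and the "application-level HA" entry (for example a Windows failover cluster) are a simplifying assumption for illustration, since what is available above the hypervisor depends on the application.

```python
# Failure layer -> example causes, taken from the comment above.
FAILURE_LAYERS = {
    "host": ["bad CPU", "loss of motherboard", "bad RAM", "power loss"],
    "vm": ["operator error", "bad patch", "ransomware", "OS crash",
           "configuration error"],
}

# Mechanism -> layers it protects. The last entry is an assumption:
# it stands in for whatever clustering the application itself offers.
PROTECTS = {
    "VMware HA": {"host"},            # VM is restarted on another host
    "VMware FT": {"host"},            # VM keeps running on another host
    "application-level HA": {"vm"},   # hypothetical label, app dependent
}


def mechanisms_covering(cause):
    """Return the sorted mechanisms that protect against the given cause."""
    layer = next(
        (name for name, causes in FAILURE_LAYERS.items() if cause in causes),
        None,
    )
    if layer is None:
        return []
    return sorted(m for m, layers in PROTECTS.items() if layer in layers)


print(mechanisms_covering("power loss"))
print(mechanisms_covering("OS crash"))
```

The point of the lookup is simply that nothing at the hypervisor layer appears in the answer for an in-guest failure such as an OS crash.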
cgeorgeisaac
Senior Engineer

Author

Commented:
Thanks, Bryant. Great thought, but I probably would not venture that (DNS CNAME) in a prod environment.

Thanks, kevinhsieh, for the precise and to-the-point explanation. Highly appreciated. As you rightly pointed out, DFS Replication has its own limitations, especially that files being written need to be closed before they are replicated.

I came across a technology called Storage Replica (one step above DFS Replication). Storage Replica is a new Windows Server technology that allows you to replicate the content of your volumes between servers or clusters for disaster recovery.
  https://nedim.cloud/2018/10/24/how-to-configure-storage-replica-with-windows-admin-center-in-windows-server-2019/ 

As I understand it, the advantage is that it replicates without the files being closed. I have not tried it yet, but would love to try it after I set up my failover cluster.

That's great advice from the experts!

kevinhsieh
Network Engineer
CERTIFIED EXPERT

Commented:
To give you an idea of what is in place in my environment, I do DFS Replication of user file data from remote offices to central offices.
Active Directory has multiple domain controllers to provide HA at the AD level.
Most VMs in the data center are in a hypervisor cluster. This includes DCs, SQL, MySQL, and Exchange servers.
SQL and Exchange use native replication and HA technologies in addition to hypervisor clustering.
I have super critical (lives depend on them) VMs running on FT hardware. A motherboard can fail and the VM will keep running. This is on Stratus FT hardware. Some of these VMs are Linux, others are Windows. The data does get replicated in real time to other systems, but the application-level failover is disruptive enough that we try to avoid uncontrolled failover as much as possible, hence the FT hardware.
cgeorgeisaac
Senior Engineer

Author

Commented:
kevinhsieh - Wow, that is indeed a fantastic, well-architected infrastructure! I like that.

I think I will be interested in this "Stratus FT hardware". May I ask: I understand the data gets replicated, but how does the server (VM) level or application failover work on this? Does it work like a Windows failover cluster, i.e. if one server or application stops, the other (passive/standby) takes over immediately? I never knew such technologies existed, to be honest! Much appreciated for sharing this.

Also, if I may ask do you have a good guide to configure DFS-Replication for Storage?  The idea is to have a few mission critical servers to be added to this.  

Many thanks again.

kevinhsieh
Network Engineer
CERTIFIED EXPERT

Commented:
Stratus FT works much like VMware FT, but is simpler. It has 2 motherboards in a single chassis. There is a VM that configures them and brings them into sync. The same instructions are literally executed on each motherboard. Data is written to internal disks synchronously. In the event of a failover, it moves the active network from one set of NICs to another. There are two copies of the VM running, but VMware only sees 1. There are special instructions that have been in Intel CPUs for over 20 years to make this work.
cgeorgeisaac
Senior Engineer

Author

Commented:
Thanks kevinhsieh!
Well explained. My next step is definitely Stratus FT.
Andrew Wright
Looking for work in Edinburgh area
CERTIFIED EXPERT
Hi, just my two pence worth: you could also investigate Zerto. If you already have cloud space, you can use Zerto to keep an exact copy of your VM in the cloud; if the server should fall over for any reason, the copy in the cloud takes over. I believe the 'tolerance' can be as low as 3 seconds.
cgeorgeisaac
Senior Engineer

Author

Commented:
That's great information Andrew. Many thanks.  