Solved

VMware isolation

Posted on 2014-03-24
Medium Priority
304 Views
Last Modified: 2014-06-08
I have read a little about isolation response and I was hoping to get some suggestions on a few things.

First, I have a cluster with two hosts in it with HA turned on.  Second, one of the hosts will stop responding, and the only thing I can do is physically reboot it to get it back.  Once this has happened, it will not respond to a ping or anything else.

I have read about adding a secondary isolation IP address for the cluster and also increasing the failure detection time.  Are these two changes safe to implement without any bad side effects?
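
For reference, both changes are normally made as HA advanced options on the cluster. The sketch below only assembles and sanity-checks the option values in Python; it does not connect to vCenter or change anything. The option names das.isolationaddress1, das.usedefaultisolationaddress and das.failuredetectiontime are assumed to apply to the vSphere version in use (the failure detection option is understood to exist only before vSphere 5.0), the helper name build_ha_options is hypothetical, and the IP address and the 30000 ms value are made-up examples.

```python
import ipaddress


def build_ha_options(extra_isolation_ips, failure_detection_ms=None,
                     keep_default_gateway=True):
    """Return a dict of HA advanced option name -> string value.

    Hypothetical helper: it only builds the key/value pairs that would be
    typed into the cluster's HA advanced options dialog.
    """
    if not 1 <= len(extra_isolation_ips) <= 9:
        raise ValueError("expected between 1 and 9 extra isolation addresses")

    options = {}
    for index, ip in enumerate(extra_isolation_ips, start=1):
        ipaddress.ip_address(ip)  # raises ValueError on a malformed address
        options[f"das.isolationaddress{index}"] = ip

    # Keep pinging the default gateway as well, unless told otherwise.
    options["das.usedefaultisolationaddress"] = (
        "true" if keep_default_gateway else "false"
    )

    if failure_detection_ms is not None:
        # Assumption: 15000 ms is the default, so only allow raising it.
        if failure_detection_ms < 15000:
            raise ValueError("failure detection time below the assumed default")
        options["das.failuredetectiontime"] = str(failure_detection_ms)

    return options


# Example values only: 192.0.2.10 is a documentation address, not a real host.
ha_options = build_ha_options(["192.0.2.10"], failure_detection_ms=30000)
for name, value in ha_options.items():
    print(f"{name} = {value}")
```

The extra isolation address should be something that is always up and reachable on the management network, otherwise it can itself become a source of false isolation events.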

Also, could there be a false positive that is causing an isolation of one of my VMware hosts?

I have had this happen in the past and it seemed to be related to our shared storage devices.  Please help!!  I have been struggling with this same problem for a while now and nothing seems to fix it.  Thanks experts!!!
Question by:IKtech
15 Comments
 
LVL 62

Expert Comment

by:gheist
ID: 39951634
I am sorry - HA just stops when hosts are isolated, VMs continue running on them.
 
LVL 3

Author Comment

by:IKtech
ID: 39951695
Can I adjust any settings to keep the host isolation from happening?
 
LVL 62

Expert Comment

by:gheist
ID: 39951714
Are you isolating the VM network and the vmkernel network on different NICs as per best practices?
 
LVL 3

Author Comment

by:IKtech
ID: 39951738
Yes, I believe so.
 
LVL 62

Expert Comment

by:gheist
ID: 39952524
Can I see a picture of how the vSwitches look in your ESXi (with anything that identifies your organisation, like network numbers or names, blurred)?

Also, what is the make and model of your network switches?
(Some need cache time and forward delay configured to keep vMotion and HA happy, i.e. if you have 3 switches between your ESXi hosts, a gratuitous ARP message could be lost, leaving one of them believing that the VM is still on the other ESXi host.)
 
LVL 3

Author Comment

by:IKtech
ID: 39953717
Both hosts are set up identically as far as the vSwitches go, just with some different IP addresses.  I can upload the second host config if you like.
vmware-host-not-responding.PNG
 
LVL 62

Expert Comment

by:gheist
ID: 39953763
vMotion and storage are heavy. Which vmkernel ports does storage use? Does the host go offline when you copy a huge file?
 
LVL 3

Author Comment

by:IKtech
ID: 39954076
It does seem to happen when shared storage is working hard.  The last few times I have had trouble with this, I had a bad hard drive in one of my storage devices.  I am using QNAP devices, each with 4 drives in RAID 10.  I am running block scans and SMART tests on the drives to see if I can find an issue with a drive.  I am also looking into getting new drives that are 6 Gb/s and 10k rpm vs. what I have now, 3 Gb/s and 7200 rpm.
 
LVL 62

Expert Comment

by:gheist
ID: 39954125
3Gb/s could very well saturate 1Gb/s FT logging interface.
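
As a rough unit check on the figures in this exchange, the sketch below converts raw line rates from gigabits to megabytes per second. It ignores protocol and encoding overhead, so real throughput is lower, and the 3 Gb/s and 6 Gb/s figures are drive interface speeds rather than what a spinning disk actually sustains. The function name is hypothetical.

```python
def gbps_to_mb_per_s(gigabits_per_second):
    """Convert a raw line rate in Gb/s to MB/s (decimal units, no overhead)."""
    return gigabits_per_second * 1000 / 8


# Figures taken from the thread: the 1 Gb/s link and the drive interface speeds.
rates = {
    "1 Gb/s network link": 1,
    "3 Gb/s drive interface": 3,
    "6 Gb/s drive interface": 6,
}

for label, gbps in rates.items():
    print(f"{label}: {gbps_to_mb_per_s(gbps):.0f} MB/s raw")
```

On these raw numbers a single 1 Gb/s link tops out at about a third of the 3 Gb/s interface rate, which is why storage traffic sharing a link with other vmkernel traffic is worth ruling out.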
 
LVL 3

Author Comment

by:IKtech
ID: 39954388
As a troubleshooting step, I am currently not using any FT, and it still happens.
 
LVL 62

Expert Comment

by:gheist
ID: 39954482
Try switching between fixed media speed and auto-select...

Do you have two isolation hosts, and are they up all the time?
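
One way to check the "up all the time" part is to probe each isolation address periodically from a machine on the same management network and record which ones fail to answer. The sketch below is a minimal version under stated assumptions: ping_once assumes a Unix-like ping command that accepts -c, both function names are hypothetical, and the demonstration at the bottom uses a fake probe with made-up addresses so it runs without network access.

```python
import subprocess


def ping_once(address):
    """Return True if a single ICMP echo to `address` succeeds.

    Assumption: a Unix-like `ping` binary that accepts `-c 1`.
    """
    result = subprocess.run(
        ["ping", "-c", "1", address],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def unreachable(addresses, probe=ping_once):
    """Return the addresses for which the probe reports no reply."""
    return [address for address in addresses if not probe(address)]


# Demonstration with a fake probe: the dict stands in for real ping results.
fake_results = {"192.0.2.1": True, "192.0.2.10": False}
down = unreachable(fake_results, probe=fake_results.get)
print(down)
```

Run on a schedule with the real probe and compare the timestamps of any failures with the times the host dropped out.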
 
LVL 3

Author Comment

by:IKtech
ID: 39967011
Are you talking about the switch ports the VM host is connected to for the fixed media speed and the auto-select?

I just have two hosts, and yes, they are up all the time.  Are there some settings I can verify to check whether one or both are isolation hosts?

Thanks!
 
LVL 62

Expert Comment

by:gheist
ID: 39967195
You need to check all interfaces on the storage switch. If one of them goes down, at least one host is isolated.
 
LVL 3

Accepted Solution

by:
IKtech earned 0 total points
ID: 40109534
I have switched to faster drives in one storage device, and I also had a drive failure in the other storage device.  Since changing to faster drives and rebuilding the RAID array on the other storage device, I have not had any issues.  I may change to faster drives on the second storage device at some point, but so far so good.  It looks like a bad hard drive was causing me issues again.  Thanks for the help.
 
LVL 3

Author Closing Comment

by:IKtech
ID: 40120216
Replacing a bad drive seems to have fixed it for now.
