HA failing on two VMware hosts

We have a 4-host vSphere cluster. (ESXi at 6.0. vCenter at 6.5.)
To expand vSAN, we had to reboot all hosts. Now HA is failing on 2 of them. (vSAN is healthy after initial congestion.)
- Have tried "Reconfigure for vSphere HA" many times.
- After putting in Maintenance mode, removed a problem host from inventory. Later added back to the cluster. Made no difference.
- Rebooting has not helped.

Message: 'Cannot complete the operation due to an incorrect request to the server'
Event log says: 'vSphere HA agent for this host has an error: vSphere HA agent cannot be installed or configured.'

Have looked at many KB articles, including #2056299. It talks about checking the fdm-installer.log file, but no such file exists on any of the hosts, whether HA is working or not. Have not changed the configuration of the hosts in at least 6 months, other than expanding vSAN.

Please advise. Thanks.
Asked by: Akulsh

Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
Common issue, and nothing to do with vSAN.

When you run Reconfigure for vSphere HA, at what percentage does it fail?

How did you install ESXi: on a USB flash drive or an SD card?

Shouldn't the log be on the server?

Have you tried uninstalling the HA (FDM) agent VIB and trying again?
Akulsh (Author) commented:
Dear Andrew,

On both hosts, it fails at 26%.
Hosts are installed on SD card.
FDM.log was no help.
How do I uninstall HA vib agent -- which KB describes the procedure?
Thanks.
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
That suggests it is copying the agent and failing.

Check that there is free space on the SD card. This is a common problem: the card runs out of space, so the host cannot copy the new HA agent (VIB) archive to it, unpack it, and run it, and the install fails.

Are you also using the Dell OEM version of ESXi?

esxcli software vib remove -n vmware-fdm


You can read about it here:

https://kb.vmware.com/s/article/2056299
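
A minimal sketch of the removal step suggested above, wrapped in a guard so it only acts when the agent VIB is actually present. The function name `remove_ha_agent` is made up for illustration, and the `grep` pattern assumes that `esxcli software vib list` prints the VIB name at the start of each row. It is meant to be run in the ESXi Shell or an SSH session on the affected host.

```shell
#!/bin/sh
# Sketch only: remove the vSphere HA (FDM) agent VIB from this host.
# remove_ha_agent is a hypothetical helper name, not a VMware tool.
remove_ha_agent() {
    # Refuse to run anywhere that is not an ESXi host.
    if ! command -v esxcli >/dev/null 2>&1; then
        echo "esxcli not found: run this on the ESXi host" >&2
        return 1
    fi
    # Assumption: the VIB name is the first column of the listing.
    if esxcli software vib list | grep -q '^vmware-fdm'; then
        # Same command as given in this thread.
        esxcli software vib remove -n vmware-fdm
    else
        echo "vmware-fdm is not installed; nothing to remove"
    fi
}
```

After calling `remove_ha_agent` on the host, run "Reconfigure for vSphere HA" from vCenter so that the agent is pushed to the host again.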

Akulsh (Author) commented:
Dear Andrew,

The problem is fixed! So happy I am.
All I had to do is run:
esxcli software vib remove -n vmware-fdm
After that "Reconfigure for vSphere HA" took many minutes but it worked.

A few points:
- KB2056299 is very poorly written. It talks about an fdm-installer.log file, which is nowhere to be found. It also mentions a dependency created by a third-party VIB, which supposedly has to be removed first. There was no such need.
- In our case, the problem probably arose because we had to reinstall ESXi on one server after its SD card controller failed (we used Dell's ISO for all hosts). The newly installed host became the HA master and did not let 2 hosts join HA because their VIB had an older date, though the same version. Strangely, and thankfully, one old host with the old file had no issue.
- I had found this posting which helped: https://tinyurl.com/y96ao44e
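
One way to test the "same version, older date" theory is to print the version and install date of the agent VIB on each host and compare them by eye. The sketch below is a small filter for that. `fdm_vib_info` is a hypothetical helper name, and it assumes the listing puts the VIB name in the first column, the version in the second, and the install date in the last.

```shell
#!/bin/sh
# Sketch only: read `esxcli software vib list` output on stdin and print
# "<version> <install date>" for the vmware-fdm row, if there is one.
# Assumed column layout: name first, version second, install date last.
fdm_vib_info() {
    awk '$1 == "vmware-fdm" { print $2, $NF }'
}
```

On each host, the intended use is `esxcli software vib list | fdm_vib_info`. An empty result means the agent VIB is not installed on that host.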

Thanks.
AKK
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
Your SD card was full! It happens with the OEM versions of ESXi.

The problem is that there is no space to:

1. Copy the new HA agent from vCenter Server to /tmp on the host
2. Extract it to /tmp
3. Execute it and install it to the bootbank!

This is a very common issue, and it can occur every time you update your vCenter Server in the future.

So write a document, so that next time it happens you know what to do!
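
Since all three steps need free space, a quick check before retrying is to flag any filesystem that is nearly full. The sketch below filters `df`-style output and prints the rows whose use percentage is at or above a limit. `flag_full` is a hypothetical helper name, and the limit of 90 in the usage note is an arbitrary choice, not a VMware recommendation.

```shell
#!/bin/sh
# Sketch only: read df-style output on stdin and print every row whose
# "Use%" figure is at or above the limit given as the first argument.
flag_full() {
    awk -v limit="$1" '
        NR > 1 {
            # Find the field that looks like a percentage, e.g. "98%".
            for (i = 1; i <= NF; i++)
                if ($i ~ /^[0-9]+%$/ && $i + 0 >= limit) { print; break }
        }'
}
```

On the host, `df -h | flag_full 90` would cover the volumes that `df` reports. Ramdisks such as /tmp are listed by `vdf -h` rather than `df`, so `vdf -h | flag_full 90` may be needed as well, assuming its rows carry the same percentage column.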
Akulsh (Author) commented:
Andrew knows more than most VMware Support engineers...
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
Thanks for your kind words!