
ESXi 6.7 Remote Login issue

Dan Henery asked:
I have an odd issue with a vSphere 6.7 environment. Two hosts that have been running fine for about 4 months are now exhibiting connection errors. I disconnected and removed them from inventory in vCenter, and they were removed without error, but I cannot get them back in. I keep getting an incorrect login error. I cannot even log into the local WebUI of the hosts; I get an error saying the username or password is incorrect. I cannot SSH to them either. However, connecting via the DCUI through F1 or F2 logins works without issue....

Andrew Hancock (VMware vExpert / EE Fellow)VMware and Virtualization Consultant
Fellow 2018
Expert of the Year 2017

Commented:
Did you enable lockdown mode?
Dan HeneryLead Infrastructure Engineer

Author

Commented:
No, Lockdown is not enabled...
Dan HeneryLead Infrastructure Engineer

Author

Commented:
Now that I look again, it is actually greyed out in the DCUI.
Andrew Hancock (VMware vExpert / EE Fellow)VMware and Virtualization Consultant

Commented:
Dan HeneryLead Infrastructure Engineer

Author

Commented:
Saw that article and attempted the command-line method at the bottom, but I am getting an authentication login error there too.
Andrew Hancock (VMware vExpert / EE Fellow)VMware and Virtualization Consultant

Commented:
It looks like Lockdown Mode is enabled?

SSH access?
Dan HeneryLead Infrastructure Engineer

Author

Commented:
SSH is enabled, but I get connection refused; telnet to port 22 also fails. I am looking into the firewall.
Dan HeneryLead Infrastructure Engineer

Author

Commented:
Cannot run that command either: esxcli network firewall get returns "General System Error: Internal error" on both hosts...
Andrew Hancock (VMware vExpert / EE Fellow)VMware and Virtualization Consultant

Commented:
Why did you remove them from vCenter Server?
Dan HeneryLead Infrastructure Engineer

Author

Commented:
It all started because Veeam backup was failing due to NFC storage connection errors, and I found a VMware Knowledge Base article that suggested disconnecting and reconnecting the hosts in vCenter. I disconnected but could not reconnect due to login failures, so I removed them from inventory and tried to add them back in, which put me where I am right now.

I'm almost ready to reboot the host, but my only access to shut down the VMs is through the guest OS, and if I still cannot get into the WebUI after the reboot I'll have no way of powering them back up. The best I can do is a hard boot and hope they come back online, which I am very reluctant to do.
Dan HeneryLead Infrastructure Engineer

Author

Commented:
Could a full ramdisk be the cause?
Andrew Hancock (VMware vExpert / EE Fellow)VMware and Virtualization Consultant

Commented:
Yes, or a full scratch or temp location.
Dan HeneryLead Infrastructure Engineer

Author

Commented:
So can I be reasonably confident that after a reboot I'll be able to log into the WebUI?
Andrew Hancock (VMware vExpert / EE Fellow)VMware and Virtualization Consultant

Commented:
There are no guarantees.
Dan HeneryLead Infrastructure Engineer

Author

Commented:
When I run vdf -h, I see that "var" and "tmp" are at 100% Use on Host 1, and on Host 2 just "var" is at 100%. I opened a ticket with VMware and will try to have them on the line when I reboot.
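As a rough sketch of the check described above, the filter below prints any row of df-style output whose percentage column is at or above a threshold. It assumes the ramdisk rows printed by vdf -h carry a "Use%" value the way df does; the function name full_mounts and the sample rows are made up for illustration and are not output from the hosts in this thread.

```shell
# full_mounts THRESHOLD: read df-style text on stdin and print every row
# whose percentage field (for example "100%") is >= THRESHOLD.
full_mounts() {
    awk -v limit="$1" '
        {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^[0-9]+%$/) {
                    pct = $i
                    sub(/%/, "", pct)
                    if (pct + 0 >= limit + 0) print
                    break
                }
            }
        }'
}

# Illustrative sample rows (invented for this sketch, not real host output).
sample='Ramdisk   Size  Used  Available  Use%  Mounted on
root       32M    2M        29M    8%  --
tmp       256M  256M         0B  100%  --
var        48M   48M         0B  100%  --'

printf '%s\n' "$sample" | full_mounts 90
```

On a host, the same filter could presumably be fed from the real command, for example vdf -h | full_mounts 90, provided awk is available in the shell.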
Andrew Hancock (VMware vExpert / EE Fellow)VMware and Virtualization Consultant

Commented:
Okay, running out of space.....

Look at the logs and delete them; same with /tmp.
Dan HeneryLead Infrastructure Engineer

Author

Commented:
I only have shell access. Should I just delete the file, or is there another way to clear the contents but leave the file?
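One common way to clear a file's contents while leaving the file in place is to truncate it with a shell redirection. The sketch below demonstrates this on a scratch file created with mktemp, so nothing real is touched; whether truncating a particular log is safe on these hosts is a separate question for VMware support.

```shell
# Demonstrated on a scratch file so no real log is modified.
f=$(mktemp)
printf 'old log line\n' > "$f"

# Truncate in place: the file itself stays, only its contents are dropped.
: > "$f"

size_after=$(wc -c < "$f" | tr -d ' ')
echo "size after truncation: $size_after bytes"
[ -e "$f" ] && echo "file still exists"

rm -f "$f"   # clean up the scratch file
```

The reason to prefer this over rm for a log that is still being written: deleting a file that a running process holds open does not release its space until that process closes it, whereas truncation shrinks the file the process already has open.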
Dan HeneryLead Infrastructure Engineer

Author

Commented:
Also, a file size of 12288... this is KB, correct?
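Whether 12288 means bytes or kilobytes depends on which command printed it. As a hedged illustration: ls -l and wc -c report sizes in bytes, while du -k reports disk usage in 1 KB blocks, so the same small file can show very different numbers. The scratch file below exists only for the demonstration.

```shell
f=$(mktemp)
# Write exactly 12 blocks of 1024 bytes, i.e. 12288 bytes.
dd if=/dev/zero of="$f" bs=1024 count=12 2>/dev/null

bytes=$(wc -c < "$f" | tr -d ' ')
echo "wc -c reports bytes:       $bytes"
echo "du -k reports 1 KB blocks: $(du -k "$f" | cut -f1)"

rm -f "$f"   # clean up the scratch file
```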
Andrew Hancock (VMware vExpert / EE Fellow)VMware and Virtualization Consultant

Commented:
You need to look at the contents of

/var/log and /tmp

and either copy the files off using WinSCP or delete them, then check whether that clears up the space if it is 100% full.
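To decide which files are worth copying off or removing, it helps to list the biggest ones first. The helper below is a minimal sketch using find, du, sort and head; the name largest_files is hypothetical, and it is demonstrated on a scratch directory rather than on /var/log or /tmp.

```shell
# largest_files DIR [N]: list the N biggest regular files under DIR,
# with sizes in KB, largest first. N defaults to 10.
largest_files() {
    find "$1" -type f -exec du -k {} \; 2>/dev/null | sort -rn | head -n "${2:-10}"
}

# Demonstration on a scratch directory with one big and one small file.
d=$(mktemp -d)
dd if=/dev/urandom of="$d/big.log" bs=1024 count=64 2>/dev/null
dd if=/dev/urandom of="$d/small.log" bs=1024 count=4 2>/dev/null

top=$(largest_files "$d" 1)
printf '%s\n' "$top"

rm -rf "$d"   # clean up the scratch directory
```

On a host this would be run as largest_files /var/log and largest_files /tmp, assuming the shell there provides the same find and du options.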
Dan HeneryLead Infrastructure Engineer

Author

Commented:
Can't do that because SSH and SCP connections are refused, even though SSH is enabled on both hosts.
Dan HeneryLead Infrastructure Engineer

Author

Commented:
I think I have no choice but to reboot at this point...
Andrew Hancock (VMware vExpert / EE Fellow)VMware and Virtualization Consultant

Commented:
oh, I see - this can happen when it fills up.....
Dan HeneryLead Infrastructure Engineer

Author

Commented:
OK... so I finally got VMware to remote in, and they found that an HPE VIB was part of the issue and that I need to apply this update: "esxi6.7uX-mgmt-bundle-3.4.5-8.zip". They deleted a file named ams-bbUsg.txt from the "tmp" folder.

In "var" there was a file, /var/log/mili/mili2d.txt, which is related to a QLogic driver. They deleted that, but it requires a reboot to reflect the recovered space. There is one more step to modify a config, but that will be done tomorrow, so I will post an update on that process when it is complete.
Andrew Hancock (VMware vExpert / EE Fellow)VMware and Virtualization Consultant

Commented:
I have seen the same problem before with iSCSI filling up the log space!

This can also cause FDM agent deployment to fail because there is no space!
Lead Infrastructure Engineer
Commented:
Just a final post. I worked with VMware and we had to change a config in the /etc/config/EMU/mili folder. The file name is libmili.conf. In that file there is a logging threshold parameter set at 4, and it needs to be changed to 0.
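The post does not give the parameter's name or the file's syntax, so the following is only a sketch of the general pattern: back up the file, then rewrite one setting. The key name HYPOTHETICAL_LOG_THRESHOLD, the "KEY = VALUE" layout, and the function name set_conf_value are all assumptions for illustration; check the real libmili.conf and VMware support's guidance before changing anything.

```shell
# set_conf_value FILE KEY VALUE: copy FILE to FILE.bak, then rewrite the
# line whose first field is KEY (assumed layout: "KEY = VALUE").
set_conf_value() {
    cp "$1" "$1.bak" || return 1
    awk -v key="$2" -v val="$3" '
        $1 == key && $2 == "=" { print key " = " val; next }
        { print }' "$1.bak" > "$1"
}

# Demonstrated on a scratch file; the key below is a placeholder and is
# NOT the real parameter name from libmili.conf.
conf=$(mktemp)
printf 'HYPOTHETICAL_LOG_THRESHOLD = 4\nother_setting = 1\n' > "$conf"

set_conf_value "$conf" HYPOTHETICAL_LOG_THRESHOLD 0
result=$(cat "$conf")
printf '%s\n' "$result"

rm -f "$conf" "$conf.bak"   # clean up the scratch files
```

Keeping the .bak copy next to the edited file makes it easy to roll back if the service that reads the config does not accept the change.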