Solved

VMware ESXi Host Disconnects from vCenter

Posted on 2014-02-10
Last Modified: 2014-12-02
Afternoon All,

My organization is having an intermittent problem with our ESXi 5.0 servers.  Last week our three ESXi hosts stopped responding and then disconnected from vCenter.  After going through the VMware KB article on what to do, the only thing we could do was a hard restart of the servers (we now know that is not best practice at all).

Today, another server went into the not responding state and then disconnected from vCenter.  We never had a purple screen on the console.  This time, I was not able to press F12 at the console and issue a restart command.  Last time I was able to; however, it just hung for almost an hour and we had to do a hard restart.

I was hoping someone could suggest where I should look for problems, or tell me whether anyone has encountered this problem before.  It's quite frustrating because I'm basically anticipating some type of severe issue every couple of days.  Also, every time this has happened, I've still been able to access the servers running on the hosts, e.g. Exchange.

Thanks in advance,
Question by:Anthony6890
28 Comments
 
LVL 117
ID: 39848612
1. Check server hardware is compatible and certified to run ESXi 5.0.

2. Has this just started to occur?

3. Are you on the latest patch release of 5.0.0, U3 plus patch 12 (i.e. the latest fixes from VMware for ESXi 5.0)?

4. Is this a LAN or WAN connection?

5. Again, the same with vCenter, is this the latest version of 5.0?

6. What is the overall use of Host resources, e.g. CPU and Memory, are we at 100%?

7. Any network changes, what is network topology?

8. Is an iSCSI SAN attached? if so what SAN?

9. Can the ESXi servers be "pinged" from the vCenter Server when they disconnect?

10. Can you ping from ESXi server to vCenter, when they disconnect?

11. Can you re-connect the servers?

12. Have you tried Restarting Network Management Agents on each host, when disconnected?

There are a lot of questions there, but the answers will help us diagnose this issue.

(it's quite common!)
 
LVL 1

Author Comment

by:Anthony6890
ID: 39848842
Thanks for responding back, here are the answers:

1.  Yes, we have checked all equipment for ESXi 5 compatibility.

2.  Yes, this issue just started last week.  We went 3 days before we observed a similar issue, just not as severe since it only happened to one host and not all 3.

3.  I will review the latest patches. I don't know if we are at the latest.

4.  This is a LAN connection.

5.  I do know vCenter is the latest version.

6.  CPU usage is very low, under 3%. For memory we are between 50-60% used on each server. There is also plenty of disk space. The smallest amount of space available is 400 GB.

7.  No recent network changes.  We installed a new firewall about a month ago. Also, we are limiting bandwidth to 100 Mbps for WAN simulation. That has been in place for about 3 weeks now, with no issues. We have only observed a peak of 50 Mbps.

8.  Yes, we have two SANs. They are both IBM SD2350.

9.  Yes, the servers can be pinged, but I have not tried from the vCenter server.

10.  I have not tried pinging from the ESXi server to vCenter.

11.  We cannot reconnect from vCenter.  Each time we had to do a hard restart to get them to reconnect.

12.  Yes, we tried restarting the agents, but it was unsuccessful.
 
LVL 13

Expert Comment

by:Abhilash
ID: 39849062
Can you check the log files or upload them here so we can see what's making this happen?
 
LVL 117
ID: 39849439
Check /var/log/vmkernel.log on the affected host.
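
A vmkernel.log can be long, so a quick first pass is to count the WARNING lines per component. The following is only a rough sketch in Python: the `WARNING: COMPONENT:` pattern is assumed from the excerpt quoted later in this thread, the helper names are made up for illustration, and the two sample lines are the ones from that excerpt.

```python
import re
from collections import Counter

# Assumed format: "WARNING: <COMPONENT>: ..." as seen in the excerpt in this thread.
WARNING_RE = re.compile(r"WARNING:\s*([A-Za-z0-9_]+):")

def summarize_warnings(lines):
    """Count WARNING lines per component tag (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

def summarize_file(path):
    """Run the same count over a log file copied off the host."""
    with open(path, errors="replace") as handle:
        return summarize_warnings(handle)

# Sample lines taken from the log excerpt posted in this thread.
sample = [
    'WARNING: VMW_SATP_LSI: satp_lsi_pathIsUsingPreferredController:714:'
    'Failed to get volume access control data for path "vmhba38:C2:T0:L1": Failure',
    'VMW_SATP_LSI: satp_lsi_updatePath:680: Failed to update path "vmhba38:C2:T0:L1" Failure',
]
print(summarize_warnings(sample))
```

To run it against a real log, copy the file off the host and call `summarize_file("vmkernel.log")`; a component that dominates the count is a reasonable place to start reading.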
 
LVL 1

Author Comment

by:Anthony6890
ID: 39850203
Morning guys, here is the log for the server that we observed this morning that was disconnected.  

Our consultants required a hard restart of this.
 
LVL 1

Author Comment

by:Anthony6890
ID: 39850226
Here is the log for the server that disconnected yesterday.
vmkernelforESXi3.txt
 
LVL 1

Author Comment

by:Anthony6890
ID: 39850230
I forgot to attach the log for the server that disconnected this morning.  It's attached here.
ESXI1-vmkernel.log
 
LVL 117
ID: 39850249
I would certainly test with ping -t (on Windows) to keep continuous pings running between the ESXi hosts and the vCenter Server, to rule out any communication issues. It is also worth checking that no strange firewall rules are causing this issue.
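
If you want a timestamped record of when connectivity drops, a small script can do a similar job to a continuous ping. This is only a sketch with some assumptions: it swaps ICMP ping for a TCP connect check (ICMP needs raw sockets and extra privileges in Python), port 443 is assumed to be open on the host's management interface, and the host name in the usage note is a placeholder.

```python
import socket
import time
from datetime import datetime

def tcp_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def watch(host, port, interval=5.0, checks=None):
    """Log one line per check and return the number of failed checks.

    checks=None keeps going until interrupted (Ctrl+C).
    """
    failures = 0
    done = 0
    while checks is None or done < checks:
        ok = tcp_reachable(host, port)
        failures += 0 if ok else 1
        stamp = datetime.now().isoformat(timespec="seconds")
        print(f"{stamp} {host}:{port} {'ok' if ok else 'UNREACHABLE'}")
        done += 1
        if checks is None or done < checks:
            time.sleep(interval)
    return failures
```

For example, `watch("esxi1.example.local", 443)` run from the vCenter Server would log a line every five seconds. If the host drops out of vCenter while these checks keep succeeding, the network is less likely to be the cause.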

I'll look at logs...
 
LVL 1

Author Comment

by:Anthony6890
ID: 39850252
Will do Andrew.  Thanks for the suggestion.
 
LVL 117

Accepted Solution

by:Andrew Hancock (VMware vExpert / EE MVE) earned 500 total points
ID: 39850257
Is your SAN iSCSI, and are you having any path issues, or have you removed any LUNs recently without un-mounting or masking the LUNs before removal?

The reason I mention this is that iSCSI does have a bug where, if you just remove LUNs, the ESXi server can go into a "loop" waiting and polling for the LUNs to come back, then starts to become unresponsive and disconnects from vCenter Server.

And just quickly looking at the logs, I can see some datastore, volume and path issues.

What's up with this LUN?

 "naa.60080e50002483ee0000031d4f157b20" on path "vmhba38:C3:T0:L1" Failed:

WARNING: VMW_SATP_LSI: satp_lsi_pathIsUsingPreferredController:714:Failed to get volume access control data for path "vmhba38:C2:T0:L1": Failure

VMW_SATP_LSI: satp_lsi_updatePath:680: Failed to update path "vmhba38:C2:T0:L1" Failure
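
To see which paths keep failing across a whole log, the runtime names can be pulled out and tallied. A rough Python sketch: the `vmhbaN:C#:T#:L#` name is adapter, channel, target and LUN, the regular expression is based only on the three lines quoted above, and `failing_paths` is a made-up helper name.

```python
import re
from collections import Counter

# Runtime path name: adapter, channel, target, LUN (e.g. vmhba38:C2:T0:L1).
PATH_RE = re.compile(r'path "(vmhba\d+):C(\d+):T(\d+):L(\d+)"')

def failing_paths(lines):
    """Count failure lines per (adapter, channel, target, lun) tuple."""
    counts = Counter()
    for line in lines:
        if "Fail" not in line:
            continue
        for adapter, channel, target, lun in PATH_RE.findall(line):
            counts[(adapter, int(channel), int(target), int(lun))] += 1
    return counts

# The three lines quoted from the log in this post.
excerpt = [
    '"naa.60080e50002483ee0000031d4f157b20" on path "vmhba38:C3:T0:L1" Failed:',
    'WARNING: VMW_SATP_LSI: satp_lsi_pathIsUsingPreferredController:714:'
    'Failed to get volume access control data for path "vmhba38:C2:T0:L1": Failure',
    'VMW_SATP_LSI: satp_lsi_updatePath:680: Failed to update path "vmhba38:C2:T0:L1" Failure',
]
for path, count in failing_paths(excerpt).most_common():
    print(count, path)
```

If one channel or target accounts for most of the failures, that points at a specific controller or link rather than at the host as a whole.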
 
LVL 1

Author Comment

by:Anthony6890
ID: 39850268
Yes, the SAN is iSCSI.  How would I know if I am having any path issues?  We haven't removed any LUNs recently, but I can check with our consultants to see if they did anything.
 
LVL 117
ID: 39850285
Are any datastores disconnecting?

Do you have a datastore called VDI?
 
LVL 117
ID: 39850294
Check under Host Server > Configuration > Storage Adapters > iSCSI Software ... > Paths

Are all paths Active, none Dead ?
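
The same check can be done from the command line with `esxcli storage core path list` and a small parser. This is only a sketch: it assumes the output is made of blank-line-separated blocks of `Key: Value` lines that include `Runtime Name` and `State`, and the sample text below is hand-written with made-up path names and states, not real output from any host.

```python
def parse_path_list(text):
    """Split blank-line-separated 'Key: Value' blocks into a list of dicts."""
    records, current = [], {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            if current:
                records.append(current)
                current = {}
            continue
        key, sep, value = line.partition(": ")
        if sep:  # skip lines that are not 'Key: Value'
            current[key] = value
    if current:
        records.append(current)
    return records

def unhealthy_paths(records, healthy=("active", "standby")):
    """Return runtime names of paths whose State is not in `healthy`."""
    return [r.get("Runtime Name", "?") for r in records
            if r.get("State") not in healthy]

# Hand-written sample with hypothetical paths and states (an assumption).
SAMPLE = """\
   Runtime Name: vmhba0:C0:T0:L0
   Device: naa.EXAMPLE
   State: dead

   Runtime Name: vmhba0:C1:T0:L0
   Device: naa.EXAMPLE
   State: standby
"""
print(unhealthy_paths(parse_path_list(SAMPLE)))
```

Standby paths are treated as healthy here, since an active/passive array normally shows some paths in standby; only states such as dead are reported.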
 
LVL 117
ID: 39850298
Can you upload vCenter Logs?
 
LVL 1

Author Comment

by:Anthony6890
ID: 39850599
Hi guys, sorry for the delay.  Have been putting out some various fires.  

Yes, I do have a datastore called VDI.

When I go to the Configuration tab for the three hosts, all paths are Active, apart from some that are in Standby.

Yes to the vCenter logs, I will get them for you.

Again, thank you.
 
LVL 1

Author Comment

by:Anthony6890
ID: 39850626
Andrew, which logs would you like, all of them?
 
LVL 117
ID: 39850636
The vpxd.log files.
 
LVL 1

Author Comment

by:Anthony6890
ID: 39850659
Ok, give me a couple of minutes to get all the logs populated.
 
LVL 1

Author Comment

by:Anthony6890
ID: 39850770
Here were the most recent logs that I found.
vpxd-199.log
 
LVL 1

Author Comment

by:Anthony6890
ID: 39850774
Here is the most recent log.
vpxd-200.log
 
LVL 1

Author Comment

by:Anthony6890
ID: 39851441
Guys, I just got off the phone with VMware, who also reviewed some of the logs by logging in via PuTTY.  They have determined that it is a storage-related issue and not a network-related issue.  Our SANs are IBM DS3512 models, so we will be contacting IBM.

I'll keep you posted on the more detailed underlying issue.
 
LVL 117
ID: 39851596
That's what I wrote in http:#a39850257

The reason I mention this is that iSCSI does have a bug where, if you just remove LUNs, the ESXi server can go into a "loop" waiting and polling for the LUNs to come back, then starts to become unresponsive and disconnects from vCenter Server.

And the VDI datastore seems to crop up in the logs!

Which build of ESXi 5.0 are you using? These issues were supposed to have been resolved in U3.

But I've seen several issues like this in 5.1 and 5.5, when a LUN hangs and ESXi starts polling.
 
LVL 1

Author Comment

by:Anthony6890
ID: 39853417
Morning Andrew,

Sorry, yes, I do know you said that the datastore could be the issue; I just needed more confirmation from the log review.

For the ESXi version, we are running ESXi 5.0.0, build 474610.
 
LVL 117
ID: 39853444
That's quite an early version of ESXi 5.0.

The latest build is U3, 1489271.

It might be worth a test of upgrading.
 
LVL 1

Author Comment

by:Anthony6890
ID: 39853956
We actually just had another server get disconnected from vCenter.  VMware was available to review the logs and again confirmed that it is a storage issue.  They told us to contact the SAN vendor, IBM, to investigate further.

Thank you for your help with this.
 
LVL 1

Author Closing Comment

by:Anthony6890
ID: 39853959
Andrew was spot on with the issue. We are reaching out to the storage vendor for more information as to why we are having issues with the SAN.
 
LVL 1

Expert Comment

by:BitTrekker
ID: 40477463
What was your fix for this?
 
LVL 117
ID: 40477495
@BitTrekker: Post a question, and I or fellow VMware Experts can assist you with your problem.