sara2000

asked on

ESXi management service

We have about 10 ESXi hosts in the cluster. We monitor connectivity to these hosts with software that uses the ping command. From time to time, for a few minutes, we do not get a response within the expected time (30 ms) from an ESXi host. The software simply pings the management IP of the ESXi host. Where should I look for the issue? I know the ESXi host is up and running and has not rebooted.
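For anyone wanting to reproduce this kind of check outside the monitoring product, here is a minimal Python sketch of the threshold test described above. It only parses a line of ping output and compares the round-trip time to 30 ms. The function names and the regular expression are my own assumptions for illustration, not how any particular monitoring tool works.

```python
import re
from typing import Optional

THRESHOLD_MS = 30.0  # the response-time threshold mentioned in the question

# Matches "time=0.412 ms" (Linux/macOS style) and "time<1ms" (Windows style).
_RTT_RE = re.compile(r"time[=<]\s*([\d.]+)\s*ms", re.IGNORECASE)


def parse_rtt_ms(ping_output: str) -> Optional[float]:
    """Return the first round-trip time found in the ping output, or None if there was no reply."""
    match = _RTT_RE.search(ping_output)
    return float(match.group(1)) if match else None


def classify(ping_output: str, threshold_ms: float = THRESHOLD_MS) -> str:
    """Label one ping result as 'ok', 'slow' (over the threshold) or 'no-reply'."""
    rtt = parse_rtt_ms(ping_output)
    if rtt is None:
        return "no-reply"
    return "slow" if rtt > threshold_ms else "ok"
```

Logging the label together with a timestamp for every probe makes it much easier to line the slow replies up against other activity on the host later.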
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)

Check the physical networking that is connected to the management interface of the hosts.

BUT it should be services you are monitoring, not necessarily the hosts.

Also, does vCenter notify you of a disconnection or loss of connection?
sara2000

ASKER

I did not get an alert from vCenter. I checked the physical NIC.
I believe this has nothing to do with the management services. We should be able to ping the IP even if the management service is down, am I correct?
Correct, the IP address is assigned to the management network, e.g. VMkernel.

Time to investigate your networking.

You need to look at the NICs, NIC firmware, physical switch firmware, routes, VLANs, ports, cables, teaming, and physical switch error counters.

How often does it happen, and what is your baseline? Maybe your network is heavily loaded or saturated (backups, vMotion) and packets get delayed.

If we wanted to MONITOR a host for uptime, we would not use the management network; we would use another private, isolated IP address.

So why did you pick this IP address and the management network?

What service are you monitoring?

You state "ESXi host connectivity". What is this?

Do all your VMs use the same interface, switch ports and IP address range?
A good item to check is the time frame when this occurs. We have seen this happen during the backup window due to the load on the host at that time, as well as the load from the monitoring application, etc.
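One way to check the time frame, assuming you can export timestamped ping results from your monitoring software, is to count the slow or lost replies per hour of day. This is only a sketch: the sample data is made up, and `slow_pings_by_hour` is a hypothetical helper name.

```python
from collections import Counter
from datetime import datetime


def slow_pings_by_hour(samples, threshold_ms=30.0):
    """Count samples that were lost or over the threshold, grouped by hour of day.

    samples: iterable of (ISO-8601 timestamp string, RTT in ms or None for no reply).
    """
    counts = Counter()
    for stamp, rtt in samples:
        if rtt is None or rtt > threshold_ms:
            counts[datetime.fromisoformat(stamp).hour] += 1
    return dict(sorted(counts.items()))


# Made-up data for illustration only.
samples = [
    ("2024-01-10T14:00:05", 0.4),
    ("2024-01-10T22:15:05", 48.0),
    ("2024-01-10T22:16:05", None),
    ("2024-01-10T23:01:05", 35.5),
    ("2024-01-11T09:30:05", 0.6),
]
print(slow_pings_by_hour(samples))  # {22: 2, 23: 1}
```

If the counts cluster in the hours when the backup jobs run, that points at the backup load rather than a cabling or switch fault.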
The monitoring software is pinging the hostname (myesxihost), and the DNS A record points to the IP address of the management network. We are not monitoring any services. There is no vMotion activity at that time.
As Paul mentioned, it only happens during the backup window! We have shared storage. Will the backup snapshotting and consolidation put load on the ESXi host or delay the ping response? I wonder why it does not happen on the other ESXi hosts. The iSCSI storage is on another VLAN with its own dedicated iSCSI NIC.
We have seen several backup applications cause this exact issue. With Commvault we have seen this more often, but with Veeam we can throttle the load. Our backup methodology uses iSCSI connections from the proxy servers to our backend storage. What backup software are you using? I would focus on the backup software rather than the other parameters.
But what does your test tell you?

You've just lost a ping because the network was BUSY!

Have you looked at the network counters during this time?

NETWORK SATURATED?

Backup across the network?
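To put a number on "saturated", you can take two readings of a port's byte counter (from the switch port statistics or the host's per-NIC statistics, for example) and work out the utilisation over that interval. A rough Python sketch of the arithmetic follows. The function name and the figures are assumptions for illustration, and it ignores counter wrap-around.

```python
def link_utilization_pct(bytes_start, bytes_end, interval_s, link_speed_mbps):
    """Percentage of link capacity used between two byte-counter readings.

    Assumes the counter did not wrap or reset between the two readings.
    """
    if interval_s <= 0 or link_speed_mbps <= 0:
        raise ValueError("interval_s and link_speed_mbps must be positive")
    bits_sent = (bytes_end - bytes_start) * 8
    capacity_bits = link_speed_mbps * 1_000_000 * interval_s
    return 100.0 * bits_sent / capacity_bits


# Example: 6.75 GB transferred in 60 seconds on a 1 Gbps link.
print(link_utilization_pct(0, 6_750_000_000, 60, 1000))  # 90.0
```

Sustained values near 100% on the management uplink during the backup window would explain delayed or dropped ping replies.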
I moved all the VMs from this ESXi host to another ESXi host, and I do not get delayed pings from the other ESXi hosts.
All the ESXi hosts are a similar make and model, and we use a vDS.
ASKER CERTIFIED SOLUTION
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)
This solution is only available to members.
We changed the backup schedules and performed firmware updates. We are not sure whether the problem was due to the firmware or the backups.
We are waiting to see whether we get the same problem again.