Oscar Powers

Troubleshoot VMWare VCenter issue

ESXi / vCenter 6.7: the NICs for VM traffic go down suddenly.

We have three ESXi servers with this physical network configuration:

vmnic0 & vmnic1 – 1Gb copper interfaces for management and vMotion
•   Configured via vSwitch0
•   Management is isolated to vmnic0; can fail over to vmnic1
•   vMotion is isolated to vmnic1; can fail over to vmnic0
vmnic2 & vmnic3 – 10Gb fiber interfaces for iSCSI storage and VM traffic; configured via DSwitch01
•   VM traffic traverses both interfaces
•   iSCSI traffic is load-balanced across both physical interfaces, via two vmkernel ports bound to each interface (noted below); this allows for two paths to each datastore
Starting two weeks ago, on one of the ESXi hosts vmnic2 & vmnic3 went down, but on the switch side the ports were up. We lost contact with all servers on this host, and vMotion failed for these machines.

The only solution is to shut down the ESXi server. When it comes back, the NICs are up.

I opened a case with VMware; they did not find a software issue. Same with DELL: no hardware problem.

I noticed that at the time of the three events a VEEAM backup of the file server was running and the VEEAM server was on the faulty ESXi host (one time on #1 and two times on #3).

Any idea how to start troubleshooting this issue?
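
One possible starting point is to record, from the host side, which uplinks report a dead link when the event hits, since the switch still shows the ports as up. The sketch below assumes pyVmomi is used to read `host.config.network.pnic`, where each physical NIC object exposes `device` and a `linkSpeed` that is unset (None) when the link is down. The `down_nics` helper and the `SAMPLE` stand-in objects are hypothetical and only illustrate the check; they are not data from this environment.

```python
from types import SimpleNamespace


def down_nics(pnics, watch=("vmnic2", "vmnic3")):
    """Return the names of watched physical NICs whose link is reported down.

    `pnics` is expected to look like pyVmomi's host.config.network.pnic:
    objects with a `device` name and a `linkSpeed` that is None when down.
    """
    return [p.device for p in pnics if p.device in watch and p.linkSpeed is None]


# Stand-in objects shaped like the pyVmomi ones (illustrative, not real readings).
SAMPLE = [
    SimpleNamespace(device="vmnic0", linkSpeed=SimpleNamespace(speedMb=1000, duplex=True)),
    SimpleNamespace(device="vmnic1", linkSpeed=SimpleNamespace(speedMb=1000, duplex=True)),
    SimpleNamespace(device="vmnic2", linkSpeed=None),
    SimpleNamespace(device="vmnic3", linkSpeed=None),
]

if __name__ == "__main__":
    print(down_nics(SAMPLE))
```

Running a check like this on a schedule and logging the result with a timestamp would give something to line up against the backup job times.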

VMware, Virtualization


8/22/2022 - Mon
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)

It-s-VMware.png
Now that's out the way!

Are you referring to

1. The host VMNIC ?
2. The virtual machine VM (maybe VCSA)

If you've had VMware and DELL look at the issue with a hands on remote approach and look at logs, we can try to help, but we are limited in our actions!

Can we have screenshots of your networking on each host ?

We've got some quick-fire recommendations:

1. Management Interfaces - vmnic0 and vmnic1 - Management Only. vSwitch0.

2. vMotion Interfaces - vmnicX and vmnicY - vSwitch1 - vMotion only - enable Jumbo Frames if hardware allows.

3. iSCSI Interfaces - vmnic2, and vmnic3 - vSwitchX - iSCSI only - nothing else, dedicated Storage Network.

This can be done with physical interfaces or VLANs

All services and storage networks should be on their own networks, and VMs on their own vSwitch and networks.
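
The separation rule above can be checked mechanically against a host's layout. The following is a minimal sketch, assuming the layout is written down by hand as a dictionary of switch, traffic type, and uplinks. The `LAYOUT` values are taken from the configuration described in the question; the function names and traffic labels are hypothetical.

```python
from collections import defaultdict

# Layout as described in the question: switch -> traffic type -> uplinks.
LAYOUT = {
    "vSwitch0": {
        "Management": ["vmnic0", "vmnic1"],
        "vMotion": ["vmnic1", "vmnic0"],
    },
    "DSwitch01": {
        "iSCSI": ["vmnic2", "vmnic3"],
        "VM traffic": ["vmnic2", "vmnic3"],
    },
}


def shared_uplinks(layout):
    """Map each uplink that carries more than one traffic type to those types."""
    usage = defaultdict(set)
    for roles in layout.values():
        for role, uplinks in roles.items():
            for nic in uplinks:
                usage[nic].add(role)
    return {nic: sorted(roles) for nic, roles in usage.items() if len(roles) > 1}


def storage_conflicts(layout, storage_role="iSCSI"):
    """Return only the shared uplinks where storage traffic is mixed with other traffic."""
    return {
        nic: roles
        for nic, roles in shared_uplinks(layout).items()
        if storage_role in roles
    }


if __name__ == "__main__":
    for nic, roles in storage_conflicts(LAYOUT).items():
        print(nic, "carries", ", ".join(roles))
```

For the layout in the question this flags vmnic2 and vmnic3, the two uplinks where iSCSI shares the wire with VM traffic.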

Losing the iSCSI connections to the SAN would render all datastores lost and leave the VMs hanging, with no communication to them at the same time anyway.

Is this just a single host which does this ?

Have you updated the firmware and/or the ESXi 6.7 host recently? Which build of 6.7 are you using?

Are you using DELL OEM ?
David Johnson, CD

How many network interface ports are on the physical machines? Looks like you should have 6.
Oscar Powers

ASKER
We have three PowerEdge R440 servers.
6.7.0 Update 3 (Build 17700523) Image profile (Updated) ESXi-6.7.0-20191204001-standard (VMware, Inc.)
These pictures were taken during the event.

The three servers have the same configuration

This event has already happened three times: once on esxi1 and twice on esxi3.
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)

Is this a new implementation, or has it been running for years and now has issues ?
Oscar Powers

ASKER
It has been running for years.
Not sure, but I noticed that when the problem showed up the backup server and the file server were on the same host. I changed the settings to keep both servers on different hosts.
I have gone a week without the issue.
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)

Maybe you need to revisit your network setup as suggested, if you have VM traffic and iSCSI traffic on the same network.
Oscar Powers

ASKER
Thanks for your help Andrew, I will check the network setup.
I like your recommendation of "All services and storage networks should be on their own networks, and VMs on their own vSwitch and networks."
This will take me some time because I am not an expert on VMware. I have to do a little of everything, so I need to do a lot of research. Maybe I will add an extra physical NIC.
I have not had issues with the NICs for a couple of weeks. I have to assume that the issue was an interaction between the new USB Anywhere device and the file server with the backup server.