johnnyt29

VMware ESXi 5.0 constantly resetting vmnic

I can't stay connected to my ESXi 5.0 host or its VMs, via either the vSphere Client or Windows RDP. The VMkernel log reports errors like:

1836 Netsched ... packets seem stuck... issuing reset of vmnic2
1817 ... scheduler [<some number>] lockup [stopped=0]...
1827 ... detected at 578999 while last xmit at 573549 and 27/20375
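
To see how often each NIC is being reset, one option is to count these messages in a saved copy of the log. Below is a minimal Python sketch, assuming only that the relevant lines contain the phrase `issuing reset of vmnicN` as in the excerpt above. The function name `count_nic_resets` and the sample lines are made up for illustration and are not real log output.

```python
import re
from collections import Counter

# Matches only the phrase seen in the excerpt above; the rest of the
# line format is deliberately not assumed.
RESET_RE = re.compile(r"issuing reset of (vmnic\d+)")


def count_nic_resets(lines):
    """Return a Counter mapping NIC name -> number of reset messages."""
    counts = Counter()
    for line in lines:
        match = RESET_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines, for illustration only.
    sample = [
        "Netsched: packets seem stuck, issuing reset of vmnic2",
        "some unrelated line",
        "Netsched: packets seem stuck, issuing reset of vmnic2",
    ]
    for nic, n in count_nic_resets(sample).items():
        print(nic, n)
```

If the resets all land on one vmnic, that points at that port, its cable, or its driver rather than at the host's network configuration as a whole.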

I did try to optimize the TCP stack (using TCP Optimizer 3 from http://www.speedguide.net/downloads.php) in one of my Windows VMs, then rebooted the host (for other reasons) just before this problem appeared. Could that have corrupted something, or is it just a coincidence?

Tried removing all other devices from my router except the ESXi host and a laptop to run the vSphere Client and RDP. Shut down all VMs to keep them from causing or contributing to the problem. Tried adjusting the router MTU (between 1492 and 1500), guessing that it was a packet fragmentation issue (although I thought the router should be able to handle that). Tried restarting all devices (host, router, laptop). The only thing I haven't done yet is reinstall ESXi on the USB thumb drive.
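
For the MTU guess, the arithmetic can be checked directly: an IPv4 ping that must not fragment can carry at most the MTU minus 20 bytes of IPv4 header (without options) and 8 bytes of ICMP header. A small Python sketch of that calculation follows; the function name `max_ping_payload` is made up for illustration.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header


def max_ping_payload(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    payload = mtu - IPV4_HEADER - ICMP_HEADER
    if payload < 0:
        raise ValueError("MTU too small for an IPv4 ICMP echo")
    return payload


if __name__ == "__main__":
    for mtu in (1492, 1500):
        print(mtu, max_ping_payload(mtu))  # 1492 -> 1464, 1500 -> 1472
```

From the Windows laptop, a ping with the don't-fragment flag set and that payload size (for example `ping -f -l 1472` against the router) should succeed at MTU 1500 and fail one byte larger, if fragmentation is behaving as expected on that path.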
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)

What server is ESXi installed on?

Are the NICs on the HCL?
ASKER CERTIFIED SOLUTION
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)
johnnyt29

ASKER

I'm using the motherboard NICs (I tried both), which have been working fine since I got the server and are working fine for others using the exact same hardware (not on the HCL; details available at http://tinkertry.com/vzilla).

By "errors on the switch", do you mean the VMware vSwitch or a physical switch? I removed all physical switches from the equation: both the vSphere Client machine and the host were connected directly to the router.
After the host sits idle for a while (no VMs running), I can't even ping my router anymore. When I shut down, I get a PSOD. It restarts fine, and I am able to ping the router again.
Physical switch. Check the port speeds and duplex settings.

If you use hardware that is not verified and certified, this is the risk you take.
I understand the risk I'm taking and it's the reason I'm paying for experts-exchange.

As mentioned, I removed all physical switches between the vSphere Client machine (laptop) and the router. Should I, or can I, connect the laptop NIC directly to the host NIC?

What makes a NIC that was working stop working all of a sudden? Barring a hardware failure (which can happen to HCL hardware too), I'm thinking a setting got changed somehow and needs to be changed back to what it was before the problem. No?
Under load, NICs that are not verified can fail on ESXi.

You may not find an answer on Experts Exchange for incompatible hardware.

You could try a crossover cable between server and laptop.
I moved to the other motherboard NIC and things are working now. I will have to boot up my Windows image and test the hardware; perhaps it just failed.

I had copied a large file earlier in the day with no problem, and had done so many times before too. I certainly wasn't loading up the NIC when I was troubleshooting and it was failing (using only ping, the vSphere Client, or RDP, with no VMs running).

I do realize what I may / may not find on EE and also realize the risks of using hardware that isn't supported by VMware. There's really no need to point that out every time I ask a question that may involve my hardware.

Thanks.
Are your DNS and routing settings correct on the ESX host? The VMkernel should use the same gateway as your physical network. Then check the vSwitch and make sure the NIC you want it to use is set as active and is on the same IP range as the rest of the network.
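
The gateway part of that check can be done on paper or with a few lines of Python: the gateway address has to fall inside the subnet defined by the VMkernel IP and netmask. This is a minimal sketch using the standard `ipaddress` module; the function name and every address in it are hypothetical examples, not values from this thread.

```python
import ipaddress


def gateway_on_same_subnet(host_ip, netmask, gateway_ip):
    """True if gateway_ip lies inside the subnet of host_ip/netmask."""
    # strict=False lets us pass a host address rather than a network address.
    network = ipaddress.ip_network(f"{host_ip}/{netmask}", strict=False)
    return ipaddress.ip_address(gateway_ip) in network


if __name__ == "__main__":
    # Hypothetical addresses for illustration only.
    print(gateway_on_same_subnet("192.168.1.50", "255.255.255.0", "192.168.1.1"))
    print(gateway_on_same_subnet("192.168.1.50", "255.255.255.0", "192.168.0.1"))
```

A gateway outside the VMkernel subnet would break routed traffic but would not by itself explain NIC resets, so this only rules out one class of misconfiguration.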
Yep, they're fine. I hadn't made any changes to those settings before the failure.

I ended up reinstalling ESXi on my thumb drive for that and another reason. I also installed an older PCI Intel 1000 Pro NIC that I'm testing right now (so far so good).

I tested the failed motherboard NIC under Windows 7 and it was fine (so not a hardware problem, unless it's intermittent).

I'm going to try the motherboard NIC again after I test the separate NIC for performance and stability. Right now, in the absence of other suggestions, I suspect either a setting in my config or a corrupted file on the thumb drive holding my ESXi install.