How to monitor and verify status of an ESXi vmnic

Greetings Experts,

I have an ESXi host with a vmnic that was logged as down for some period,
which caused the host to crash.

The driver software for the vmnic is current. Before checking the cable and switch,
I would like to test the physical network card of the host, since I suspect
that the problem is caused by the network adapter.

This host currently has one non-production VM running. Could you please
recommend software to monitor the operation of the physical network
adapter on this host (i.e. to check whether there is any packet loss)?
mamelas Asked:
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
mamelas (Author) Commented:
Dear Andrew,

Thank you for your reply. Apologies, but I am a novice with vSphere
and even more so with vSphere's CLI.

What I actually want is to monitor whether my physical adapter goes down or loses packets.

Are you referring to the following command?
~ # esxcli network nic vlan stats get -n vmnic0
VLAN 0
   Packets received: 118
   Packets sent: 77
mamelas (Author) Commented:
Dear Andrew,

I ran the above command and was able to see the packets sent and received across the VLANs, but nothing more.

How can I see whether the physical adapter of the ESXi host is losing packets or going down during the day?
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
The other tool you could use is:

ethtool -S vmnic0

ethtool version 5
Usage:
ethtool DEVNAME Display standard information about device
        ethtool -s|--change DEVNAME     Change generic options
                [ speed 10|100|1000 ]
                [ duplex half|full ]
                [ port tp|aui|bnc|mii|fibre ]
                [ autoneg on|off ]
                [ phyad %%d ]
                [ xcvr internal|external ]
                [ wol p|u|m|b|a|g|s|d... ]
                [ sopass %%x:%%x:%%x:%%x:%%x:%%x ]
                [ msglvl %%d ]
        ethtool -a|--show-pause DEVNAME Show pause options
        ethtool -A|--pause DEVNAME      Set pause options
                [ autoneg on|off ]
                [ rx on|off ]
                [ tx on|off ]
        ethtool -c|--show-coalesce DEVNAME      Show coalesce options
        ethtool -C|--coalesce DEVNAME   Set coalesce options
                [adaptive-rx on|off]
                [adaptive-tx on|off]
                [rx-usecs N]
                [rx-frames N]
                [rx-usecs-irq N]
                [rx-frames-irq N]
                [tx-usecs N]
                [tx-frames N]
                [tx-usecs-irq N]
                [tx-frames-irq N]
                [stats-block-usecs N]
                [pkt-rate-low N]
                [rx-usecs-low N]
                [rx-frames-low N]
                [tx-usecs-low N]
                [tx-frames-low N]
                [pkt-rate-high N]
                [rx-usecs-high N]
                [rx-frames-high N]
                [tx-usecs-high N]
                [tx-frames-high N]
                [sample-interval N]
        ethtool -g|--show-ring DEVNAME  Query RX/TX ring parameters
        ethtool -G|--set-ring DEVNAME   Set RX/TX ring parameters
                [ rx N ]
                [ rx-mini N ]
                [ rx-jumbo N ]
                [ tx N ]
        ethtool -k|--show-offload DEVNAME       Get protocol offload information
        ethtool -K|--offload DEVNAME    Set protocol offload
                [ rx on|off ]
                [ tx on|off ]
                [ sg on|off ]
                [ tso on|off ]
                [ ufo on|off ]
                [ gso on|off ]
        ethtool -i|--driver DEVNAME     Show driver information
        ethtool -d|--register-dump DEVNAME      Do a register dump
        ethtool -e|--eeprom-dump DEVNAME        Do a EEPROM dump
                [ raw on|off ]
                [ offset N ]
                [ length N ]
        ethtool -E|--change-eeprom DEVNAME      Change bytes in device EEPROM
                [ magic N ]
                [ offset N ]
                [ value N ]
        ethtool -r|--negotiate DEVNAME  Restart N-WAY negotation
        ethtool -p|--identify DEVNAME   Show visible port identification (e.g. blinking)
               [ TIME-IN-SECONDS ]
        ethtool -t|--test DEVNAME       Execute adapter self test
               [ online | offline ]
        ethtool -S|--statistics DEVNAME Show adapter statistics
        ethtool -h|--help DEVNAME       Show this help

Example NIC statistics for vmnic0 on my ESXi server:

NIC statistics:
     rx_bytes: 5954451296
     rx_error_bytes: 0
     tx_bytes: 138125239605
     tx_error_bytes: 0
     rx_ucast_packets: 7796962
     rx_mcast_packets: 4315949
     rx_bcast_packets: 26374339
     tx_ucast_packets: 114457902
     tx_mcast_packets: 0
     tx_bcast_packets: 5470422
     tx_mac_errors: 0
     tx_carrier_errors: 0
     rx_crc_errors: 0
     rx_align_errors: 0
     tx_single_collisions: 0
     tx_multi_collisions: 0
     tx_deferred: 0
     tx_excess_collisions: 0
     tx_late_collisions: 0
     tx_total_collisions: 0
     rx_fragments: 0
     rx_jabbers: 0
     rx_undersize_packets: 0
     rx_oversize_packets: 0
     rx_64_byte_packets: 22497959
     rx_65_to_127_byte_packets: 9035416
     rx_128_to_255_byte_packets: 3987152
     rx_256_to_511_byte_packets: 794017
     rx_512_to_1023_byte_packets: 711051
     rx_1024_to_1522_byte_packets: 1461655
     rx_1523_to_9022_byte_packets: 0
     tx_64_byte_packets: 10588091
     tx_65_to_127_byte_packets: 9358678
     tx_128_to_255_byte_packets: 5884689
     tx_256_to_511_byte_packets: 3970463
     tx_512_to_1023_byte_packets: 1594021
     tx_1024_to_1522_byte_packets: 88532382
     tx_1523_to_9022_byte_packets: 0
     rx_xon_frames: 0
     rx_xoff_frames: 0
     tx_xon_frames: 0
     tx_xoff_frames: 0
     rx_mac_ctrl_frames: 0
     rx_filtered_packets: 0
     rx_ftq_discards: 0
     rx_discards: 0
     rx_fw_discards: 0
     [0] rx_packets: 38487253
     [0] rx_bytes: 5800502586
     [0] rx_errors: 0
     [0] tx_packets: 0
     [0] tx_bytes: 0
     [1] rx_packets: 0
     [1] rx_bytes: 0
     [1] rx_errors: 0
     [1] tx_packets: 0
     [1] tx_bytes: 0
     [2] rx_packets: 0
     [2] rx_bytes: 0
     [2] rx_errors: 0
     [2] tx_packets: 7068
     [2] tx_bytes: 424990
     [3] rx_packets: 0
     [3] rx_bytes: 0
     [3] rx_errors: 0
     [3] tx_packets: 0
     [3] tx_bytes: 0
     [4] rx_packets: 0
     [4] rx_bytes: 0
     [4] rx_errors: 0
     [4] tx_packets: 37388216
     [4] tx_bytes: 132658246619
     [5] rx_packets: 0
     [5] rx_bytes: 0
     [5] rx_errors: 0
     [5] tx_packets: 0
     [5] tx_bytes: 0
     [6] rx_packets: 0
     [6] rx_bytes: 0
     [6] rx_errors: 0
     [6] tx_packets: 0
     [6] tx_bytes: 0
     [7] rx_packets: 0
     [7] rx_bytes: 0
     [7] rx_errors: 0
     [7] tx_packets: 0
     [7] tx_bytes: 0
     [8] rx_packets: 0
     [8] rx_bytes: 0
     [8] rx_errors: 0
     [8] tx_packets: 0
     [8] tx_bytes: 0


This will show you more statistics than you can shake a stick at!

If the NIC goes down, it will be logged (under /var/log).

If you are losing packets, check the physical switch side as well, and compare its counters with the stats from the ESXi side. Also check that the duplex and speed settings are correct, that the teaming policy is correct, and that the switch is configured correctly.
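
To make the `ethtool -S` output above easier to scan, here is a minimal Python sketch (not something posted in the thread) that takes the saved text of the command and lists only the non-zero counters whose names look like errors or drops. The keyword list and the function names `parse_ethtool_stats` and `suspicious_counters` are illustrative assumptions; counter names depend on the NIC driver, so adjust the keywords to what your adapter reports.

```python
import re

# Substrings that usually mark a problem counter. This keyword list is an
# assumption for illustration; counter names differ between NIC drivers.
SUSPECT_KEYWORDS = ("error", "drop", "discard", "collision", "crc", "missed", "timeout")


def parse_ethtool_stats(text):
    """Turn saved `ethtool -S` output into a {counter_name: value} dict."""
    stats = {}
    for line in text.splitlines():
        # Matches lines such as "     rx_crc_errors: 0" or "     [0] rx_errors: 0".
        # The "NIC statistics:" header has no number and is skipped.
        match = re.match(r"^\s*(.+?):\s*(-?\d+)\s*$", line)
        if match:
            stats[match.group(1).strip()] = int(match.group(2))
    return stats


def suspicious_counters(stats):
    """Return only the non-zero counters whose name contains a suspect keyword."""
    return {
        name: value
        for name, value in stats.items()
        if value != 0 and any(key in name.lower() for key in SUSPECT_KEYWORDS)
    }


# A few lines copied from the vmnic0 output posted above.
sample = """NIC statistics:
     rx_bytes: 5954451296
     rx_error_bytes: 0
     tx_carrier_errors: 0
     rx_crc_errors: 0
     rx_discards: 0
     [0] rx_errors: 0
"""

print(suspicious_counters(parse_ethtool_stats(sample)))
```

For the sample lines this prints an empty dict, which matches the "no errors" reading of that output. The switch-side port counters still have to be checked with the switch's own tools.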
mamelas (Author) Commented:
I ran the provided command; the results are shown below.

Q1) Which value should I look for?

Q2) My vmnic was logged as down, and that is why I am looking for a way to track
the operation of the physical network adapter.
I am currently running WinMTR on a VM that resides on the ESXi host that had the vmnic down.

What other software or command could I run to verify whether the network card
needs replacement?


NIC statistics:
     rx_packets: 24265003
     tx_packets: 29870440
     rx_bytes: 94326538311
     tx_bytes: 80321634969
     rx_errors: 0
     tx_errors: 0
     rx_dropped: 0
     tx_dropped: 0
     multicast: 901116
     collisions: 0
     rx_over_errors: 0
     rx_crc_errors: 0
     rx_frame_errors: 0
     rx_fifo_errors: 0
     rx_missed_errors: 0
     tx_aborted_errors: 0
     tx_carrier_errors: 0
     tx_fifo_errors: 0
     tx_heartbeat_errors: 0
     rx_pkts_nic: 24265012
     tx_pkts_nic: 28513704
     rx_bytes_nic: 94456990442
     tx_bytes_nic: 80390773582
     lsc_int: 19
     tx_busy: 0
     non_eop_descs: 0
     broadcast: 693443
     rx_no_buffer_count: 0
     tx_timeout_count: 0
     tx_restart_queue: 0
     rx_long_length_errors: 0
     rx_short_length_errors: 0
     tx_flow_control_xon: 0
     rx_flow_control_xon: 0
     tx_flow_control_xoff: 0
     rx_flow_control_xoff: 0
     rx_csum_offload_errors: 0
     rx_header_split: 0
     alloc_rx_page_failed: 0
     alloc_rx_buff_failed: 0
     rx_no_dma_resources: 0
     hw_rsc_aggregated: 0
     hw_rsc_flushed: 0
     fdir_match: 0
     fdir_miss: 0
     fdir_overflow: 0
     fcoe_bad_fccrc: 0
     fcoe_last_errors: 0
     rx_fcoe_dropped: 0
     rx_fcoe_packets: 0
     rx_fcoe_dwords: 0
     fcoe_noddp: 0
     fcoe_noddp_ext_buff: 0
     tx_fcoe_packets: 0
     tx_fcoe_dwords: 0
     os2bmc_rx_by_bmc: 0
     os2bmc_tx_by_bmc: 0
     os2bmc_tx_by_host: 0
     os2bmc_rx_by_host: 0
     tx_queue_0_packets: 8141800
     tx_queue_0_bytes: 9248195099
     tx_queue_1_packets: 14640428
     tx_queue_1_bytes: 63747742911
     tx_queue_2_packets: 7087530
     tx_queue_2_bytes: 7321763899
     tx_queue_3_packets: 682
     tx_queue_3_bytes: 3933060
     rx_queue_0_packets: 9448926
     rx_queue_0_bytes: 24097958297
     rx_queue_1_packets: 9833255
     rx_queue_1_bytes: 35504711793
     rx_queue_2_packets: 4982822
     rx_queue_2_bytes: 34723868221
     rx_queue_3_packets: 0
     rx_queue_3_bytes: 0
     tx_pb_0_pxon: 0
     tx_pb_0_pxoff: 0
     tx_pb_1_pxon: 0
     tx_pb_1_pxoff: 0
     tx_pb_2_pxon: 0
     tx_pb_2_pxoff: 0
     tx_pb_3_pxon: 0
     tx_pb_3_pxoff: 0
     tx_pb_4_pxon: 0
     tx_pb_4_pxoff: 0
     tx_pb_5_pxon: 0
     tx_pb_5_pxoff: 0
     tx_pb_6_pxon: 0
     tx_pb_6_pxoff: 0
     tx_pb_7_pxon: 0
     tx_pb_7_pxoff: 0
     rx_pb_0_pxon: 0
     rx_pb_0_pxoff: 0
     rx_pb_1_pxon: 0
     rx_pb_1_pxoff: 0
     rx_pb_2_pxon: 0
     rx_pb_2_pxoff: 0
     rx_pb_3_pxon: 0
     rx_pb_3_pxoff: 0
     rx_pb_4_pxon: 0
     rx_pb_4_pxoff: 0
     rx_pb_5_pxon: 0
     rx_pb_5_pxoff: 0
     rx_pb_6_pxon: 0
     rx_pb_6_pxoff: 0
     rx_pb_7_pxon: 0
     rx_pb_7_pxoff: 0
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
I don't see any errors. The dropped and error counters are the ones I would look at.

As for your NIC, which is shown as down, I would check:

1. Physical switch configuration.
2. Network cable.
3. ESXi configuration settings (speed and duplex); make sure these match the switch.
4. Firmware (update firmware from the vendor).
5. Check the NIC is on the VMware HCL.
6. Check the server is on the VMware HCL.
7. Replace the NIC.

So, what server make and model and what firmware are you using, and what version of ESXi (build and patch number)?

ethtool can run tests, but if you suspect the network interface it will be quicker to try a different one. Firmware and configuration can also be the cause.

Use vendor-based diagnostics to check the network interface and hardware; there are no other tools in the ESXi OS.
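
Since a counter only tells you something if it grows, one way to watch the values discussed above is to save the `ethtool -S` output twice (for example a day apart) and compare the two. Below is a rough Python sketch under assumptions: the counter names in `WATCH` are taken from the output posted earlier in this thread, the second snapshot's values are invented for illustration, and reading `lsc_int` as a count of link-status-change interrupts is an assumption about Intel-style drivers rather than something confirmed here.

```python
# Counter names taken from the ethtool output posted in this thread.
# Treating lsc_int as "link status change interrupts" is an assumption.
WATCH = ("rx_errors", "tx_errors", "rx_dropped", "tx_dropped",
         "rx_crc_errors", "rx_missed_errors", "lsc_int")


def parse_counters(text):
    """Parse 'name: value' lines from saved `ethtool -S` output into a dict."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.rpartition(":")
        value = value.strip()
        if sep and value.isdigit():
            counters[name.strip()] = int(value)
    return counters


def changed_counters(before, after, watch=WATCH):
    """Return {name: increase} for watched counters that grew between snapshots.

    Counters generally restart from zero after a reboot, so a negative
    difference is ignored rather than reported.
    """
    deltas = {}
    for name in watch:
        increase = after.get(name, 0) - before.get(name, 0)
        if increase > 0:
            deltas[name] = increase
    return deltas


# First snapshot: values from the output posted above.
first = """NIC statistics:
     rx_errors: 0
     tx_errors: 0
     rx_dropped: 0
     rx_crc_errors: 0
     lsc_int: 19
"""

# Second snapshot: hypothetical values, invented for this example only.
second = """NIC statistics:
     rx_errors: 0
     tx_errors: 0
     rx_dropped: 0
     rx_crc_errors: 0
     lsc_int: 23
"""

print(changed_counters(parse_counters(first), parse_counters(second)))
```

With these two snapshots only `lsc_int` is reported, with an increase of 4. If a counter like that keeps climbing while the cable and switch port stay unchanged, it is at least a dated record to hand to the vendor.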
mamelas (Author) Commented:
If we focus on the physical network card, and since I have to prove to the manufacturer that there is a problem and that it should be replaced, what steps would you follow?

1) Check cable
2) Check firmware (the VMware Support Engineer confirmed that it is current)
Anything else?
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
I would go through my steps 1-7 above.

Do you have more than a single port or NIC?
mamelas (Author) Commented:
I have 2 ESXi hosts:

ESXi-1 and ESXi-2.
Each host has only one physical adapter with 2 network ports.

Currently I have vMotioned all the VMs to ESXi-1, since there were logs
on ESXi-2 showing that the vmnic was down.
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
Have you tried both network ports, and do both behave the same?
mamelas (Author) Commented:
Dear Andrew,

No, I have not. Since this ESXi host is a production one, I wanted to keep the port and cable as they are and try to capture the incident when it recurs.

Then I will change the port and cable.

I am trying to find software for capturing the network traffic between my ESXi host and the switch.
I am currently using WinMTR, which sends 1000-byte packets from a VM (that resides on this ESXi host) to the network switch.

I am wondering: will that be enough to help me verify the network card's state?
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
Is the port flapping on and off?
mamelas (Author) Commented:
Absolutely yes! The VMware Support Engineer who accessed my ESXi host told me that the port/network card was flapping, which caused the host to freeze and consequently all the VMs on it.

He also found from the logs that the network card had been flapping one day earlier.

We therefore vMotioned all the VMs to the second ESXi host until I find the cause of the error.

We also restarted the frozen ESXi host, and as of today I cannot see any NEW flapping in the logs...

Any suggestions?
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
Flapping is caused by:

1. Faulty network card.
2. Incorrect configuration of ports and switches.
3. Network cable.
4. Incorrect firmware and driver.
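
As a way of tracking flapping over time, the link up/down messages can be counted per day from a copy of the host's log. This is a minimal Python sketch under stated assumptions: the sample lines are made up for illustration, the regular expression is a guess at the general shape of such messages (an ISO-style date at the start of the line, the vmnic name, then the words "link" and "up" or "down"), and `count_link_events` is a hypothetical helper. Check the pattern against what your driver actually writes (on ESXi 5.x the kernel log is commonly /var/log/vmkernel.log).

```python
import re
from collections import Counter

# Assumption: each relevant line starts with a YYYY-MM-DD date and mentions
# the vmnic name followed by "link" and "up" or "down". Real driver messages
# vary between drivers and ESXi versions; adjust the pattern to match yours.
LINK_EVENT = re.compile(
    r"^(?P<day>\d{4}-\d{2}-\d{2})\S*.*?\b(?P<nic>vmnic\d+)\b"
    r".*?\blink\b.*?\b(?P<state>up|down)\b",
    re.IGNORECASE,
)


def count_link_events(lines, nic="vmnic0"):
    """Count link up/down messages per (day, state) for one vmnic."""
    events = Counter()
    for line in lines:
        match = LINK_EVENT.search(line)
        if match and match.group("nic").lower() == nic.lower():
            events[(match.group("day"), match.group("state").lower())] += 1
    return events


# Hypothetical log lines, invented for this example only.
sample_lines = [
    "2014-03-10T08:15:02.114Z cpu2:4098)vmnic1: NIC Link is Down",
    "2014-03-10T08:15:09.530Z cpu2:4098)vmnic1: NIC Link is Up 1000 Mbps",
    "2014-03-10T09:40:11.002Z cpu0:4096)vmnic1: NIC Link is Down",
    "2014-03-10T09:41:00.000Z cpu5:4101)some unrelated message",
]

events = count_link_events(sample_lines, nic="vmnic1")
for (day, state), count in sorted(events.items()):
    print(day, state, count)

# With a real log copied off the host, pass the open file instead:
#   with open("vmkernel.log") as log:
#       events = count_link_events(log, nic="vmnic1")
```

For the invented sample this counts two "down" events and one "up" event for vmnic1 on the same day. Several transitions per day on one port, with nothing touched on the cable or switch, is the kind of pattern the support engineer described as flapping.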
mamelas (Author) Commented:
Thanks Andrew,

And how can I verify whether the network card is faulty?
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
Substitution
mamelas (Author) Commented:
You mean replacing the network card with a new one?

If so: since the server is under warranty, I have to prove to the manufacturer that the card is faulty in order to be entitled to a new one.

Is there any software that could prove this?
Question has a verified solution.
