gopher_49
 asked on

ESXi v5.5 communication issues with my iSCSI SAN - Dell md3200

My ESXi v5.5 Dell Update 3 server stopped talking to my Dell md3200i iSCSI SAN, which caused all VMs to stop responding. The iSCSI SAN was deployed following best practices, with a dedicated VLAN and vSwitch for each iSCSI port, and I'm using the VMware method for redundant pathing. Where do I start troubleshooting what happened? I downloaded the vmkernel logs. At the time of the outage I had errors pertaining to performance and to not being able to talk to my iSCSI SAN. Now I just see entries like the ones below, and when I see these, things are working just fine. But a day ago I had other entries about the host not being able to communicate with the SAN. Should I update the firmware on my Dell md3200i?

2017-03-06T20:52:37.456Z cpu6:34103)ScsiDeviceIO: 2363: Cmd(0x412e807f3b00) 0x4d, CmdSN 0x22 from world 34103 to dev "naa.6001c230d9b57b00164feeed1031d490" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
2017-03-06T20:52:37.456Z cpu6:34103)ScsiDeviceIO: 2363: Cmd(0x412e807f3b00) 0x1a, CmdSN 0x23 from world 34103 to dev "naa.6001c230d9b57b00164feeed1031d490" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
2017-03-06T20:59:27.835Z cpu6:32795)NMP: nmp_ThrottleLogForDevice:2349: Cmd 0x1a (0x412e80819800, 0) to dev "mpx.vmhba0:C0:T0:L0" on path "vmhba0:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0. Act:NONE
2017-03-07T14:19:28.194Z cpu5:32794)WARNING: ScsiDeviceIO: 1223: Device naa.6842b2b00012d53300008bd957dbb540 performance has deteriorated. I/O latency increased from average value of 24108 microseconds to 602199 microseconds.
2017-03-07T14:19:29.808Z cpu1:36047)ScsiDeviceIO: 1203: Device naa.6842b2b00012d53300008bd957dbb540 performance has improved. I/O latency reduced from 602199 microseconds to 119635 microseconds.
2017-03-07T14:19:33.209Z cpu6:36250)ScsiDeviceIO: 1203: Device naa.6842b2b00012d53300008bd957dbb540 performance has improved. I/O latency reduced from 119635 microseconds to 47817 microseconds.
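
The "failed H:... D:... P:... Valid sense data" entries above can be decoded with the standard SCSI code tables. The sketch below is a minimal Python example, not a VMware tool: the function name `decode_scsi_failure` and the lookup tables are illustrative assumptions, and the tables cover only the codes that appear in these log lines. Read this way, the first three entries show a device answering CHECK CONDITION / ILLEGAL REQUEST to a LOG SENSE (0x4d) or MODE SENSE(6) (0x1a) command, that is, rejecting a command or field it does not support, with host status 0x0. That fits the observation that things work fine while these entries appear, and one of them is for a local device (mpx.vmhba0) rather than the SAN.

```python
import re

# Lookup tables limited to the codes seen in the log lines above.
# Values follow the standard SCSI opcode/status/sense assignments.
SCSI_OPCODES = {0x1A: "MODE SENSE(6)", 0x4D: "LOG SENSE"}
DEVICE_STATUS = {0x0: "GOOD", 0x2: "CHECK CONDITION"}
SENSE_KEYS = {0x5: "ILLEGAL REQUEST"}
ASC_ASCQ = {
    (0x20, 0x00): "INVALID COMMAND OPERATION CODE",
    (0x24, 0x00): "INVALID FIELD IN CDB",
}

HEX = r"(0x[0-9a-f]+)"
# Handles both "Cmd(0x...) 0x4d" and "Cmd 0x1a (0x..., 0)" layouts.
OPCODE_RE = re.compile(r"Cmd(?:\(0x[0-9a-f]+\))? " + HEX, re.I)
DEVICE_RE = re.compile(r'dev "([^"]+)"')
STATUS_RE = re.compile(rf"H:{HEX} D:{HEX} P:{HEX}", re.I)
SENSE_RE = re.compile(rf"sense data: {HEX} {HEX} {HEX}", re.I)


def decode_scsi_failure(line):
    """Return a dict describing one failed-command log line, or None."""
    op = OPCODE_RE.search(line)
    dev = DEVICE_RE.search(line)
    status = STATUS_RE.search(line)
    sense = SENSE_RE.search(line)
    if not (op and dev and status and sense):
        return None
    opcode = int(op.group(1), 16)
    host, device, plugin = (int(v, 16) for v in status.groups())
    key, asc, ascq = (int(v, 16) for v in sense.groups())
    return {
        "device": dev.group(1),
        "command": SCSI_OPCODES.get(opcode, hex(opcode)),
        "host_status": host,  # 0 means the host side reported no error
        "device_status": DEVICE_STATUS.get(device, hex(device)),
        "plugin_status": plugin,
        "sense_key": SENSE_KEYS.get(key, hex(key)),
        "additional_sense": ASC_ASCQ.get((asc, ascq), f"{asc:#x}/{ascq:#x}"),
    }
```

Lines that come back with a non-zero `host_status` are the ones more likely to point at a path or transport problem; codes missing from the tables are returned as raw hex rather than guessed.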

Last Comment
gopher_49

8/22/2022 - Mon
Grant Berezan

For intermittent iSCSI communication issues with a SAN, I would be more inclined to look at the switches involved. Do you have paired switches, with your communication paths split between vSwitches? Perhaps what you are seeing is an imminent failure in one switch: when the traffic hits a certain load, it breaks. The latency messages are common; the more traffic your SAN gets, the higher the I/O latency. Also, have you updated all your VMware hosts to the latest patch version for 5.5?
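
To check whether the latency messages line up with load on a particular switch or path, it can help to pull them out of the vmkernel log with their timestamps. The following is a rough Python sketch that assumes the message wording matches the "performance has deteriorated/improved" lines quoted in the question; the names `latency_events` and `worst_latency_by_device` are made up for illustration.

```python
import re

# Matches the "performance has deteriorated/improved" lines as they
# appear in the question; other ESXi builds may word them differently.
LATENCY_RE = re.compile(
    r"^(?P<ts>\S+) .*Device (?P<dev>\S+) performance has "
    r"(?P<kind>deteriorated|improved)\. I/O latency (?:increased|reduced) "
    r"from (?:average value of )?(?P<old>\d+) microseconds "
    r"to (?P<new>\d+) microseconds"
)


def latency_events(lines):
    """Extract latency change events, converting microseconds to ms."""
    events = []
    for line in lines:
        m = LATENCY_RE.search(line.strip())
        if m:
            events.append({
                "timestamp": m.group("ts"),
                "device": m.group("dev"),
                "kind": m.group("kind"),
                "old_ms": int(m.group("old")) / 1000,
                "new_ms": int(m.group("new")) / 1000,
            })
    return events


def worst_latency_by_device(events):
    """Return the highest latency (ms) reported per device on deterioration."""
    worst = {}
    for e in events:
        if e["kind"] == "deteriorated":
            worst[e["device"]] = max(worst.get(e["device"], 0.0), e["new_ms"])
    return worst
```

Passing an open log file to `latency_events` (for example `latency_events(open("vmkernel.log"))`, where the file name is a placeholder) gives a list that can be compared against switch port counters or backup windows for the same timestamps.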
gopher_49

ASKER
My ESXi hosts are running Dell Update 3, which I think is the most current. My switches seem to be fine, and other hosts are not having issues. Each iSCSI port is in its own VLAN and dedicated vSwitch. I think the md3200 firmware is old. Also, it's running RAID 6 and one drive has failed.
ASKER CERTIFIED SOLUTION
Grant Berezan

gopher_49

ASKER
I'll post the vmkernel logs from the day I had issues; I think they have what you're looking for. Does the vmkernel log file keep its entries after a reboot? Is it safe to post them here?
gopher_49

ASKER
OrcaKnight,

I looked at the managed switch ports that the ESXi host and SAN are plugged into.  I don't see any errors on any of the switch ports.  Which ESXi log should I review to see if the NIC is having issues?
gopher_49

ASKER
I'm not sure exactly what was wrong, but there was a communication issue between the NIC and the iSCSI network.

Thanks.