gopher_49

ESXi v5.5 communication issues with my iSCSI SAN - Dell MD3200i

My ESXi v5.5 server (Dell image, Update 3) quit talking to my Dell MD3200i iSCSI SAN, which caused all VMs to stop responding. The iSCSI SAN was deployed following best practice, with a dedicated VLAN and vSwitch for each iSCSI port, and I'm using the VMware method for redundant pathing. Where do I start troubleshooting what happened? I downloaded the vmkernel logs. At the time of the outage they showed errors about performance and about not being able to talk to the iSCSI SAN. Now I only see entries like the ones below, and when I see these, things are working just fine. A day ago, though, I had other entries about the host not being able to communicate with the SAN. Should I maybe update the firmware on my Dell MD3200i?

2017-03-06T20:52:37.456Z cpu6:34103)ScsiDeviceIO: 2363: Cmd(0x412e807f3b00) 0x4d, CmdSN 0x22 from world 34103 to dev "naa.6001c230d9b57b00164feeed1031d490" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
2017-03-06T20:52:37.456Z cpu6:34103)ScsiDeviceIO: 2363: Cmd(0x412e807f3b00) 0x1a, CmdSN 0x23 from world 34103 to dev "naa.6001c230d9b57b00164feeed1031d490" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
2017-03-06T20:59:27.835Z cpu6:32795)NMP: nmp_ThrottleLogForDevice:2349: Cmd 0x1a (0x412e80819800, 0) to dev "mpx.vmhba0:C0:T0:L0" on path "vmhba0:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0. Act:NONE
2017-03-07T14:19:28.194Z cpu5:32794)WARNING: ScsiDeviceIO: 1223: Device naa.6842b2b00012d53300008bd957dbb540 performance has deteriorated. I/O latency increased from average value of 24108 microseconds to 602199 microseconds.
2017-03-07T14:19:29.808Z cpu1:36047)ScsiDeviceIO: 1203: Device naa.6842b2b00012d53300008bd957dbb540 performance has improved. I/O latency reduced from 602199 microseconds to 119635 microseconds.
2017-03-07T14:19:33.209Z cpu6:36250)ScsiDeviceIO: 1203: Device naa.6842b2b00012d53300008bd957dbb540 performance has improved. I/O latency reduced from 119635 microseconds to 47817 microseconds.
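
For reference, the status fields in the first three entries can be decoded mechanically. The following is a minimal Python sketch, not a VMware tool: the function name `decode` and the lookup tables are illustrative, and the tables only cover the standard SCSI codes that appear in the lines above plus a few common ones. `H:`, `D:` and `P:` are the host, device and plugin status, and the three sense values are the sense key, ASC and ASCQ.

```python
import re

# Partial lookup tables of standard SCSI codes (assumption: only the values
# needed for the entries above, plus a few common ones, are listed).
DEVICE_STATUS = {0x0: "GOOD", 0x2: "CHECK CONDITION", 0x8: "BUSY",
                 0x18: "RESERVATION CONFLICT"}
SENSE_KEYS = {0x2: "NOT READY", 0x3: "MEDIUM ERROR", 0x4: "HARDWARE ERROR",
              0x5: "ILLEGAL REQUEST", 0x6: "UNIT ATTENTION",
              0xB: "ABORTED COMMAND"}
ASC_ASCQ = {(0x20, 0x0): "INVALID COMMAND OPERATION CODE",
            (0x24, 0x0): "INVALID FIELD IN CDB"}
OPCODES = {0x1A: "MODE SENSE(6)", 0x4D: "LOG SENSE"}

# Matches both layouts seen above: "Cmd(0x...) 0x4d, ..." and "Cmd 0x1a (0x..., 0)".
PATTERN = re.compile(
    r"(?:Cmd\(0x[0-9a-f]+\) |Cmd )(?P<op>0x[0-9a-f]+).*?"
    r"H:(?P<h>0x[0-9a-f]+) D:(?P<d>0x[0-9a-f]+) P:(?P<p>0x[0-9a-f]+)"
    r".*?sense data: (?P<key>0x[0-9a-f]+) (?P<asc>0x[0-9a-f]+) (?P<ascq>0x[0-9a-f]+)",
    re.IGNORECASE,
)


def decode(line):
    """Return a dict describing one failed-command log line, or None if no match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    op, d, key, asc, ascq = (
        int(m.group(g), 16) for g in ("op", "d", "key", "asc", "ascq")
    )
    return {
        "command": OPCODES.get(op, hex(op)),
        "device_status": DEVICE_STATUS.get(d, hex(d)),
        "sense_key": SENSE_KEYS.get(key, hex(key)),
        "additional_sense": ASC_ASCQ.get((asc, ascq), f"{asc:#x}/{ascq:#x}"),
    }
```

Run against the entries above, this reports LOG SENSE and MODE SENSE(6) commands rejected with CHECK CONDITION / ILLEGAL REQUEST. That combination usually means the device does not support the query it was sent, which is a different kind of event from a lost path or an unreachable target.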
Grant Berezan

For intermittent communication issues with a SAN over iSCSI, I would be more inclined to look at the switches involved. Do you have paired switches, with your communication paths split between vSwitches? Perhaps what you are seeing is an imminent failure in one switch, and when the traffic reaches a certain load, it breaks. The latency messages are common: the more traffic your SAN gets, the higher the I/O latency. Also, have you updated all your VMware hosts to the latest patch version for 5.5?
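
To tell routine latency chatter from the spikes around an outage, the "performance has deteriorated/improved" entries can be pulled out of vmkernel.log and filtered. This is a rough Python sketch under stated assumptions: the function names are made up for illustration, the regular expression is written only against the message format quoted in the question, and the 100 ms threshold is an arbitrary starting point rather than a VMware recommendation.

```python
import re

# Written against the two message variants quoted in the question.
LATENCY = re.compile(
    r"^(?P<ts>\S+) .*Device (?P<dev>\S+) performance has "
    r"(?P<kind>deteriorated|improved)\. "
    r"I/O latency (?:increased|reduced) from (?:average value of )?"
    r"(?P<old>\d+) microseconds to (?P<new>\d+) microseconds"
)


def latency_events(lines):
    """Yield (timestamp, device, kind, old_us, new_us) for each latency message."""
    for line in lines:
        m = LATENCY.search(line)
        if m:
            yield (m.group("ts"), m.group("dev"), m.group("kind"),
                   int(m.group("old")), int(m.group("new")))


def worst_spikes(lines, threshold_us=100_000):
    """Return 'deteriorated' events whose new latency is at or above the threshold.

    threshold_us is an assumed cut-off (100 ms); tune it for your array.
    """
    return [e for e in latency_events(lines)
            if e[2] == "deteriorated" and e[4] >= threshold_us]
```

Usage would look like `worst_spikes(open("vmkernel.log"))`. Grouping the results by device and timestamp shows whether the spikes cluster on one LUN and whether they line up with the time the VMs stopped responding.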
gopher_49

ASKER

My ESXi hosts are running Dell Update 3, which I think is the most current. My switches seem to be fine, and other hosts are not having issues. Each iSCSI port is in its own VLAN with a dedicated vSwitch. I think the MD3200i firmware is old. Also, the array is running RAID 6 and one drive has failed.
ASKER CERTIFIED SOLUTION
Grant Berezan

I'll post the vmkernel logs from the day I had issues. I think they have what you're looking for. Does the vmkernel log file keep entries after a reboot? Is it safe to post them here?
OrcaKnight,

I looked at the managed switch ports that the ESXi host and SAN are plugged into.  I don't see any errors on any of the switch ports.  Which ESXi log should I review to see if the NIC is having issues?
I'm not sure exactly what was wrong, but there was a communication issue between the NIC and the iSCSI network.

Thanks.