Kelly Garcia
VMware lost connectivity to datastore
Hi,
We lost connectivity to an iSCSI datastore last week and I need to go through the log files to find the root cause. Which files should I check?
Thank you in advance.
Regards,
Kelly
Just a single datastore, or all datastores?
How did it come back?
How is the networking configured?
Did you lose network?
ASKER
The network is fine, and I am not sure whether the host lost access to all datastores. One of the SAN nodes (HP LeftHand) did go down, but that node is in a cluster, so all the volumes kept running from the other nodes. We therefore can't understand why we had to rescan. This has happened on a few occasions.
ASKER
The failure happened at approximately 16:00. I have checked vmkwarning.log and it contains these warnings. Please can someone help me understand them? :)
2015-05-12T15:02:36.876Z cpu10:12127002)WARNING: VSCSI: 3481: handle 9200(vscsi0:1):WaitForCIF: Issuing reset; number of CIF:7
2015-05-12T15:02:39.401Z cpu16:11721872)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.6000eb3ee3f726c90000000000000212" state in doubt; requested fast path state update...
2015-05-12T15:02:40.458Z cpu16:11721872)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.6000eb3ee3f726c900000000000001aa" state in doubt; requested fast path state update...
2015-05-12T15:02:40.458Z cpu6:8235)WARNING: HBX: 2829: Reclaiming timed out [HB state abcdef02 offset 3178496 gen 1223 stampUS 16236889309445 uuid 545a54d9-759185a8-5801-f0921c0bb818 jrnl <FB 129348> drv 14.58] on vol 'EDRM-PROD-01' failed: IO wa [0$
2015-05-12T15:02:43.460Z cpu16:11721872)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.6000eb3ee3f726c900000000000001e0" state in doubt; requested fast path state update...
2015-05-12T15:02:43.460Z cpu2:8237)WARNING: HBX: 2829: Reclaiming timed out [HB state abcdef02 offset 3178496 gen 57 stampUS 16236892309432 uuid 545a54d9-759185a8-5801-f0921c0bb818 jrnl <FB 129752> drv 14.58] on vol 'EDRM-PROD-04' failed: IO was [0$
2015-05-12T15:02:44.408Z cpu16:11721872)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.6000eb3ee3f726c900000000000001bc" state in doubt; requested fast path state update...
2015-05-12T15:02:44.421Z cpu16:11721872)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.6000eb3ee3f726c9000000000000029b" state in doubt; requested fast path state update...
2015-05-12T15:02:48.406Z cpu16:11721872)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.6000eb3ee3f726c900000000000001bc" state in doubt; requested fast path state update...
2015-05-12T15:03:05.451Z cpu16:11721872)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.6000eb3ee3f726c9000000000000007a" state in doubt; requested fast path state update...
2015-05-12T15:03:05.451Z cpu19:8236)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.6000eb3ee3f726c9000000000000029b" state in doubt; requested fast path state update...
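To see whether the "state in doubt" warnings hit one LUN or many, the excerpt can be tallied per device. This is only a rough sketch: the function name `state_in_doubt_counts` and the regular expression are my own, written against the line format shown above, and other ESXi builds may format the message differently.

```python
import re
from collections import Counter

# Matches the NMP warning text exactly as it appears in the excerpt above.
DOUBT_RE = re.compile(r'NMP device "(naa\.[0-9a-f]+)" state in doubt')


def state_in_doubt_counts(lines):
    """Count 'state in doubt' warnings per NAA device identifier."""
    counts = Counter()
    for line in lines:
        match = DOUBT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Path taken from the thread; adjust if the log lives elsewhere.
    with open("/var/log/vmkwarning.log") as handle:
        for device, hits in state_in_doubt_counts(handle).most_common():
            print(device, hits)
```

If the warnings are spread across many devices within the same few seconds, as they are in the excerpt, that points at something shared (a path, a node, a switch) rather than a single volume.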
ASKER
The vmkernel.log file shows some errors. Please can someone help me understand what's going on? :(
2015-05-12T15:02:40.458Z cpu16:11721872)ScsiDeviceIO: 2318: Cmd(0x4124057b17c0) 0x28, CmdSN 0x11854d6 from world 8235 to dev "naa.6000eb3ee3f726c900000000000001aa" failed H:0x5 D:0x0 P:0x0 Possible sense data: 0x5 0x20 0x0.
2015-05-12T15:02:40.458Z cpu6:8235)HBX: 555: Reading HB at 3178496 on vol 'EDRM-PROD-01' failed: IO was aborted
2015-05-12T15:02:40.458Z cpu6:8235)WARNING: HBX: 2829: Reclaiming timed out [HB state abcdef02 offset 3178496 gen 1223 stampUS 16236889309445 uuid 545a54d9-759185a8-5801-f0921c0bb818 jrnl <FB 129348> drv 14.58] on vol 'EDRM-PROD-01' failed: IO wa [0$
2015-05-12T15:02:40.510Z cpu11:8309)HBX: 2441: Waiting for timed out [HB state abcdef02 offset 3178496 gen 1223 stampUS 16236889309445 uuid 545a54d9-759185a8-5801-f0921c0bb818 jrnl <FB 129348> drv 14.58] on vol 'EDRM-PROD-01'
2015-05-12T15:02:43.460Z cpu16:11721872)NMP: nmp_ThrottleLogForDevice:2319: Cmd 0x28 (0x412404464140, 8237) to dev "naa.6000eb3ee3f726c900000000000001e0" on path "vmhba33:C1:T5:L0" Failed: H:0x5 D:0x0 P:0x0 Possible sense data: 0x5 0x20 0x0. Act:EVAL
2015-05-12T15:02:43.460Z cpu16:11721872)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.6000eb3ee3f726c900000000000001e0" state in doubt; requested fast path state update...
2015-05-12T15:02:43.460Z cpu16:11721872)ScsiDeviceIO: 2318: Cmd(0x412404464140) 0x28, CmdSN 0xa4f94e from world 8237 to dev "naa.6000eb3ee3f726c900000000000001e0" failed H:0x5 D:0x0 P:0x0 Possible sense data: 0x5 0x20 0x0.
2015-05-12T15:02:43.460Z cpu2:8237)HBX: 555: Reading HB at 3178496 on vol 'EDRM-PROD-04' failed: IO was aborted
2015-05-12T15:02:43.460Z cpu2:8237)WARNING: HBX: 2829: Reclaiming timed out [HB state abcdef02 offset 3178496 gen 57 stampUS 16236892309432 uuid 545a54d9-759185a8-5801-f0921c0bb818 jrnl <FB 129752> drv 14.58] on vol 'EDRM-PROD-04' failed: IO was [0$
2015-05-12T15:02:44.408Z cpu16:11721872)NMP: nmp_ThrottleLogForDevice:2319: Cmd 0x2a (0x412405439d80, 8224) to dev "naa.6000eb3ee3f726c900000000000001bc" on path "vmhba33:C0:T3:L0" Failed: H:0x5 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
2015-05-12T15:02:44.408Z cpu16:11721872)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.6000eb3ee3f726c900000000000001bc" state in doubt; requested fast path state update...
2015-05-12T15:02:44.408Z cpu16:11721872)ScsiDeviceIO: 2318: Cmd(0x412405439d80) 0x2a, CmdSN 0x7cd7cd from world 8224 to dev "naa.6000eb3ee3f726c900000000000001bc" failed H:0x5 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2015-05-12T15:02:44.421Z cpu16:11721872)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.6000eb3ee3f726c9000000000000029b" state in doubt; requested fast path state update...
2015-05-12T15:02:48.406Z cpu16:11721872)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.6000eb3ee3f726c900000000000001bc" state in doubt; requested fast path state update...
2015-05-12T15:02:48.406Z cpu16:11721872)ScsiDeviceIO: 2318: Cmd(0x412445321880) 0x2a, CmdSN 0x7cd7ce from world 8224 to dev "naa.6000eb3ee3f726c900000000000001bc" failed H:0x5 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2015-05-12T15:03:05.451Z cpu16:11721872)NMP: nmp_ThrottleLogForDevice:2319: Cmd 0x28 (0x412442178bc0, 8236) to dev "naa.6000eb3ee3f726c9000000000000007a" on path "vmhba33:C0:T0:L0" Failed: H:0x5 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
2015-05-12T15:03:05.451Z cpu16:11721872)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.6000eb3ee3f726c9000000000000007a" state in doubt; requested fast path state update...
2015-05-12T15:03:05.451Z cpu16:11721872)ScsiDeviceIO: 2318: Cmd(0x412442178bc0) 0x28, CmdSN 0x1635ab2 from world 8236 to dev "naa.6000eb3ee3f726c9000000000000007a" failed H:0x5 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2015-05-12T15:03:05.451Z cpu19:8236)HBX: 555: Reading HB at 3179008 on vol 'EDRM-SYSLOG' failed: IO was aborted
2015-05-12T15:03:05.451Z cpu19:8236)NMP: nmp_ThrottleLogForDevice:2319: Cmd 0x2a (0x41240187e3c0, 9155) to dev "naa.6000eb3ee3f726c9000000000000029b" on path "vmhba33:C1:T7:L0" Failed: H:0x8 D:0x0 P:0x0 Possible sense data: 0x5 0x20 0x0. Act:EVAL
2015-05-12T15:03:05.451Z cpu19:8236)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.6000eb3ee3f726c9000000000000029b" state in doubt; requested fast path state update...
2015-05-12T15:03:05.451Z cpu19:8236)ScsiDeviceIO: 2300: Cmd(0x41240187e3c0) 0x2a, CmdSN 0xc1aa61 from world 9155 to dev "naa.6000eb3ee3f726c9000000000000029b" failed H:0x8 D:0x0 P:0x0
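The `H:`/`D:`/`P:` fields in lines like these are the host, device and plugin status of the failed SCSI command, and the value after `Cmd` is the SCSI opcode. Below is a minimal sketch for decoding them. The host status names follow VMware's published table for interpreting SCSI sense codes as I remember it (only a subset is listed, so please verify against the KB), and `decode` and the regular expression are hypothetical helpers of mine, not anything shipped with ESXi.

```python
import re

# Subset of VMkernel host status codes (assumed from VMware's KB table).
HOST_STATUS = {
    0x0: "OK",
    0x1: "NO_CONNECT",
    0x2: "BUS_BUSY",
    0x3: "TIMEOUT",
    0x5: "ABORT",
    0x8: "RESET",
}

# Standard SCSI opcodes seen in the excerpt.
SCSI_OPCODES = {0x28: "READ(10)", 0x2A: "WRITE(10)"}

# Handles both "Cmd(0x...) 0x28, ..." and "Cmd 0x28 (0x..., ...)" layouts.
LINE_RE = re.compile(
    r'Cmd\S*\s+(?P<op>0x[0-9a-f]+).*?dev "(?P<dev>naa\.[0-9a-f]+)".*?'
    r"H:(?P<h>0x[0-9a-f]+) D:(?P<d>0x[0-9a-f]+) P:(?P<p>0x[0-9a-f]+)"
)


def decode(line):
    """Return the decoded fields of a failed-command log line, or None."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    opcode = int(match.group("op"), 16)
    host = int(match.group("h"), 16)
    return {
        "device": match.group("dev"),
        "opcode": SCSI_OPCODES.get(opcode, hex(opcode)),
        "host_status": HOST_STATUS.get(host, hex(host)),
        "device_status": int(match.group("d"), 16),
        "plugin_status": int(match.group("p"), 16),
    }
```

Run over the excerpt, this reports reads and writes failing with host status ABORT (`H:0x5`) and, in the last two lines, RESET (`H:0x8`), while the device status stays at 0x0. In other words the commands were aborted or reset on the host side rather than rejected by the array with a check condition.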
Looks like it's having issues with the LUN path, or the LUN is missing.
I would look at the logs on the HP LeftHand SAN.
I'd double-check the storage switches as well, just in case. But yes, it seems the LeftHand SAN needs some TLC.
ASKER
I have these errors in vmkernel.log from esx02:
2015-05-12T14:59:10.507Z cpu1:15304547)WARNING: ScsiPath: 4624: Plugin 'NMP' had an error (Failure) while claiming path 'vmhba33:C0:T1:L195'. Skipping the path.
2015-05-12T14:59:10.507Z cpu1:15304547)ScsiClaimrule: 1329: Plugin NMP specified by claimrule 65535 was not able to claim path vmhba33:C0:T1:L195. Busy
2015-05-12T14:59:10.507Z cpu1:15304547)ScsiClaimrule: 1554: Error claiming path vmhba33:C0:T1:L195. Busy.
2015-05-12T14:59:10.507Z cpu1:15304547)WARNING: NMP: nmp_PspClaimPath:144:Claim of path 'vmhba33:C0:T1:L196' by plugin VMW_PSP_RR for device 'naa.6000eb3ee3f726c900000000000000b2' failed. Failure
2015-05-12T14:59:10.507Z cpu1:15304547)WARNING: ScsiPath: 4624: Plugin 'NMP' had an error (Failure) while claiming path 'vmhba33:C0:T1:L196'. Skipping the path.
2015-05-12T14:59:10.507Z cpu1:15304547)ScsiClaimrule: 1329: Plugin NMP specified by claimrule 65535 was not able to claim path vmhba33:C0:T1:L196. Busy
2015-05-12T14:59:10.507Z cpu1:15304547)ScsiClaimrule: 1554: Error claiming path vmhba33:C0:T1:L196. Busy.
2015-05-12T14:59:10.508Z cpu1:15304547)WARNING: NMP: nmp_PspClaimPath:144:Claim of path 'vmhba33:C0:T1:L197' by plugin VMW_PSP_RR for device 'naa.6000eb3ee3f726c900000000000000b2' failed. Failure
2015-05-12T14:59:10.508Z cpu1:15304547)WARNING: ScsiPath: 4624: Plugin 'NMP' had an error (Failure) while claiming path 'vmhba33:C0:T1:L197'. Skipping the path.
2015-05-12T14:59:10.508Z cpu1:15304547)ScsiClaimrule: 1329: Plugin NMP specified by claimrule 65535 was not able to claim path vmhba33:C0:T1:L197. Busy
2015-05-12T14:59:10.508Z cpu1:15304547)ScsiClaimrule: 1554: Error claiming path vmhba33:C0:T1:L197. Busy.
2015-05-12T14:59:10.508Z cpu1:15304547)WARNING: NMP: nmp_PspClaimPath:144:Claim of path 'vmhba33:C0:T1:L198' by plugin VMW_PSP_RR for device 'naa.6000eb3ee3f726c900000000000000b2' failed. Failure
2015-05-12T14:59:10.508Z cpu1:15304547)WARNING: ScsiPath: 4624: Plugin 'NMP' had an error (Failure) while claiming path 'vmhba33:C0:T1:L198'. Skipping the path.
2015-05-12T14:59:10.508Z cpu1:15304547)ScsiClaimrule: 1329: Plugin NMP specified by claimrule 65535 was not able to claim path vmhba33:C0:T1:L198. Busy
2015-05-12T14:59:10.508Z cpu1:15304547)ScsiClaimrule: 1554: Error claiming path vmhba33:C0:T1:L198. Busy.
2015-05-12T14:59:10.508Z cpu1:15304547)WARNING: NMP: nmp_PspClaimPath:144:Claim of path 'vmhba33:C0:T1:L199' by plugin VMW_PSP_RR for device 'naa.6000eb3ee3f726c900000000000000b2' failed. Failure
2015-05-12T14:59:10.508Z cpu1:15304547)WARNING: ScsiPath: 4624: Plugin 'NMP' had an error (Failure) while claiming path 'vmhba33:C0:T1:L199'. Skipping the path.
2015-05-12T14:59:10.508Z cpu1:15304547)ScsiClaimrule: 1329: Plugin NMP specified by claimrule 65535 was not able to claim path vmhba33:C0:T1:L199. Busy
2015-05-12T14:59:10.508Z cpu1:15304547)ScsiClaimrule: 1554: Error claiming path vmhba33:C0:T1:L199. Busy.
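To get an overview of which paths failed to be claimed for which device, the `nmp_PspClaimPath` warnings can be grouped by device. A minimal sketch, assuming the message format shown above; `failed_claims` and the pattern are illustrative names of my own.

```python
import re
from collections import defaultdict

# Matches the nmp_PspClaimPath warning as it appears in the excerpt above.
CLAIM_RE = re.compile(
    r"Claim of path '(?P<path>vmhba\d+:C\d+:T\d+:L\d+)' "
    r"by plugin (?P<psp>\S+) for device '(?P<dev>naa\.[0-9a-f]+)' failed"
)


def failed_claims(lines):
    """Map each device identifier to the list of paths that failed to claim."""
    by_device = defaultdict(list)
    for line in lines:
        match = CLAIM_RE.search(line)
        if match:
            by_device[match.group("dev")].append(match.group("path"))
    return dict(by_device)
```

On the excerpt this lists several consecutive LUN numbers (L196 to L199) on the same target, all for the single device ending in `b2`, which is worth comparing against what the SAN actually presents to this host.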
Do you have multipathing configured?
ASKER
Yes, we have Round Robin configured.
ASKER
Hi,
Please, I need help with this!
Thank you in advance,
Kelly
Is the datastore still missing?
If you check all paths now, are they all active, or do you have any dead paths?
ASKER
Yes, I have a dead path. It should have resolved itself automatically, but it has been like that for days. In the vmkernel log I get this error:
2015-05-12T14:59:10.508Z cpu1:15304547)WARNING: NMP: nmp_PspClaimPath:144:Claim of path 'vmhba33:C0:T1:L197' by plugin VMW_PSP_RR for device 'naa.6000eb3ee3f726c900000000000000b2' failed. Failure
2015-05-12T14:59:10.508Z cpu1:15304547)WARNING: NMP: nmp_PspClaimPath:144:Claim
ASKER
A manual rescan does solve the problem, but we shouldn't have to do this. The paths should be rescanned automatically.
-- /var/log/vmkernel.log: Core VMkernel logs, including device discovery, storage and networking device and driver events, and virtual machine startup.
-- /var/log/vmkwarning.log: A summary of Warning and Alert log messages excerpted from the VMkernel logs.
But I'd definitely check some other logs as well, like syslog and maybe vobd.
More info on logs here
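When going through these files, keep in mind that the timestamps in the excerpts end in `Z`, i.e. they are UTC, so an outage seen at about 16:00 local time may sit at a different hour in the log. A small sketch for pulling out only the lines around the incident follows; `parse_ts` and `lines_in_window` are hypothetical helpers, and the 10-minute window is an arbitrary default.

```python
from datetime import datetime, timedelta, timezone


def parse_ts(line):
    """Parse the leading UTC timestamp of a log line, or return None."""
    stamp = line.split(" ", 1)[0]
    try:
        parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    except ValueError:
        return None
    return parsed.replace(tzinfo=timezone.utc)


def lines_in_window(lines, centre, minutes=10):
    """Yield lines whose timestamp is within +/- `minutes` of `centre`."""
    delta = timedelta(minutes=minutes)
    for line in lines:
        ts = parse_ts(line)
        if ts is not None and abs(ts - centre) <= delta:
            yield line


if __name__ == "__main__":
    # 15:00 UTC on the day of the incident, matching the excerpts in the thread.
    centre = datetime(2015, 5, 12, 15, 0, tzinfo=timezone.utc)
    with open("/var/log/vmkernel.log") as handle:
        for line in lines_in_window(handle, centre):
            print(line, end="")
```

Running the same filter over vmkwarning.log and the other logs mentioned above gives a combined timeline to line up against the LeftHand event log.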