Luca

asked on 

Massive iSCSI Problems Event ID: 24, 7, 153

Hello Board

We currently have an issue that I haven't been able to resolve for three weeks.
Environment:

Physical Server
HPE DL380 Gen10 with 1x 2-port 10GbE HPE 562SFP+ and 1x 2-port 10GbE HPE 562FLR-SFP+, attached via compatible HPE DAC cables to a switch
OS: Windows Server 2016 Std
Software: Veeam Backup and Replication 9.5
Network Configuration: 1x LACP Windows team (one member from each network card's ports) for the management network, 1x LACP Windows team (one member from each network card's ports) for the storage network. All interfaces have jumbo frames configured

Storage Array:
HPE Nimble Storage
Network Configuration: 2x 10GbE ports configured on the storage network

Issue:
During backup, the Nimble array creates a storage snapshot of the volumes that have to be backed up. Veeam then copies the storage snapshot data to the local repository. While copying from a particular storage snapshot, we get a ton of the following Windows events:
  • Target sent an iSCSI PDU with an invalid opcode. Dump data contains the entire iSCSI header. - Error - EventID 24
  • The initiator could not send an iSCSI PDU. Error status is given in the dump data. - Error - EventID 7
  • The initiator could not send an iSCSI PDU. Error status is given in the dump data. - Warning - Event ID 153 --> I assume this happens because the iSCSI connection is lost
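To correlate these with the backup window, the events above can be pulled from the System log in one go. This is a sketch assuming the usual provider name for the Microsoft iSCSI initiator driver, iScsiPrt; adjust if yours differs:

```shell
# PowerShell: collect the iSCSI errors/warnings (Event IDs 24, 7, 153)
# from the System log. 'iScsiPrt' is assumed to be the provider name
# used by the Microsoft iSCSI initiator driver.
Get-WinEvent -FilterHashtable @{
    LogName      = 'System'
    ProviderName = 'iScsiPrt'
    Id           = 24, 7, 153
} | Sort-Object TimeCreated |
    Format-Table TimeCreated, Id, LevelDisplayName, Message -AutoSize
```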

What has been done so far:

Checked jumbo frames
Ping with a packet size of 8972 bytes (jumbo frames) to our storage arrays works without fragmentation --> I assume the configuration is set as it should be

ping -f -l 8972 xxx.xx.xx.xx

Pinging xxx.xx.xx.xx with 8972 bytes of data:
Reply from xxx.xx.xx.xx: bytes=8972 time<1ms TTL=128
Reply from xxx.xx.xx.xx: bytes=8972 time<1ms TTL=128
Reply from xxx.xx.xx.xx: bytes=8972 time<1ms TTL=128
Reply from xxx.xx.xx.xx: bytes=8972 time<1ms TTL=128

Ping statistics for xxx.xx.xx.xx:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 0ms, Average = 0ms
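For anyone wondering where the 8972 comes from: it is the 9000-byte jumbo MTU minus the 20-byte IPv4 header and the 8-byte ICMP header, so an unfragmented reply at this size really does exercise the full 9000-byte path:

```shell
# Largest unfragmented ICMP payload on a 9000-byte jumbo MTU path:
MTU=9000
IPV4_HEADER=20
ICMP_HEADER=8
PAYLOAD=$((MTU - IPV4_HEADER - ICMP_HEADER))
echo "$PAYLOAD"   # 8972
```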

Hardware
  • Replaced both 10GbE adapters in the server
  • Upgraded firmware and drivers to the latest versions
  • Changed switch ports to be sure it's not a misconfiguration or a bug

Software
  • Tested whether the problem is caused by Windows teaming by disabling each team member (on the storage team interface)
  • Destroyed the storage team and tried each interface standalone
  • Uninstalled the Nimble Connection Manager software that configures iSCSI according to Nimble best practices
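After removing the Nimble Connection Manager, it is worth confirming what the plain Microsoft initiator is actually left with. These built-in cmdlets (assuming the standard iSCSI module shipped with Server 2016) show the remaining sessions and connections:

```shell
# PowerShell: inspect the current iSCSI sessions and connections with
# the built-in Microsoft initiator cmdlets (no NCM required).
Get-IscsiSession |
    Format-Table TargetNodeAddress, IsConnected, NumberOfConnections -AutoSize
Get-IscsiConnection |
    Format-Table ConnectionIdentifier, InitiatorAddress, TargetAddress -AutoSize
```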


Opened many support tickets
  • Veeam
  • Microsoft
  • Nimble Storage
  • HPE

Maybe you have had a similar problem in the same or a similar environment.
Regards
Luca
