pchettri asked on

APD Errors on ESX host after upgrade

Recently I upgraded all my ESXi hosts to 6.0 to get the full VDP feature. A week after the upgrade I started noticing VM issues and slowness in data discovery.

VMware support pointed out the log entries .............."naa.60a98000383032754c3f4671786d6339" - failed to issue command due to Not found (APD), try again." on only two iSCSI LUNs out of four. They traced the log back to a couple of days before the upgrade, which makes it confusing whether this happened due to the upgrade or was already there. VMware said it needs to be addressed by NetApp.
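For anyone chasing the same message, the device and path state behind an APD error can be inspected from the ESXi shell, roughly like this (a sketch; the naa ID is the one from our logs, but the adapter and path layout will differ per host):

# Show the state of the affected device (naa ID taken from the log above)
esxcli storage core device list -d naa.60a98000383032754c3f4671786d6339

# List the paths backing the device to see whether any are dead
esxcli storage core path list -d naa.60a98000383032754c3f4671786d6339

# Count APD events recorded in the VMkernel log
grep -c "APD" /var/log/vmkernel.log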

NetApp took a week just to collect logs, and no support person ever called. Today another support engineer called and ran the Interoperability Matrix Tool. When we selected ONTAP 8.2 with ESX 6, the iSCSI protocol was grayed out, and he speculated that ESXi 6 does not support iSCSI on ONTAP 8.2 7-Mode, though the tool showed no error or explanation. He said NetApp has communicated this bug fix to VMware, and until VMware releases the fix it won't work on ONTAP 8.2, so it should be up to VMware. The bug-fix article is vague and does not clearly explain whether the issue is iSCSI-specific, or whether changing the datastore to NFS could fix it:
http://mysupport.netapp.com/NOW/cgi-bin/bugrellist?bugno=911844

Now VMware recommends that, since NetApp has pointed out an issue with 6.0, I should roll back the version. But I am not sure a rollback would really fix it. Plus, I have already done a clean install on one ESXi host to see how a clean installation compares to the issues that showed up after the upgrade.

Another point I came across while doing the clean installation on one ESXi host (to rule out the upgrade as the cause) was the virtual network setup. It looks like the remote support vendor had set up a multi-homed virtual switch: two iSCSI VMkernel storage ports on the same network subnet (with the active/unused NIC order reversed between them), a management network and VM network sharing a subnet different from the storage subnet, and one additional VLAN for voice, all over two 10 GB NICs. MTU on the storage network is set to 9000.
I had some issues discovering the iSCSI LUNs after adding the adapter. I had to create a separate vSwitch for storage using one of the 10 G adapters, and once the storage was discovered I changed it back to match the other virtual switches. This adds another confusion: if the multi-homed setup is the cause, why did the problem only start showing after six months, at the time of the upgrade? The log on the clean-installed host still shows errors on the iSCSI storage network, not exactly the APD error, but a bunch of iSCSI errors.
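(For anyone reproducing this, the rediscovery steps after changing the vSwitch layout can be run from the ESXi shell roughly as below; a sketch, where vmhba33 stands in for the software iSCSI adapter name on your host.)

# Find the software iSCSI adapter name (vmhba33 below is an example)
esxcli iscsi adapter list

# Confirm the dynamic discovery target survived the vSwitch change
esxcli iscsi adapter discovery sendtarget list -A vmhba33

# Rescan the adapter so the LUNs show up again
esxcli storage core adapter rescan -A vmhba33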

Has anyone experienced a similar issue with an ESXi upgrade using a NetApp FAS2554 on ONTAP 7-Mode?
ASKER CERTIFIED SOLUTION
Andrew Hancock (VMware vExpert PRO / EE Fellow / British Beekeeper)
pchettri (ASKER)
I have exactly the recommended setup suggested in your article, and it was working for the last six months until I upgraded. The only strange thing is that VMware found the log entries from a week before the upgrade.

I have two NICs bonded together in the vSwitch, which is also bonded with LACP on the physical Brocade switch.
The vSwitch has three VMkernel ports:

Two VMkernel ports with iSCSI and vMotion enabled, set up as active/unused mirror images of each other, on the same network subnet (172.0.0.0)
One standard VMkernel port (management network) on a different IP subnet (10.0.0.0)
VM network on the same subnet (10.0.0.0)

All of them are created on a single vSwitch with two adapters, following the article. The iSCSI ports on the same network are bound together, and the rest are on different networks. The only networks sharing a subnet without binding are the VM network and the management network; the binding and failover order can be double-checked as in the sketch below.
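That layout can be double-checked from the ESXi shell, roughly like this (a sketch; vmhba33 and the "iSCSI1" port group name are placeholders, not my actual names):

# Subnets of all VMkernel interfaces, to confirm which share 172.0.0.0
esxcli network ip interface ipv4 get

# Which vmknics are bound to the software iSCSI adapter
esxcli iscsi networkportal list -A vmhba33

# Active/unused NIC order on a given port group
esxcli network vswitch standard portgroup policy failover get -p "iSCSI1"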


Configurations with more than one vmknic interface on the same IP subnet should be avoided, unless the vmknics on the same subnet are bound together using iSCSI port-binding or configured for Multi-NIC vMotion.
For more information about Multiple-NIC vMotion, see Multiple-NIC vMotion in vSphere 5 (2007467).
LACP is NOT supported on ESXi with Standard vSwitches.

We have Data ONTAP 8.2.3 7-Mode and Data ONTAP 8.2.2 7-Mode, in use with ESXi 6.0 with no issues.
Are you using iSCSI or NFS on ONTAP 8.2.3 7-Mode? NetApp customer support made their determination only on the basis of the tool output I have attached. The only thing it shows is the iSCSI protocol grayed out when I select ESX 6 with ONTAP 8.2.2.
ontal.jpg
We are using iSCSI and NFS, but most of our clients are now favouring NFS. Based on the NetApp FAS design, NFS also has lower overhead, because there are fewer "stacks" than with iSCSI, especially now that there are VAAI plugins for NFS.

The FAS filers were originally developed around NFS, so there are fewer layers.
Hi Andrew,

Does an NFS datastore support MTU 9000 transfers?
Yes, we use it with MTU 9000 (Jumbo Frames).

Jumbo Frames are enabled on the VMkernel port groups and the storage switches, and we run two different VLANs (VIFs) into our filer heads: a VLAN for iSCSI and a VLAN for NFS.

So the storage protocols are isolated across the switches.
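If it helps, this is roughly how the Jumbo Frames path can be verified end to end (a sketch; vSwitch1, vmk1 and the filer address are placeholders, adjust to your names):

# MTU must match everywhere: vSwitch, vmknic, physical switch, filer
esxcli network vswitch standard set -v vSwitch1 -m 9000
esxcli network ip interface set -i vmk1 -m 9000

# Don't-fragment ping at the largest payload that fits a 9000-byte frame
# (8972 = 9000 - 20 bytes IP header - 8 bytes ICMP header)
vmkping -d -s 8972 -I vmk1 <filer_ip>

If the vmkping fails at 8972 but works at the default size, some hop in the path is not passing Jumbo Frames.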

We used iSCSI with NetApp filers for years, since 2004, back when NFS licenses had to be paid for, and we have since been switching to NFS.

Not many iSCSI clients left now...
I was going through the article

HOW TO: Add an iSCSI Software Adaptor and Create an iSCSI Multipath Network in VMware vSphere Hypervisor ESXi 5.0

and I noticed the "Bind with VMkernel network adaptor" dialog. One of my hosts does not have a NIC added in this dialog box. Would that be an issue for storage?
If you have not bound iSCSI to a network interface, the interface cannot be used for iSCSI traffic.

So if using multipath, it will not function as expected.
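If the NIC is missing from that dialog, the binding can also be done from the ESXi shell, roughly like this (a sketch; vmhba33, vmk1 and vmk2 are example names, check your host first):

# Find the software iSCSI adapter name
esxcli iscsi adapter list

# Bind each iSCSI VMkernel port to it (port binding requires each
# vmknic to have exactly one active uplink, as in the active/unused setup)
esxcli iscsi networkportal add -A vmhba33 -n vmk1
esxcli iscsi networkportal add -A vmhba33 -n vmk2

# Confirm both portals are bound, then rescan for the LUNs
esxcli iscsi networkportal list -A vmhba33
esxcli storage core adapter rescan -A vmhba33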