Asked by danbrown_

Multipathing in VMware kills connection to datastores

Hello Experts - I recently upgraded to ESX/vCenter 5.1 and I'm trying to add a second NIC to my iSCSI vSwitch in VMware.  Whenever I enable the second NIC, I lose connectivity to three of my six datastores within five minutes.  My environment consists of three ESX 5.1 hosts, vCenter 5.1, a NetApp FAS2240-4 SAN, and an HP 2920 switch.  After speaking with support engineers from both companies, I believe the problem relates to load balancing.  I have not been able to determine the answers to certain questions:

1) Which SATP software should VMware be using to connect to the NetApp?  Right now it's showing up as VMW_SATP_DEFAULT_AA, which I am told is the generic software used when VMware doesn't know what kind of SAN is on the other end.

2) Is the FAS2240-4 able to use ALUA?  (I don't believe it is capable.)

3) What Load Balancing method should I be using and where should it be set?

For number three I have been told different things by different techs.  Initially we configured the VMware datastores to use round robin and also set the ifgroups on the NetApp to use round robin, but I was later told that this should only be set on the VMware side, not the NetApp.  The NetApp tech who assisted in the initial setup seemed to think both the NetApp and the datastores needed to be set up to use round robin, so that is how we did it.  As much as I'd like to just try disabling load balancing on the NetApp, it appears that the ifgroups can't be modified once they are created...true?

I've got so much conflicting info here that I'd like to try and get a consensus on what the actual best practices are for my particular configuration.  I'd really appreciate any advice on how to get this going.  I've been through 4 VMware techs so far without any resolution.
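
For anyone comparing notes on question 1, here is a minimal Python sketch that pulls the SATP and path selection policy for each device out of text shaped like the output of `esxcli storage nmp device list`. The sample text, including the device name, is made up for illustration, and the field labels are an assumption about how ESXi 5.x prints them, so check them against real output from your own host.

```python
def parse_nmp_device_list(text):
    """Parse `esxcli storage nmp device list` style text into {device: {field: value}}."""
    devices = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank separator lines
        if not line[0].isspace():
            # An unindented line starts a new device block
            current = line.strip()
            devices[current] = {}
        elif current is not None and ":" in line:
            # Indented "Field: value" line; split on the first colon only
            key, _, value = line.strip().partition(":")
            devices[current][key.strip()] = value.strip()
    return devices


# Hypothetical sample, not captured from a real host
SAMPLE = """\
naa.60a98000example0001
   Device Display Name: NETAPP iSCSI Disk (naa.60a98000example0001)
   Storage Array Type: VMW_SATP_DEFAULT_AA
   Path Selection Policy: VMW_PSP_RR
   Working Paths: vmhba33:C0:T0:L0, vmhba33:C1:T0:L0
"""

for name, fields in parse_nmp_device_list(SAMPLE).items():
    print(name, fields["Storage Array Type"], fields["Path Selection Policy"])
```

Running this against saved output from each host makes it easy to confirm that all three hosts report the same SATP and policy for every datastore device.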
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)

Round Robin, with single IP Address on the NetApp iSCSI VIF

and multi path setup as per my EE Article

HOW TO: Add an iSCSI Software Adaptor and Create an iSCSI Multipath Network in VMware vSphere Hypervisor ESXi 5.0

and Jumbo Frames

HOW TO: Enable Jumbo Frames on a VMware vSphere Hypervisor (ESXi 5.0) host server using the VMware vSphere Client

That is how we have our filers configured.

Also install Virtual Storage Console because it will configure multipathing and iSCSI settings for you to NetApp preferred values.
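
If jumbo frames are enabled, it is worth confirming that a full-size frame actually crosses the switch unfragmented. The Python sketch below works out the largest ping payload that fits in a given MTU (the MTU minus a 20-byte IPv4 header and an 8-byte ICMP header) and builds the matching `vmkping` command line. The target address is a placeholder, and the sketch assumes IPv4 with no IP options.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def max_ping_payload(mtu):
    """Largest ICMP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


def vmkping_command(target_ip, mtu=9000):
    """Build a vmkping command: -d sets don't-fragment, -s sets the payload size."""
    return "vmkping -d -s {} {}".format(max_ping_payload(mtu), target_ip)


# 192.0.2.10 is a documentation address standing in for the filer's iSCSI IP
print(vmkping_command("192.0.2.10"))
```

If the resulting ping fails while a default-size ping succeeds, some hop between the VMkernel port and the filer is probably not passing jumbo frames.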
Personally, I would not use load balancing.  I would have multipathing for failover, but I'd turn off round-robin.

One of our techs went to VMworld a few years ago and came back with that recommendation.
danbrown_ (ASKER)


The VMware tech recommended using the Fixed path selection policy, but all of my ifgroups on the NetApp are set to use round robin.  How can I change this on the NetApp without recreating the ifgroups?
You will need to re-create the ifgroups and IP Address.

e.g. rdfile /etc/rc

and wdfile /etc/rc

or use Oncommand to edit the network interfaces.
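
To give a feel for what the re-created entries in /etc/rc might look like, here is a small Python sketch that prints the two lines involved for one interface group. It assumes Data ONTAP 7-Mode syntax (`ifgrp create` and `ifconfig`); the ifgrp name, port names, and addresses are hypothetical, so compare against your existing /etc/rc before changing anything.

```python
def ifgrp_rc_lines(name, ports, ip, netmask, mode="lacp", balance="ip",
                   mtu=None, partner=None):
    """Return /etc/rc style lines for one multimode or LACP interface group.

    Assumed 7-Mode syntax; the -b load-balancing option is not used for
    single-mode groups, so this helper only covers 'multi' and 'lacp'.
    """
    if mode not in ("multi", "lacp"):
        raise ValueError("this sketch only handles 'multi' and 'lacp' groups")
    create = "ifgrp create {} {} -b {} {}".format(mode, name, balance, " ".join(ports))
    config = "ifconfig {} {} netmask {}".format(name, ip, netmask)
    if mtu is not None:
        config += " mtusize {}".format(mtu)
    if partner is not None:
        config += " partner {}".format(partner)
    return [create, config]


# Hypothetical names and addresses, for illustration only
for line in ifgrp_rc_lines("ifgrp_iscsi", ["e0a", "e0b", "e0c", "e0d"],
                           "192.0.2.10", "255.255.255.0",
                           mtu=9000, partner="ifgrp_iscsi"):
    print(line)
```

Changing the `balance` argument from `rr` to `ip` in such a line is the kind of edit being described here; the group still has to be destroyed and created again for it to take effect.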
@Andrew - I went through your article and that is exactly how we had things configured in VMware and on the NetApp.  The problem is that when I do that, I lose connectivity to the datastores.
do you have a single IP specified for iSCSI connections ?

do you have a single IP Address on the NetApp VIF (ifgrp) ?

did you apply the recommended values for iSCSI, or via Storage Console for MPIO ?
do you have a single IP specified for iSCSI connections ?
Here is how my vSwitch looks in VMware (one NIC is disabled due to the problem):
User generated image

 do you have a single IP Address on the NetApp VIF (ifgrp) ?
Yes, here is a screenshot of the config
User generated image
User generated image
 did you apply the recommended values for iSCSI, or via Storage Console for MPIO ?
You recommended using Round Robin, VMware says to use Fixed.  Using Round Robin I'm losing connectivity to the datastores.
You have TWO IP Addresses specified for iSCSI on the filer!

We do not do this, we use one for iSCSI (and two for NFS)

and we ensure that the IP Address is trunked across all four ports on the Filer via LACP on the physical switches.

also make sure that your physical network ports, are standard ports, not trunked, LACP etc

We chose to apply the configuration via Storage Console because it alerts you if the settings are not correct and applies them; you then reboot the server.
OK, let me restate this so I am sure I understand your recommendation.  I should remove the existing ifgroups from both Netapp heads.  I should then create just one ifgroup containing all four NICs, one for the top controller and one for the bottom controller.

I don't understand your next recommendation.  First you say use LACP for the filer ports on the physical switch (HP 2920).  The next line says make sure the ports are standard and not trunked or using LACP...which is correct?
Filer

Trunk Four Physical Ports (if you have LACP use it!)

Hosts (ESXi)

Two standard access ports, no trunk, no LACP

Setup as per my Article.

Work with a single Controller first,

Make sure Partner Addresses are specified for correct take over and give back of both controllers.
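
As a rough way to see why the number of iSCSI IP addresses on the filer matters for this layout, the following Python sketch enumerates the paths a host would expect per LUN. It assumes software iSCSI with port binding, where each bound VMkernel port logs in to each target portal; the port names and addresses are hypothetical.

```python
from itertools import product


def expected_paths(bound_vmk_ports, target_portals):
    """List every (VMkernel port, target portal) pair, one per expected path."""
    return list(product(bound_vmk_ports, target_portals))


vmk_ports = ["vmk1", "vmk2"]  # hypothetical bound VMkernel ports

# One iSCSI IP on the filer ifgrp: two paths per LUN
print(len(expected_paths(vmk_ports, ["192.0.2.10"])))

# Two iSCSI IPs on the filer: four paths per LUN
print(len(expected_paths(vmk_ports, ["192.0.2.10", "192.0.2.11"])))
```

Comparing this expected count with the paths the host actually shows for a datastore is a quick way to spot a portal that one of the NICs cannot reach.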
It turned out most of this was not necessary.  Here is a screenshot of how my vSwitches are configured now:

User generated image
And the NetApp:
netapp.jpg
I've requested that this question be closed as follows:

Accepted answer: 0 points for danbrown_'s comment #a40355179

for the following reason:

Found own solution
ASKER CERTIFIED SOLUTION
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)
Except my setup does not work when configured using best practices; this at least works.
Your configuration must be incorrect. You only need a single IP Address per controller (not two!)

and I'm actually going to object, because you have trunked ALL four NICs on the NetApp as per my post.

Have you had your configuration of the NetApp and VMware "rubber stamped" by Professional Services, e.g. NetApp Vendor Engineers ?

or did you create it yourself ?
I actually left the VIFs as they were but changed the IP address on two of them (one top and one bottom).  However, you've always been a real help in the past, so if you want the points they're all yours!  Thanks for all of your help, very much appreciated.
My setup did not work in this configuration, but Andrew has described the accepted best practice in detail, which is how the initial setup should be configured.
Dan

Unless you have a reason to use iSCSI, NetApp recommends and favours NFS these days. NFS has less overhead and now performs better than iSCSI!

NetApp Filers were originally built around NFS, and now we have the VAAI NFS Plugin from NetApp we can do Thick on NFS, and datamoves based on snapshots, so clone/copies are faster and handled by the NetApp Filer, and not the ESXi server!

It's worth considering, as you do not have to worry about LUNs, LUN Reserves, Snapshot Reserves etc

for VMware vSphere

We are moving Customers from iSCSI to NFS on NetApp!
Tempting, but after all the time it took to get set up using iSCSI (which the NetApp engineer who assisted with the setup actually recommended), I'm going to stick with it for now.  Maybe a slow transition over to NFS as things get migrated, since it supports both.  Thanks again Andrew!
If you've got licenses for both, and the time, try it!

I'm always here to help!!!

All the best

Andy