Multipathing in VMware kills connection to datastores

Hello Experts - I recently upgraded to ESX/vCenter 5.1 and I'm trying to add a second NIC to my iSCSI vSwitch in VMware. Whenever I enable the second NIC, I lose connectivity to three of my six datastores within five minutes. My environment consists of three ESX 5.1 hosts, vCenter 5.1, a NetApp FAS2240-4 SAN, and an HP 2920 switch. After speaking with support engineers from both companies, I believe the problem relates to load balancing. I have not been able to determine the answers to these questions:

1) Which SATP software should VMware be using to connect to the NetApp? Right now it's showing up as VMW_SATP_DEFAULT_AA, which I am told is generic software used when VMware doesn't know what kind of SAN is on the other end.

2) Is the FAS2240-4 able to use ALUA? (I don't believe it is capable.)

3) What Load Balancing method should I be using and where should it be set?

For number three I have been told different things by different techs. Initially we configured the VMware datastores to use round robin and also set the ifgroups on the NetApp to use round robin, but I was later told that this should only be set on the VMware side, not the NetApp. The NetApp tech who assisted with the initial setup seemed to think both the NetApp and the datastores needed to be set up to use round robin, so that is how we did it. As much as I'd like to just try disabling load balancing on the NetApp, it appears that the ifgroups can't be modified once they are created... true?

I've got so much conflicting info here that I'd like to try and get a consensus on what the actual best practices are for my particular configuration.  I'd really appreciate any advice on how to get this going.  I've been through 4 VMware techs so far without any resolution.
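
One way to see which SATP and PSP each LUN has actually been claimed by is to run `esxcli storage nmp device list` on a host and summarize the result. The Python sketch below parses text in that general layout. It is only a sketch under stated assumptions: the sample text is made up (the device IDs and values are illustrative, not taken from this environment), the comma-separated "Working Paths" layout is assumed, and the helper names are hypothetical.

```python
# Illustrative sample only: device IDs and values are invented for this sketch,
# laid out to resemble `esxcli storage nmp device list` output.
SAMPLE = """\
naa.60a98000aaaaaaaaaaaaaaaaaaaaaaaa
   Device Display Name: NETAPP iSCSI Disk (naa.60a98000aaaaaaaaaaaaaaaaaaaaaaaa)
   Storage Array Type: VMW_SATP_DEFAULT_AA
   Path Selection Policy: VMW_PSP_RR
   Working Paths: vmhba33:C0:T0:L0, vmhba33:C1:T0:L0
naa.60a98000bbbbbbbbbbbbbbbbbbbbbbbb
   Device Display Name: NETAPP iSCSI Disk (naa.60a98000bbbbbbbbbbbbbbbbbbbbbbbb)
   Storage Array Type: VMW_SATP_ALUA
   Path Selection Policy: VMW_PSP_FIXED
   Working Paths: vmhba33:C0:T1:L1
"""


def parse_nmp_devices(text):
    """Return {device_id: {field: value}} from device-list style text."""
    devices = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # An unindented line starts a new device record.
            current = devices.setdefault(line.strip(), {})
        elif current is not None and ":" in line:
            # Split on the first colon only; values may contain colons.
            key, _, value = line.strip().partition(":")
            current[key.strip()] = value.strip()
    return devices


def summarize(devices):
    """Return (device, SATP, PSP, working path count) tuples."""
    rows = []
    for dev, fields in devices.items():
        raw = fields.get("Working Paths", "")
        paths = [p.strip() for p in raw.split(",") if p.strip()]
        rows.append((dev,
                     fields.get("Storage Array Type"),
                     fields.get("Path Selection Policy"),
                     len(paths)))
    return rows


if __name__ == "__main__":
    for row in summarize(parse_nmp_devices(SAMPLE)):
        print(row)
```

A LUN that reports only one working path after the second NIC is enabled would be a hint that the second path never logged in, which is worth checking before changing the load balancing policy.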
danbrown_IT ManagerAsked:

Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
Round Robin, with single IP Address on the NetApp iSCSI VIF

and multi path setup as per my EE Article

HOW TO: Add an iSCSI Software Adaptor and Create an iSCSI Multipath Network in VMware vSphere Hypervisor ESXi 5.0

and Jumbo Frames

HOW TO: Enable Jumbo Frames on a VMware vSphere Hypervisor (ESXi 5.0) host server using the VMware vSphere Client

That's how we have our filers configured.

Also install Virtual Storage Console because it will configure multipathing and iSCSI settings for you to NetApp preferred values.
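
For readers who want to sanity-check a planned layout before touching a host, here is a rough Python sketch of two rules that iSCSI port binding and jumbo frames depend on: each bound VMkernel port should have exactly one active uplink and no standby uplinks, and the VMkernel MTU must not exceed the smallest MTU along the path (vSwitch, physical switch, filer). The `VmkPort` class, the function name, and the NIC names are hypothetical; nothing here queries a real host.

```python
from dataclasses import dataclass, field


@dataclass
class VmkPort:
    """Hypothetical description of one iSCSI VMkernel port."""
    name: str
    mtu: int
    active: list
    standby: list = field(default_factory=list)


def check_port_binding(ports, path_mtus):
    """Return a list of human-readable problems; empty means the plan looks sane.

    path_mtus: MTU of every hop the iSCSI traffic crosses
               (vSwitch, physical switch ports, filer interface).
    """
    problems = []
    smallest = min(path_mtus)
    for p in ports:
        if len(p.active) != 1:
            problems.append(
                f"{p.name}: expected exactly 1 active uplink, found {len(p.active)}")
        if p.standby:
            problems.append(
                f"{p.name}: standby uplinks are not allowed with port binding: {p.standby}")
        if p.mtu > smallest:
            problems.append(
                f"{p.name}: MTU {p.mtu} exceeds smallest MTU on the path ({smallest})")
    return problems


if __name__ == "__main__":
    plan = [
        VmkPort("vmk1", 9000, active=["vmnic2"]),
        VmkPort("vmk2", 9000, active=["vmnic3"], standby=["vmnic2"]),
    ]
    for problem in check_port_binding(plan, path_mtus=[9000, 1500, 9000]):
        print(problem)
```

In this made-up plan, vmk2 is flagged for its standby uplink and both ports are flagged because one hop is still at MTU 1500. A jumbo frame mismatch of that kind tends to show up as sessions that connect but then stall under load.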
0
asavenerCommented:
Personally, I would not use load balancing.  I would have multipathing for failover, but I'd turn off round-robin.

One of our techs went to VMworld a few years ago and came back with that recommendation.
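
To make the difference between the two approaches concrete, here is a toy Python model of a Fixed policy and a Round Robin policy choosing a path for each I/O. It is a simplified illustration, not VMware's implementation: the path names are hypothetical, and the 1000 I/Os per path switch is used on the assumption that it matches the usual Round Robin default.

```python
from collections import Counter


def fixed(paths, preferred):
    """Always use the preferred path while it is alive, else the first live path."""
    def pick(io_index, alive):
        candidates = [preferred] + [p for p in paths if p != preferred]
        for p in candidates:
            if p in alive:
                return p
        raise RuntimeError("no live path")
    return pick


def round_robin(paths, ios_per_path=1000):
    """Rotate across live paths, switching after ios_per_path I/Os."""
    def pick(io_index, alive):
        live = [p for p in paths if p in alive]
        if not live:
            raise RuntimeError("no live path")
        return live[(io_index // ios_per_path) % len(live)]
    return pick


def run(pick, n_ios, alive):
    """Count how many I/Os each path carries."""
    return Counter(pick(i, alive) for i in range(n_ios))


if __name__ == "__main__":
    paths = ["vmnic2", "vmnic3"]
    print("fixed, both up:   ", run(fixed(paths, "vmnic2"), 4000, set(paths)))
    print("rr, both up:      ", run(round_robin(paths), 4000, set(paths)))
    print("fixed, vmnic2 down:", run(fixed(paths, "vmnic2"), 4000, {"vmnic3"}))
    print("rr, vmnic2 down:   ", run(round_robin(paths), 4000, {"vmnic3"}))
```

In this model both policies fail over to the surviving path. The only difference is that Round Robin spreads I/O across both paths while they are healthy, which is the behavior being argued for or against in this thread.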
0
danbrown_IT ManagerAuthor Commented:
The VMware tech recommended using the Fixed Path Selection but all of my ifgroups on the Netapp are set to use round robin.  How can I change this on the Netapp without recreating the ifgroups?
0
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
You will need to re-create the ifgroups and IP Address.

e.g. rdfile /etc/rc

and wdfile /etc/rc

or use Oncommand to edit the network interfaces.
0
danbrown_IT ManagerAuthor Commented:
@Andrew - I went through your article and that is exactly how we had things configured in VMware and on the Netapp.  The problem is when doing that I lose connectivity to the datastores.
0
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
Do you have a single IP specified for iSCSI connections?

Do you have a single IP address on the NetApp VIF (ifgrp)?

Did you apply the recommended values for iSCSI, or via Storage Console for MPIO?
0
danbrown_IT ManagerAuthor Commented:
Do you have a single IP specified for iSCSI connections?
Here is how my vSwitch looks in VMware (one NIC is disabled due to the problem):
[screenshot: vswitch]

Do you have a single IP address on the NetApp VIF (ifgrp)?
Yes, here is a screenshot of the config:
[screenshots: ifgroup 1-2, ifgroup 3-4]

Did you apply the recommended values for iSCSI, or via Storage Console for MPIO?
You recommended using Round Robin; VMware says to use Fixed. Using Round Robin, I'm losing connectivity to the datastores.
0
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
You have TWO IP Addresses specified for iSCSI on the filer!

We do not do this, we use one for iSCSI (and two for NFS)

and we ensure that the IP Address is trunked across all four ports on the Filer via LACP on the physical switches.

also make sure that your physical network ports, are standard ports, not trunked, LACP etc

We chose to apply the configuration via Storage Console, because it alerts you if the settings are not correct; apply them, and then reboot the server.
0
danbrown_IT ManagerAuthor Commented:
OK, let me restate this so I am sure I understand your recommendation.  I should remove the existing ifgroups from both Netapp heads.  I should then create just one ifgroup containing all four NICs, one for the top controller and one for the bottom controller.

I don't understand your next recommendation.  First you say use LACP for the filer ports on the physical switch (HP 2920).  The next line says make sure the ports are standard and not trunked or using LACP...which is correct?
0
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
Filer

Trunk Four Physical Ports (if you have LACP use it!)

Hosts (ESXi)

Two standard access ports, no trunk, no LACP

Setup as per my Article.

Work with a single Controller first,

Make sure Partner Addresses are specified for correct take over and give back of both controllers.
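
As a rough check on what the host should see with this layout, the number of iSCSI paths per LUN can be estimated as bound VMkernel ports multiplied by target portal addresses, assuming every bound port can reach every portal. The short Python sketch below enumerates those pairs. The function name is hypothetical and the addresses are documentation-range placeholders, not values from this thread.

```python
from itertools import product


def expected_paths(bound_vmks, target_portals):
    """Return every (VMkernel port, target portal) pair, one per expected path.

    Assumes each bound VMkernel port can reach each portal address.
    """
    return list(product(bound_vmks, target_portals))


if __name__ == "__main__":
    vmks = ["vmk1", "vmk2"]                       # two bound iSCSI ports on the host
    single_ip = ["192.0.2.10"]                    # placeholder: one iSCSI IP on the filer
    two_ips = ["192.0.2.10", "192.0.2.11"]        # placeholder: two iSCSI IPs on the filer

    print(len(expected_paths(vmks, single_ip)), "paths with one target IP")
    print(len(expected_paths(vmks, two_ips)), "paths with two target IPs")
```

If the path count reported by the host is lower than this estimate after the second NIC is enabled, some of the sessions are not being established, which narrows down where to look.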
0
danbrown_IT ManagerAuthor Commented:
It turned out most of this was not necessary.  Here is a screenshot of how my vSwitches are configured now:

[screenshot: vswitches]
And the NetApp:
[screenshot: netapp.jpg]
0
danbrown_IT ManagerAuthor Commented:
I've requested that this question be closed as follows:

Accepted answer: 0 points for danbrown_'s comment #a40355179

for the following reason:

Found own solution
0
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
That's not best practice. You have no failover. If you look at the VMware documentation, my EE article, and the NetApp documentation, it's not supported and not best practice.
0

danbrown_IT ManagerAuthor Commented:
Except my setup does not work when configured using best practices; this at least works.
0
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
Your configuration must be incorrect; you only need a single IP address per controller (not two!).

And I'm actually going to object, because you have trunked ALL four NICs on the NetApp as per my post.

Have you had your configuration of the NetApp and VMware "rubber stamped" by Professional Services, e.g. NetApp vendor engineers?

Or did you create it yourself?
0
danbrown_IT ManagerAuthor Commented:
I actually left the VIFs as they were but changed the IP address on two of them (one top and one bottom).  However you've always been a real help in the past so if you want the points they're all yours!  Thanks for all of your help, very much appreciated.
0
danbrown_IT ManagerAuthor Commented:
My setup did not work in this configuration but Andrew has described the accepted best practice in detail which is how the initial setup should be configured.
0
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
Dan

Unless you have a reason to use iSCSI, NetApp recommends NFS and favours NFS these days, NFS has less overhead and performs better now than iSCSI!

NetApp Filers were originally built around NFS, and now we have the VAAI NFS Plugin from NetApp we can do Thick on NFS, and datamoves based on snapshots, so clone/copies are faster and handled by the NetApp Filer, and not the ESXi server!

It's worth considering, as you do not have to worry about LUNs, LUN Reserves, Snapshot Reserves etc

for VMware vSphere

We are moving Customers from iSCSI to NFS on NetApp!
0
danbrown_IT ManagerAuthor Commented:
Tempting, but after all the time it took to get set up using iSCSI (which the NetApp engineer who assisted with the setup actually recommended), I'm going to stick with it for now. Maybe a slow transition over to NFS as things get migrated, since it supports both. Thanks again Andrew!
0
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
If you've got licenses for both, and the time, try it!

I'm always here to help!!!

All the best

Andy
0