VMware Networking with NetApp NFS

We used to have our NetApp connected via iSCSI, and as you can see below, vSwitch1 had two VMkernel ports connected. We have since moved away from iSCSI and now use only NFS. We bought new servers and want to set up networking focused on NFS. The new servers have 7 available NICs.

Because each vSwitch is NIC teamed and connected to two different switches, we are not able to take advantage of link aggregation in terms of increased bandwidth.

Can you please recommend how we can change this networking strategy to keep our redundancy across different switches while also taking advantage of link aggregation (increased bandwidth)?

If we're no longer using iSCSI and only using NFS, do we still need vSwitch1 to connect to our NFS-based NetApp? We're trying to see if we can allocate more NICs to another vSwitch.
Asked by pzozulka
 
Paul Solovyovsky (Senior IT Advisor) commented:
Keep in mind that there is no MPIO with NFS, so the sessions are one to one. Even if you have link aggregation, if you are using one IP on the NetApp you'll never go over a single session. To mitigate this, create IP aliases on the NetApp and use a different target IP for each datastore.
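
For example, a rough 7-mode sketch -- the interface group name, IPs, and netmask below are placeholders for whatever your filer actually uses -- would add alias IPs to the NFS-serving vif and repeat the same lines in /etc/rc so they survive a reboot:

# add extra IPs to the existing interface group (placeholders shown)
ifconfig ifgrp3 alias 10.0.128.56 netmask 255.255.255.224
ifconfig ifgrp3 alias 10.0.128.57 netmask 255.255.255.224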

You can still use your VMkernel ports for iSCSI (connected to the SAN side of the NetApp vif).

If you're running Data ONTAP 8.1.2 or higher, you can install the NFS plug-in (through the VSC), which enables hardware acceleration (VAAI) on the datastores.
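
If you'd rather not push it through VSC, the plug-in can also be installed as a VIB from the ESXi shell; a rough sketch, where the bundle path and file name are placeholders for whatever NetApp ships for your release:

# install the NFS plug-in bundle copied to a datastore, then reboot the host if required
esxcli software vib install -d /vmfs/volumes/datastore1/NetAppNasPlugin.zip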

You can use two vmnic ports per vSwitch with a single VMkernel IP, as long as there are multiple target IPs for the datastores.

How are the NetApp vifs set up: LACP, multi, or single? On another note, if you have Enterprise Plus you can use LACP on the vSphere side.
 
dipopo commented:
Yes, you can rename the iSCSI VMkernel port for NFS. Bottom line: you still need that VMkernel port on vSwitch1.

I can see a service console port, so you are running ESX rather than ESXi. If you upgrade to ESXi, don't be alarmed when this is removed.
 
dipopo commented:
As I understand the above and your previous questions:

These are vStandard switches
No LACP Support
Each vSwitch has 2 uplinks across 2 switches

So the only thing you can really do is load balance across the two uplinks. You can't use IP hash without LACP, so use "Route based on originating virtual port ID", which is better than MAC-based routing. This will distribute VM traffic fairly evenly among the participating uplinks (see the example after the link below).

http://www.vmware.com/files/pdf/virtual_networking_concepts.pdf
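
If you want to set that policy from the command line on ESXi 5.x instead of the vSphere Client, something like this should do it (the vSwitch and vmnic names are placeholders for your environment):

# use virtual port ID load balancing with both uplinks active
esxcli network vswitch standard policy failover set --vswitch-name=vSwitch1 --load-balancing=portid
esxcli network vswitch standard policy failover set --vswitch-name=vSwitch1 --active-uplinks=vmnic2,vmnic3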

Else you will need to stack your switches!



http://en.wikipedia.org/wiki/Stackable_switch
 
pzozulka (author) commented:
We are running ESXi 5.0. The image above represents our current config -- and I'm asking about our NEW config for the new servers we just purchased (ESXi 5.1).

What are the minimum networking requirements in terms of number of vSwitches, number of VMkernel ports, number of port groups, etc.?

I'm wondering if maybe:

For the new vSwitch0, we leave things as is -- 2 NICs for VMotion and the Service Console/management network.

For the new vSwitch1, we bundle the current vSwitch1 and vSwitch2 together. This would let us use 4 NICs (2 to each switch), which would allow for link aggregation in terms of bandwidth.

The only problem I see with this is that we would be mixing storage (NetApp) traffic with regular network traffic going across the same pipe -- although on different VLANs.
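
For illustration only (the vSwitch, vmnic names, VLAN IDs, and addresses below are placeholders, not our real values), that consolidated vSwitch could be built from the ESXi 5.1 shell roughly like this:

# new vSwitch with four uplinks, two to each physical switch
esxcli network vswitch standard add --vswitch-name=vSwitch1
esxcli network vswitch standard uplink add --vswitch-name=vSwitch1 --uplink-name=vmnic2
esxcli network vswitch standard uplink add --vswitch-name=vSwitch1 --uplink-name=vmnic3
esxcli network vswitch standard uplink add --vswitch-name=vSwitch1 --uplink-name=vmnic4
esxcli network vswitch standard uplink add --vswitch-name=vSwitch1 --uplink-name=vmnic5
# separate VLAN-tagged port groups for VM traffic and NFS storage
esxcli network vswitch standard portgroup add --vswitch-name=vSwitch1 --portgroup-name="VM Network"
esxcli network vswitch standard portgroup set --portgroup-name="VM Network" --vlan-id=10
esxcli network vswitch standard portgroup add --vswitch-name=vSwitch1 --portgroup-name=NFS
esxcli network vswitch standard portgroup set --portgroup-name=NFS --vlan-id=20
# VMkernel port for NFS on the storage VLAN
esxcli network ip interface add --interface-name=vmk2 --portgroup-name=NFS
esxcli network ip interface ipv4 set --interface-name=vmk2 --ipv4=10.0.128.60 --netmask=255.255.255.224 --type=static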
 
dipopo commented:
Ideally I would separate my storage network uplinks. The minimum number of uplinks for a vSwitch is 0 (an internal-only vSwitch), but two per vSwitch for redundancy, as currently set up, is best.

You speak of gaining bandwidth; have you experienced any network contention issues with the current setup?

I might also consider jumbo frames if storage traffic is a concern (example after the KB link below).

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1038827
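
As a rough sketch (the interface names are placeholders, and jumbo frames also have to be enabled end to end on the physical switches and the NetApp vif), on ESXi 5.x you would raise the MTU on both the vSwitch and the VMkernel port:

# MTU 9000 on the vSwitch carrying storage, then on the NFS VMkernel port
esxcli network vswitch standard set --vswitch-name=vSwitch1 --mtu=9000
esxcli network ip interface set --interface-name=vmk2 --mtu=9000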
 
pzozulka (author) commented:
We are running ONTAP 8.1.0 (FAS2240). For VMware, on the new servers we will be running ESXi 5.1 Enterprise.

LACP is configured on the physical switch ports connecting to the NetApp.
For the vifs, details are below. It looks like it's set up with a single IP.

vif ifgrp1 contains e0a & e0b
vif ifgrp2 contains e0c & e0d
vif ifgrp3 contains vifs ifgrp1 & ifgrp2

/etc/hosts
127.0.0.1       localhost       localhost-stack
127.0.10.1      localhost-10    localhost-bsd
127.0.20.1      localhost-20    localhost-sk
10.0.128.55     i1whlnetapp01   i1whlnetapp01-ifgrp3
10.0.128.72     i1whlnetapp01-e0M
10.0.32.10      mailhost
10.30.0.50      i1lvnetapp01


/etc/rc
hostname i1whlnetapp01
ifgrp create lacp ifgrp1 -b ip e0a e0b
ifgrp create lacp ifgrp2 -b ip e0c e0d
ifgrp create single ifgrp3 ifgrp1 ifgrp2
ifconfig ifgrp3 `hostname`-ifgrp3 mediatype auto netmask 255.255.255.224 partner ifgrp6 mtusize 1500
ifconfig e0M `hostname`-e0M netmask 255.255.255.240 mtusize 1500
route add default 10.0.128.61 1
routed on
options dns.domainname domainName.local
options dns.enable on
options nis.enable off
savecore


ifconfig
e0a: flags=0x9f0c867<BROADCAST,RUNNING,MULTICAST,TCPCKSUM> mtu 1500
        ether 02:a0:98:1a:e6:8e (auto-1000t-fd-up) flowcontrol full
        trunked ifgrp1
e0b: flags=0x9f0c867<BROADCAST,RUNNING,MULTICAST,TCPCKSUM> mtu 1500
        ether 02:a0:98:1a:e6:8e (auto-1000t-fd-up) flowcontrol full
        trunked ifgrp1
e0c: flags=0x9f0c867<BROADCAST,RUNNING,MULTICAST,TCPCKSUM> mtu 1500
        ether 02:a0:98:1a:e6:8e (auto-1000t-fd-up) flowcontrol full
        trunked ifgrp2
e0d: flags=0x9f0c867<BROADCAST,RUNNING,MULTICAST,TCPCKSUM> mtu 1500
        ether 02:a0:98:1a:e6:8e (auto-1000t-fd-up) flowcontrol full
        trunked ifgrp2
e0M: flags=0x2b4c867<UP,BROADCAST,RUNNING,MULTICAST,TCPCKSUM> mtu 1500
        inet 10.0.128.72 netmask 0xfffffff0 broadcast 10.0.128.79
        ether 00:a0:98:1a:e6:91 (auto-100tx-fd-up) flowcontrol full
e0P: flags=0x2b4c867<UP,BROADCAST,RUNNING,MULTICAST,TCPCKSUM> mtu 1500 PRIVATE
        inet 192.168.3.171 netmask 0xfffffc00 broadcast 192.168.3.255 noddns
        ether 00:a0:98:1a:e6:90 (auto-100tx-fd-up) flowcontrol full
lo: flags=0x1b48049<UP,LOOPBACK,RUNNING,MULTICAST,TCPCKSUM> mtu 8160
        inet 127.0.0.1 netmask 0xff000000 broadcast 127.0.0.1
        ether 00:00:00:00:00:00 (VIA Provider)
losk: flags=0x40a400c9<UP,LOOPBACK,RUNNING> mtu 9188
        inet 127.0.20.1 netmask 0xff000000 broadcast 127.0.20.1
ifgrp1: flags=0x2af0c863<BROADCAST,RUNNING,MULTICAST,TCPCKSUM> mtu 1500
        ether 02:a0:98:1a:e6:8e (Enabled interface groups)
        trunked ifgrp3
ifgrp2: flags=0x2af0c863<BROADCAST,RUNNING,MULTICAST,TCPCKSUM> mtu 1500
        ether 02:a0:98:1a:e6:8e (Enabled interface groups)
        trunked ifgrp3
ifgrp3: flags=0x22f4c863<UP,BROADCAST,RUNNING,MULTICAST,TCPCKSUM> mtu 1500
        inet 10.0.128.55 netmask 0xffffffe0 broadcast 10.0.128.63
        partner ifgrp6 (not in use)
        ether 02:a0:98:1a:e6:8e (Enabled interface groups)

 
pzozulka (author) commented:
After doing a ton of research: it is absolutely fine to have a NIC team on a vSwitch with one NIC connected to switch one and the other NIC connected to switch two. If one of the switches fails or needs to go down for maintenance, network traffic keeps flowing uninterrupted.

With this NIC team strategy, you're not going to get link aggregation in terms of a bandwidth improvement, but more than likely (unless it's a huge network) you weren't going to get much of a bandwidth improvement from link aggregation to begin with. We have about 40 VMs, and our network monitoring software is telling us that on average we're only using 10% of the bandwidth on our 1 Gbps network.

Furthermore, even with our setup, we are still doing load balancing both ways -- traffic leaving the ESXi hosts, and traffic coming back. Why? Our NIC team policy on the ESXi hosts is set to "Route based on the originating virtual switch port ID". "When you use this setting, traffic from a given virtual Ethernet adapter is consistently sent to the same physical adapter unless there is a failover to another adapter in the NIC team. Replies are received on the same physical adapter as the physical switch learns the port association. This setting provides an even distribution of traffic if the number of virtual Ethernet adapters is greater than the number of physical adapters." -- VMware Virtual Networking Concepts (page 8). The reason we get load balancing with this NIC team policy is that some VMs consistently use one NIC in the team while other VMs consistently use the other NIC.

Lastly, there is a big difference between load balancing and load sharing. The previous paragraph refers to the NIC team policy for your production VM network traffic. It's a bit different for the vSwitch NIC team settings for your IP storage connectivity (which VMware recommends separating from your VM network traffic). The difference is as follows: even if you set up full link aggregation -- route based on IP hash as the NIC teaming option plus EtherChannel (Cisco) on the connected switch, giving link aggregation and redundancy in both directions -- "There is only one active pipe for the connection between the ESX server and a single storage target. This means that although there may be alternate connections available for failover, the bandwidth for a single datastore and the underlying storage is limited to what a single connection can provide." -- VMware NFS Best Practices (page 7).
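
Tying this back to the earlier suggestion about aliases: the practical workaround is to give the filer more than one target IP and mount each datastore against a different one, so separate datastores can ride separate links. A hypothetical sketch for ESXi 5.x (the IPs, export paths, and datastore names below are placeholders, not our actual values):

# each datastore points at a different alias IP on the NetApp
esxcli storage nfs add --host=10.0.128.55 --share=/vol/vm_datastore1 --volume-name=NFS_DS1
esxcli storage nfs add --host=10.0.128.56 --share=/vol/vm_datastore2 --volume-name=NFS_DS2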