Solved

VMware ESX4 link aggregation with Juniper EX8208 causes NFS failure

Posted on 2011-02-18
Last Modified: 2012-05-11
We have four ESX4 nodes: two connected to an EX8208 in our primary location and two to another EX8208 in our secondary location. The two sites are directly connected. Each VMware node has three aggregated port groups: one for management and vMotion, one for virtual machine traffic, and one dedicated to NAS traffic using NFS. We have two NAS units: an IBM N6040 in the primary location and an IBM N3600 in the secondary location.

Each VMware node can mount and access the NFS volumes; existing VMs show up, and it is possible to SSH to the node and read/write files under /vmfs/volumes.

However, if we try to "Edit Configuration", power up, or create a new virtual machine, the operation times out with a "general system error" and the relevant .vmx file becomes corrupted. Rebooting the VMware node and the NAS confirms that the file has indeed been truncated, so this is not a cache issue. Both NAS units have been updated to the recommended ONTAP version.

The problem occurs even if the aggregated link has only one member link. If I then reconfigure that link to be a plain access port, everything works as expected. This aids troubleshooting but is not a viable solution.


On the VMware side, the virtual switch settings are as recommended by VMware: load balancing="route based on ip hash", failover detection="link status only", notify switches="yes", failback="no". All member links are configured as "active adapters".

On the Juniper side, the member links and aggregated interfaces are configured as follows:

    ge-0/0/8 {                
        ether-options {                
            802.3ad ae10;              
        }                              
    }            

    ae10 {
        traceoptions {
            flag all;
        }
        mtu 1500;
        aggregated-ether-options {
            link-speed 1g;
        }
        unit 0 {
            family ethernet-switching {
                port-mode access;
                vlan {
                    members 130;
                }
            }
        }
    }

Notice that we are not using LACP since VMware only supports static link aggregation.

The NAS vendor, the switch vendor, and VMware have all been trying to solve this problem for weeks, and we're getting nowhere. I'm looking for others with a similar setup (port aggregation, Juniper switches, and NFS).
Question by:FloydATC

Expert Comment

by:deimark
What does the Juniper report for the aggregated links?

show interfaces terse
show interfaces ae10 extensive
show interfaces ge-0/0/8 extensive

Author Comment

by:FloydATC
Sorry for the late response. The output of "show interfaces terse" is lengthy and doesn't offer any surprises, so I won't include it in full here.

oikt@FERADH-SW001> show interfaces ae10 extensive 
Physical interface: ae10, Enabled, Physical link is Up
  Interface index: 138, SNMP ifIndex: 685, Generation: 141
  Description: FERADH-VM01-SAN
  Link-level type: Ethernet, MTU: 1500, Speed: 1Gbps, BPDU Error: None, MAC-REWRITE Error: None,
  Loopback: Disabled, Source filtering: Disabled, Flow control: Disabled, Minimum links needed: 1,
  Minimum bandwidth needed: 0
  Device flags   : Present Running
  Interface flags: SNMP-Traps Internal: 0x0
  Current address: b0:c6:9a:cd:c2:0b, Hardware address: b0:c6:9a:cd:c2:0b
  Last flapped   : 2011-02-18 11:14:53 UTC (4d 00:31 ago)
  Statistics last cleared: Never
  Traffic statistics:
   Input  bytes  :            109617954                 2176 bps
   Output bytes  :          39723756226                 1488 bps
   Input  packets:               792144                    2 pps
   Output packets:             30080101                    1 pps
   IPv6 transit statistics:
    Input  bytes  :                   0 
    Output bytes  :                   0
    Input  packets:                   0
    Output packets:                   0
  Input errors:
    Errors: 0, Drops: 0, Framing errors: 0, Runts: 0, Giants: 0, Policed discards: 0, Resource errors: 0
  Output errors:
    Carrier transitions: 153, Errors: 0, Drops: 0, MTU errors: 0, Resource errors: 0

  Logical interface ae10.0 (Index 66) (SNMP ifIndex 699) (Generation 131)
    Flags: SNMP-Traps 0x0 Encapsulation: ENET2
    Statistics        Packets        pps         Bytes          bps
    Bundle:
        Input :             0          0             0            0
        Output:       1308515          0     140007933            0
    Marker Statistics:   Marker Rx     Resp Tx   Unknown Rx   Illegal Rx
      ge-0/0/8.0                 0           0            0            0
    Protocol eth-switch, Generation: 145, Route table: 0
      Flags: None

oikt@FERADH-SW001> show interfaces ge-0/0/8 extensive 
Physical interface: ge-0/0/8, Enabled, Physical link is Up
  Interface index: 161, SNMP ifIndex: 575, Generation: 164
  Description: FERADH-VM01-SAN
  Link-level type: Ethernet, MTU: 1500, Speed: 1000mbps, Duplex: Auto, BPDU Error: None,
  MAC-REWRITE Error: None, Loopback: Disabled, Source filtering: Disabled, Flow control: Disabled,
  Auto-negotiation: Enabled, Remote fault: Online
  Device flags   : Present Running
  Interface flags: SNMP-Traps Internal: 0x0
  Link flags     : None
  CoS queues     : 8 supported, 8 maximum usable queues
  Hold-times     : Up 0 ms, Down 0 ms
  Current address: b0:c6:9a:cd:c2:0b, Hardware address: b0:c6:9a:cd:c2:09
  Last flapped   : 2011-02-18 11:14:53 UTC (4d 00:31 ago)
  Statistics last cleared: Never
  Traffic statistics:
   Input  bytes  :            109618966                    0 bps
   Output bytes  :          39723758744                  512 bps
   Input  packets:               792151                    0 pps
   Output packets:             30080130                    1 pps
   IPv6 transit statistics:
    Input  bytes  :                   0 
    Output bytes  :                   0
    Input  packets:                   0
    Output packets:                   0
  Input errors:
    Errors: 0, Drops: 0, Framing errors: 0, Runts: 0, Policed discards: 0, L3 incompletes: 0,
    L2 channel errors: 0, L2 mismatch timeouts: 0, FIFO errors: 0, Resource errors: 0
  Output errors:
    Carrier transitions: 153, Errors: 0, Drops: 0, Collisions: 0, Aged packets: 0, FIFO errors: 0,
    HS link CRC errors: 0, MTU errors: 0, Resource errors: 0
  Egress queues: 8 supported, 7 in use
  Queue counters:       Queued packets  Transmitted packets      Dropped packets
    0 best-effort                    0             28878173                    0
    1 assured-forw                   0                    0                    0
    2 mcast-be                       0                    0                    0
    4 mcast-ef                       0                    0                    0
    5 expedited-fo                   0                    0                    0
    6 mcast-af                       0                    0                    0
    7 network-cont                   0              1128679                    0
  Active alarms  : None
  Active defects : None
  MAC statistics:                      Receive         Transmit
    Total octets                     109618966      39723758744
    Total packets                       792151         30080130
    Unicast packets                     746050         28880707
    Broadcast packets                      633            24181
    Multicast packets                    45468          1175242
    CRC/Align errors                         0                0
    FIFO errors                              0                0
    MAC control frames                       0                0
    MAC pause frames                         0                0
    Oversized frames                       169
    Jabber frames                            0
    Fragment frames                          0
    Code violations                          0
  Autonegotiation information:
    Negotiation status: Complete
    Link partner:
        Link mode: Full-duplex, Flow control: Symmetric/Asymmetric, Remote fault: OK,
        Link partner Speed: 1000 Mbps   
    Local resolution:
        Flow control: None, Remote fault: Link OK
  Packet Forwarding Engine configuration:
    Destination slot: 0
  CoS information:
    Direction : Output 
    CoS transmit queue               Bandwidth               Buffer Priority   Limit
                              %            bps     %           usec
    0 best-effort            75      750000000    75              0      low    none
    2 mcast-be               20      200000000    20              0      low    none
    7 network-control         5       50000000     5              0      low    none

  Logical interface ge-0/0/8.0 (Index 95) (SNMP ifIndex 722) (Generation 262)
    Flags: 0x0 Encapsulation: ENET2
    Local statistics:
     Input  bytes  :                    0
     Output bytes  :             21169374
     Input  packets:                    0
     Output packets:               185178
    Transit statistics:
     Input  bytes  :                    0                    0 bps
     Output bytes  :                    0                    0 bps
     Input  packets:                    0                    0 pps
     Output packets:                    0                    0 pps
    Protocol aenet, AE bundle: ae10.0, Generation: 275, Route table: 0



Also, I have configured a syslog file to catch interface status changes:

    file interface-log {
        any any;
        match ifOperStatus;
    }

I'm not seeing any activity there.

Accepted Solution

by:FloydATC
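It seems we finally found the solution on our own: remove the MTU setting on the aggregated interface ae10. The virtual switch is set to MTU 1500, but this must be matched with an MTU setting of at least 1514 on the Juniper, because Junos counts the 14-byte Ethernet header as part of the interface MTU. With "mtu 1500" on ae10, full-size 1500-byte packets from the ESX hosts no longer fit.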
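
For reference, a sketch of how the ae10 stanza would look with the explicit MTU statement removed, assuming the rest of the configuration posted in the question stays unchanged. The traceoptions block is left out here only for brevity, and the comment is an annotation rather than part of the running configuration:

    ae10 {
        /* no mtu statement: the interface falls back to the Junos default */
        aggregated-ether-options {
            link-speed 1g;
        }
        unit 0 {
            family ethernet-switching {
                port-mode access;
                vlan {
                    members 130;
                }
            }
        }
    }

In configuration mode this should amount to "delete interfaces ae10 mtu" followed by "commit". Setting "mtu 1514" explicitly instead would be the other way to satisfy the "at least 1514" requirement.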

Expert Comment

by:deimark
Hiya bud

Sorry, missed your 1st reply to this.

Yes, that would make sense now, just never really used link aggregation without LACP so was doing a bit of digging.

Thanks for letting us know.

Author Closing Comment

by:FloydATC
Solved