llarava
asked on
VMWARE vSphere 4.1 - Network Load Balancing (NLB) Multicast Mode Configuration / Cisco 6500
This is a quick overview of what we are trying to accomplish and the problem we are having:
Goal:
We are trying to configure Windows NLB on 2 VMs.
The Windows NLB configuration:
VIP -> 172.20.200.204 - The virtual MAC is 03bf.ac14.c8cc (for multicast NLB the cluster MAC is 03-BF followed by the VIP in hex: ac.14.c8.cc = 172.20.200.204)
The NLB cluster is formed by
PSTS01 -> 172.20.200.205
PSTS02 -> 172.20.200.206
VMWARE/Network:
We have 5 ESX servers connected to our Cisco 6500 core switches, configured as follows:
!
interface GigabitEthernet9/24
description VM4 Data
switchport
switchport access vlan 200
switchport mode trunk
spanning-tree portfast
Issue:
We are not able to reach the VIP 172.20.200.204 from any VLAN other than VLAN 200. The VIP needs to be reachable in order for the cluster to work.
This article explains what we are trying to accomplish:
http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&externalId=1006558
We have followed the steps in the following article to configure the network:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1006525
We have used mac-address-table static entries for the uplink interface of each of our ESX servers on their connections to the core switches. We used CDP to identify the ports.
CR01 - Core Router1
arp 172.20.200.204 03bf.ac14.c8cc ARPA
mac-address-table static 03bf.ac14.c8cc vlan 200 interface GigabitEthernet9/32 GigabitEthernet9/24 GigabitEthernet9/16 GigabitEthernet9/21 GigabitEthernet9/19
CR02 - Core Router2
arp 172.20.200.204 03bf.ac14.c8cc ARPA
mac-address-table static 03bf.ac14.c8cc vlan 200 interface GigabitEthernet9/41 GigabitEthernet9/24 GigabitEthernet9/11 GigabitEthernet9/17 GigabitEthernet9/15
As said before, we are not able to reach the VIP 172.20.200.204 from any VLAN other than VLAN 200, but we are able to reach the physical IPs assigned to PSTS01 (172.20.200.205) and PSTS02 (172.20.200.206) from any of our VLANs.
Once the manual ARP resolution for the NLB cluster address is configured on Core Routers 1 and 2, we can see the static entry in the CAM table.
We have also verified that the setting suggested in the KB is configured: Virtual Switch NIC Teaming Policy > Notify Switches is set to Yes.
One more thing to add: we can ping the VIP 172.20.200.204 from either of the Cisco 6500s and from any host, virtual or physical, that is on VLAN 200.
Also, on either Cisco 6500 the ARP resolves successfully; a show ip arp for the VIP 172.20.200.204 returns:
Internet 172.20.200.204 03bf.ac14.c8cc ARPA
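For completeness, these are the checks we run on the 6509s (a sketch; exact show command syntax can vary slightly between IOS releases):
! Confirm the static ARP entry for the VIP
show ip arp 172.20.200.204
! Confirm the static CAM entry and the ports it points at
show mac-address-table address 03bf.ac14.c8cc vlan 200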
If you are using multicast NLB, the method we use for clients is to include the static MAC address on ALL Cisco switches in the organisation; that includes all uplinks and VLANs across the infrastructure.
ASKER
jordannet: this is what I've already tried, as you can see in my previous post.
hanccocka: ESX uplinks (interfaces) to the Cisco switches or routers; in our case we use 6509s.
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1006525
Yes, familiar with the 6509.
We update ALL Cisco switches on the network!
It can be quite tiresome to go around ALL the switches and make the changes.
ASKER
You've got to do what you've got to do. I haven't seen any article that indicates I have to update all the switches. Do you have any article you want to share with me?
So far VMware support has not answered my question, and Cisco is saying that everything is configured correctly.
I'm afraid it's from experience, having completed this many times.
Long ago, in the ESX 2.5/3.0 days, there was an article, which I've not been able to find since, that said to statically allocate:
Multicast Address (Mac Address against IP Address)
Node 1 Address (Mac Address against IP Address)
Node 2 Address (Mac Address against IP Address)
The article also mentioned ensuring that the node addresses used were static MAC addresses on the VMs, not auto-generated ones, and enabling MAC address spoofing on the vSwitch.
This is the recipe we have followed for many years (since 2003) with success. When it has failed, it's often because Network Management somewhere has not completed what we've asked, and we've had to pull the configs off ALL the Cisco switches to check they've been done.
And have you allocated this to ALL your trunks and uplinks, everywhere you would expect the traffic to be?
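A minimal sketch of the VM side, assuming the classic .vmx approach (the MAC below is a placeholder; manually assigned VMware MACs are expected to fall in the 00:50:56:00:00:00 to 00:50:56:3F:FF:FF range):
ethernet0.addressType = "static"
ethernet0.address = "00:50:56:3a:bb:01"
The MAC address spoofing part corresponds to the vSwitch/portgroup security policy: set MAC Address Changes and Forged Transmits to Accept.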
ASKER
Sounds like a lot.
The KB article above mentions that you have to make the following changes for the uplink interfaces of your ESX servers.
arp 172.20.200.204 03bf.ac14.c8cc ARPA
mac-address-table static 03bf.ac14.c8cc vlan 200 interface GigabitEthernet9/41 GigabitEthernet9/24 GigabitEthernet9/11 GigabitEthernet9/17 GigabitEthernet9/15
The article gives the sample commands below; that being said, can you please clarify what needs to be done on ALL of the switches?
1. Telnet in to the Cisco switch console and log in.
2. Run this command to enter configuration mode:
config t
3. STATIC ARP RESOLUTION (Cisco global configuration mode). For example:
arp [ip] [cluster multicast mac] ARPA
arp 192.168.1.100 03bf.c0a8.0164 ARPA
4. STATIC MAC RESOLUTION (Cisco global configuration mode). For example:
mac-address-table static [cluster multicast mac] [vlan id] [interface]
mac-address-table static 03bf.c0a8.0164 vlan 1 interface GigabitEthernet1/1 GigabitEthernet1/2 GigabitEthernet1/15 GigabitEthernet1/16
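One aside on syntax: depending on the IOS release on the switch, the MAC table command may or may not be hyphenated; both forms below are intended to be equivalent (I'm not certain of the exact release where it changed):
! Older IOS syntax
mac-address-table static 03bf.c0a8.0164 vlan 1 interface GigabitEthernet1/1
! Newer IOS syntax
mac address-table static 03bf.c0a8.0164 vlan 1 interface GigabitEthernet1/1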
ASKER CERTIFIED SOLUTION
It depends on how complex your network is, and what needs to access the cluster. For most of our clients it's both external and internal, as clusters are often used for front-facing intranet and Internet websites, and back-end content management.
ASKER
Our network is not that complex.
I get that we have to configure this on all of our switches:
same configuration on ALL the switches
arp 172.20.200.204 03bf.ac14.c8cc ARPA
arp 172.20.200.205 mac address ARPA
arp 172.20.200.206 mac address ARPA
But I am having difficulty understanding the second part. What exactly do we need to do with the trunks on the switches? What about the uplinks? Can you please help me understand this part?
and then work out on what trunks and uplinks you would also see those MAC addresses
mac-address-table static [cluster multicast mac] [vlan id] [interface]
mac-address-table static 03bf.c0a8.0164 vlan 1 interface GigabitEthernet1/1 GigabitEthernet1/2
GigabitEthernet1/15 GigabitEthernet1/16
You also need to configure this on the next switch, the one that connects to the switch the ESX servers are attached to, and specify the ports those MAC addresses would be seen on.
I do not know how your network is configured, but does the 6509 connect to any other switches?
Or is everything contained in the 6509?
What often complicates it is if you've split your servers across 2 x Core 6509s.
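For example, on a downstream access switch whose uplink to the 6509 is, hypothetically, GigabitEthernet0/48 (substitute your real uplink port), the entry would look something like this:
! Point the cluster MAC at the uplink toward the 6509
mac-address-table static 03bf.ac14.c8cc vlan 200 interface GigabitEthernet0/48
The arp entry is only needed on devices that route for VLAN 200; a pure Layer 2 access switch just needs the static CAM entry.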
ASKER
That is exactly what we are doing: we are running 2 x Core 6509s. Our physical servers as well as the ESX servers are connected to both 6509s, with HSRP. Each ESX server has 2 dedicated interfaces for data, one connected to each of the 6509s.
So all traffic is within the Core 6509s?
I think this will give you some pointers:
http://spininfo.homelinux.com/news/VI_Perl_Toolkit/2011/12/06/VMWARE_vSphere_4.1_-_Network_Load_Balancing_(NLB)_Multicast_Mode_Configuration___Cisco_6500
ASKER
hanccocka: The other switches are also connected to the 6509s. So basically all the switches with the workstations, etc. are connected to the 6509s, and the ESX servers and physical servers connect there as well.
jordannet: The link just directs me to my own question...
So have you altered the switches the workstations are connected to?
And for the port that connects the 6509 to the workstation switch: have you logged into the workstation switch and configured that port with the static MAC entry?
SOLUTION
ASKER
hanccocka,
No, I haven't. I am working with Cisco support trying to figure this out. We haven't seen anything yet that points us to configuring the ARP on all of the switches.
The only place the ARP entry has been altered is the core 6509s, as indicated by the VMware KB article.
Also, the Cisco KB http://www.cisco.com/en/US/products/hw/switches/ps708/products_configuration_example09186a0080a07203.shtml doesn't seem to say anything about changing the ARP on all the switches.
jordannet: No worries, everything is always welcome. Thank you!
Well, that's what we do, and it always works for us.
ASKER
hanccocka: We have found that one of the 6509s is not able to ping the VIP. The one that can't ping the VIP is the primary core 6509. The second 6509, which is in standby, is able to ping the VIP.
We are planning on switching over to the standby 6509 and testing. We might get the same result, where the active HSRP core switch can't reach the VIP, or it might work, which would indicate a problem with the current primary core switch's configuration.
I will follow up once the change is done. We probably have to go with the change you have indicated, but first I want to have both 6509s reaching the VIP.
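Before and after the switchover we can confirm which 6509 is HSRP-active for the VLAN 200 gateway with the standard HSRP show commands, something like:
show standby brief
show standby vlan 200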
Glad you are making some progress.
ASKER
Hi hanccocka,
You've mentioned that we need to add the same configuration on ALL the switches:
3 IPs and MACs, for the VIP and the nodes:
arp 172.20.200.204 03bf.ac14.c8cc ARPA
arp 172.20.200.205 mac address ARPA
arp 172.20.200.206 mac address ARPA
and then work out on what trunks and uplinks you would also see those MAC addresses
Do we need mac-address-table static entries for just the cluster's virtual MAC, or for the virtual MAC plus the MACs of node1 and node2?
mac-address-table static [cluster multicast mac] [vlan id] [interface]
mac-address-table static 03bf.c0a8.0164 vlan 1 interface GigabitEthernet1/1 GigabitEthernet1/2
GigabitEthernet1/15 GigabitEthernet1/16
Thank you
When we've configured switches, we used the MAC addresses for all nodes, and the cluster MAC address.
If you have a network diagram of how your switches are connected, and which ports are connected together, it makes it easier.
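As a sketch, assuming placeholder MACs for the two nodes (use the static MACs actually assigned to the VMs' vNICs), the per-switch ARP entries would be:
! Cluster VIP, multicast cluster MAC
arp 172.20.200.204 03bf.ac14.c8cc ARPA
! PSTS01 (placeholder MAC)
arp 172.20.200.205 0050.563a.bb01 ARPA
! PSTS02 (placeholder MAC)
arp 172.20.200.206 0050.563a.bb02 ARPA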
ASKER
hanccocka
We have configured the uplink ports (data interfaces) for our ESX servers as follows:
!
interface GigabitEthernet9/24
description VM4 Data
switchport
switchport access vlan 200
switchport mode trunk
spanning-tree portfast
We are not doing VLAN tagging, just configuring them as trunks. This has been working great for years.
We have narrowed down the problem to the following:
We can get to the VIP from the same and from different VLANs only if the source server is a VM, which indicates that the vSwitch or port groups allow this to happen.
However, we can't even get to the VIP from the same VLAN if we are using physical servers connected to the same 6509.
Could the current configuration of the uplink be causing this behaviour? Any other ideas?
We always configure our Cisco trunks like this. Example configuration for the switch:
interface GigabitEthernet2/8
description esxdev001
switchport trunk encapsulation dot1q
switchport trunk allowed vlan 5,7,8,703,705
switchport mode trunk
speed 1000
duplex full
spanning-tree portfast trunk
and then we tag and specify the VLAN numbers on the virtual port groups in ESX.
So in your configuration, all the traffic leaving ESX would be going onto VLAN 200?
And I'm assuming those servers are also attached to access VLAN 200?
What if you change your configuration as above: present a trunk, tag ALL the VLANs, and specify the VLAN on the port group in ESX?
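On the ESX side, tagging a portgroup is one command per host; a sketch, assuming a portgroup named "VM Network" on vSwitch0 (both names are placeholders):
# Tag the portgroup with VLAN 200 (Virtual Switch Tagging); run on each ESX host
esxcfg-vswitch -p "VM Network" -v 200 vSwitch0
# List vSwitches and portgroups to verify
esxcfg-vswitch -l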
ASKER
Everything I said about source VMs being able to consistently reach the VIP is not true. The system misbehaves: sometimes it works from the same VLAN, sometimes from a different VLAN, whether the source is physical or virtual.
Running a capture on the NLB nodes, I have seen that the pings to the VIP are not going to the MAC for the virtual IP; instead they are going to the vNIC MAC address associated with the VM itself.
It is my understanding that the ping should go from the 6509-1 to the MAC that belongs to the VIP address. Correct?
172.20.200.204 - VIP
capture.jpg
SOLUTION
http://www.cisco.com/en/US/products/hw/switches/ps708/products_configuration_example09186a0080a07203.shtml