waltforbes

VLAN Fails on VM Portgroup

Points of My Scenario:
1. I have 3 ESXi standalone servers (A, B, and C) identically configured and connected to physical network.

2. The vswitch0 for each ESXi server is connected to 2 physical NICs.

3. The switch ports connecting all ESXi servers are trunked to the same VLANs 1, 2, 3, and 4.

4. I have test VMs running on each ESXi server.

5. Whenever I configure the VMs onto VLANs 1, 2, and 3 portgroups - they are all able to communicate with each other.

6. Whenever I configure VMs onto VLAN 4 portgroup, only VMs on ESXi servers A and C can communicate.

7. I have removed and re-added VLAN 4 to ESXi server B using the web UI and CLI (and rebooted after each removal/re-add): still no success.

8. HOWEVER, when I configure ESXi server B's management portgroup to VLAN4, this physical ESXi B server is able to ping VMs on ESXi A and ESXi C - and, VMs on ESXi A and ESXi C can ping the physical ESXi B: proving VLAN4 is working on the physical ports used by ESXi B.

9. I have also executed "Reset System Configuration" under "System Customization" at the DCUI, then re-configured and performed testing as above (getting same results).

QUESTION: What else can I troubleshoot to figure out why the VLAN 4 portgroup on ESXi B won't work?

Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)

I would look at the physical configuration of the network switches and compare it with the configurations of all the ESXi hosts.

ASKER

This is exactly what I did with the network engineers. I made them show me the configuration --- the ports are identically configured.

Many thanks for the suggestion though.
Also consider that when the ESXi B's management port is configured on VLAN 4, it can ping virtual machines (using a VLAN4 portgroup) on other ESXi hosts, and those VMs can ping the address of [physical] ESXi B also. So VLAN 4 is working on the physical switch ports used by ESXi B.

Yes....quite the mystery!
Something is misconfigured

Take a few days away from the issue. We have sat for two weeks with network issues, then printed out all the configurations, gone through them line by line, and found errors!
I agree, Andrew! I will. Thank you so much.
Hi,

Yes, it should definitely be a configuration issue, either in ESX or on the physical switches.
a few checks :
- VM portgroup for vlan 4 is finely tagged in ESX B
- and whole vswitch configuration is exactly the same between the 3 ESX
- check vlan on physical switches, also native vlan, as I don't know what you're tagging or not
- ...and other physical port configuration options

My first checks are VM portgroup and vswitch because as you said, from the physical switch ports perspective, it seems vlan 4 is working on ESX B....
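
For anyone wanting to automate the "exactly the same between the 3 ESX" check, here is a minimal Python sketch. It assumes you have saved the text output of `esxcli network vswitch standard portgroup list` from each host, and that the output has a header line, a dashed line, then one row per portgroup ending in the vSwitch name, active client count, and VLAN ID. The portgroup names and sample rows below are made up for illustration; they are not from the asker's hosts.

```python
# Hypothetical sample output; names and values are invented for illustration.
SAMPLE_A = """
Name                Virtual Switch  Active Clients  VLAN ID
------------------  --------------  --------------  -------
Management Network  vSwitch0                     1        0
VLAN4-VMs           vSwitch0                     1        4
"""

SAMPLE_B = """
Name                Virtual Switch  Active Clients  VLAN ID
------------------  --------------  --------------  -------
Management Network  vSwitch0                     1        0
VLAN4-VMs           vSwitch0                     1        0
"""


def parse_portgroups(text):
    """Return {portgroup name: VLAN ID} from saved portgroup-list text."""
    vlans = {}
    # Skip the header line and the dashed separator line.
    for line in text.strip().splitlines()[2:]:
        if not line.strip():
            continue
        # Split from the right so portgroup names with spaces stay intact.
        name, _vswitch, _clients, vlan = line.strip().rsplit(None, 3)
        vlans[name] = int(vlan)
    return vlans


def diff_hosts(configs):
    """Return {portgroup: {host: VLAN ID or None}} where hosts disagree.

    configs maps a host label to the dict returned by parse_portgroups.
    A value of None means the portgroup is missing on that host.
    """
    names = set()
    for portgroups in configs.values():
        names.update(portgroups)
    diffs = {}
    for name in sorted(names):
        per_host = {host: pgs.get(name) for host, pgs in configs.items()}
        if len(set(per_host.values())) > 1:
            diffs[name] = per_host
    return diffs
```

Calling `diff_hosts({"A": parse_portgroups(SAMPLE_A), "B": parse_portgroups(SAMPLE_B)})` on the invented samples reports only the portgroup whose VLAN ID differs. This only compares VLAN IDs; teaming and failover settings would need a separate check.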
Can you post the IP config for each server, please?
@Mr Tortu(r)e: Where do I check for fine tags? I've not used tagging before. I am the only one working on these 3 standalone ESXi servers: they are brand new installations with only test VMs. This exercise is to prep the physical boxes, making sure everything about the network is working.

@some one: are you referring to the test VMs or the physical servers? Note: the physical server IP addressing is considered sensitive info by my organization, so I wouldn't be able to publish it. The test VMs are using arbitrary IPs unrelated to my organization's IP addressing scheme, just so I can test.

The VLAN Tags should be on the Virtual Machine Portgroup. (vSwitch).

If you don't have VLAN tags, not sure how traffic for each VLAN is flowing down each VLAN, unless you have access ports configured on each trunk and have multiple vSwitches for each trunk VLAN.
@Andrew Hancock: I have one vSwitch (the default vSwitch0) per server, configured with multiple portgroups. Each portgroup has a different VLAN number assigned - including number "4" - (I didn't realize this is what "Mr. Tortu(r)e" meant when he stated "VM portgroup for vlan 4 is finely tagged in ESX B"). The physical switch ports are all trunked to include all the VLANs we use.

All portgroups work on servers A and C. All portgroups except one (vlan 4) work on server B.
Interestingly, when I config server B to use vlan4 directly (i.e. setting its management portgroup to vlan4, temporarily), it can communicate with other vlan4-portgroup-VMs residing in servers A and C.
@Andrew Hancock: I have one vSwitch (the default vSwitch0) per server, configured with multiple portgroups. Each portgroup has a different VLAN number assigned - including number "4" - (I didn't realize this is what "Mr. Tortu(r)e" meant when he stated "VM portgroup for vlan 4 is finely tagged in ESX B"). The physical switch ports are all trunked to include all the VLANs we use.

that is the VLAN 802.1Q TAG.
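
To make "the tag" concrete: an 802.1Q tag is 4 bytes inserted into the Ethernet frame, a 16-bit TPID of 0x8100 followed by a 16-bit field holding a 3-bit priority (PCP), a 1-bit drop-eligible flag (DEI), and the 12-bit VLAN ID. The Python sketch below just packs and unpacks those 4 bytes; the function names are my own, and it does not touch ESXi or any real network interface.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q-tagged frame


def build_dot1q_tag(vlan_id, pcp=0, dei=0):
    """Return the 4-byte 802.1Q tag for the given VLAN ID."""
    if not 0 <= vlan_id <= 0x0FFF:
        raise ValueError("VLAN ID must fit in 12 bits (0-4095)")
    if not 0 <= pcp <= 7 or dei not in (0, 1):
        raise ValueError("PCP is 3 bits, DEI is 1 bit")
    # Tag control information: PCP (3 bits) | DEI (1 bit) | VLAN ID (12 bits)
    tci = (pcp << 13) | (dei << 12) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tci)  # network byte order


def parse_dot1q_tag(tag):
    """Decode a 4-byte 802.1Q tag into its fields."""
    tpid, tci = struct.unpack("!HH", tag)
    if tpid != TPID_8021Q:
        raise ValueError("not an 802.1Q tag")
    return {"pcp": tci >> 13, "dei": (tci >> 12) & 1, "vlan_id": tci & 0x0FFF}
```

For example, `build_dot1q_tag(4)` gives the bytes `81 00 00 04`. A portgroup set to VLAN 4 on the vSwitch is what causes frames from its VMs to carry this tag on the trunk.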

I think screenshots would help us here.
Just the IPs for the VMs at this point would be fine. I'd like to see IP and subnet mask if possible, please.


Please find attached the VMs each on a separate ESXi host - showing hostnames, IP addresses, and results of trying to ping each other.
The working servers (VMs) 1 and 3 are together, while VM 2 is at the right-hand end. [Image: user-generated screenshot]

Thanks.

Can you explain what purpose the VLANs are serving if all of your VMs are on the same subnet, please? Is this setup being used at this stage purely to test VLAN connectivity between hosts and VMs?

As Andrew said, it would be useful to see screenshots of the portgroup configuration, as well as the switch configuration.
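
As a quick sanity check on the "same subnet" point, here is a small Python sketch using the standard `ipaddress` module. The addresses in the usage note are documentation-range placeholders, not the asker's real IPs, and the function name is my own.

```python
import ipaddress


def share_subnet(interfaces):
    """Return True if every 'address/mask' string falls in the same network.

    Each entry can use a prefix length ('192.0.2.11/24') or a dotted
    netmask ('192.0.2.12/255.255.255.0').
    """
    networks = {ipaddress.ip_interface(i).network for i in interfaces}
    return len(networks) == 1
```

For instance, `share_subnet(["192.0.2.11/24", "192.0.2.12/255.255.255.0"])` is true, while mixing in `"198.51.100.12/24"` makes it false. If the test VMs did not share a subnet, pings would fail regardless of VLAN tagging, so it is worth ruling out first.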
Hi "some one": you are correct - the setup being used at this stage is purely to test VLAN connectivity/functionality.
So far all other VLANs function on all ESXi hosts' VM portgroups....Only vlan4 fails as a VM portgroup on ESXi B.

What's mysterious is that when ESXi B's management portgroup is configured with vlan4, this physical host can bi-directionally communicate with other VMs (on other ESXi hosts - A & C) using vlan4 via their VM portgroups.

How is it that ESXi B can only use vlan4 directly, but not for its VM portgroups on its vSwitch0 (all ESXi hosts have only this one vSwitch0)? So strange, yes?
Hi,

VLAN for ESX management and VM Portgroup are separate options
So is it possible that you tag one and not the other when doing test on vlan4?
That could explain why one is working and not the other.
What's mysterious is that when ESXi B's management portgroup is configured with vlan4, this physical host can bi-directionally communicate with other VMs (on other ESXi hosts - A & C) using vlan4 via their VM portgroups.

That actually sounds correct to me. I think you have a tagging or portgroup problem. We'd need to see the portgroup config, the vswitch config and the switchport config in order to pinpoint the issue.
1. Upload screenshots of all the vSwitch(s).

2. Upload configurations of the physical switch (command) config.

a VLAN Tag is missing from the physical switch configuration or vSwitch Portgroup.

or mistyped.
Eurekaaaa!!

SOLUTION: Using the DCUI, I removed the physical network connections (network adapters) from the "Management Network", but one at a time: restarting the "Management Network" when prompted, and testing before removing the next. After testing with each NIC removed in turn, I added the two back together...
Result 1: NIC1 removal = no change (failure)
Result 2: NIC2 removal + NIC1 re-add = success
Result 3: NIC2 re-added (along with NIC1) = success
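
The one-uplink-at-a-time approach above can be written down as a small Python sketch. It records each trial as the set of NICs that were active plus whether the VLAN 4 test passed, and reads a verdict only from trials where a single NIC was active. The mapping of Results 1 to 3 onto active NICs is my reading of the post (removing NIC1 leaves only NIC2 active, and so on); the thread itself does not say which NIC was at fault or why the final combined test passed.

```python
def single_uplink_verdicts(trials):
    """Return {nic: passed} using only trials where one uplink was active.

    trials is a list of (active_uplinks, passed) pairs, where
    active_uplinks is a set of NIC names. Trials with several active
    uplinks are ignored, because a pass there does not show which
    uplink carried the traffic.
    """
    verdicts = {}
    for uplinks, passed in trials:
        if len(uplinks) == 1:
            (nic,) = uplinks
            verdicts[nic] = passed
    return verdicts


# Assumed encoding of the three results reported above.
TRIALS = [
    (frozenset({"NIC2"}), False),         # Result 1: NIC1 removed
    (frozenset({"NIC1"}), True),          # Result 2: NIC2 removed, NIC1 re-added
    (frozenset({"NIC1", "NIC2"}), True),  # Result 3: both NICs active
]
```

With this encoding, `single_uplink_verdicts(TRIALS)` marks NIC1 as passing and NIC2 as failing on its own, which would point at the path through NIC2 as the place to look if the problem ever returns.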

Andrew, "Mr. Tortu(r)e", and "some one" - your patience with me has been legendary! I couldn't provide screenshots and switch configs without breaching company policy. It was such a dilemma - I knew I would have requested the same from you! So I must say, thank you all ever so much for sticking with me so far on this. I don't take it for granted; I am grateful.

@Andrew: in the end, your advice to "Take a few days away from the issue" (24 hours in my case) was the magic bullet!!

A great many thanks to you all!
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)
Glad you found it :)