• Status: Solved
  • Priority: Medium
  • Security: Public

Hyper-V 2016: nothing can access the VMs, but the VMs can access everything.

This is a new one for me. I have lots of hours on Hyper-V 2008 R2 and 2012 R2, and lots of hours on failover clusters with HPE ProLiant DL380 GenX servers and MSA storage. I am aware of the Broadcom driver bug that requires disabling VMQ. I built a new Windows 2016 Hyper-V failover cluster on HPE ProLiant DL380 Gen9 servers and MSA storage. Everything is working perfectly, except networking.

The VMs on the cluster CAN obtain DHCP addresses from a server elsewhere on the network (or I can set them statically; it doesn't matter how the VMs are addressed), and they CAN access the corporate network AND the internet as expected. But nothing can access the VMs: no shares, no RDP, no ping. Also, none of the VMs can access other VMs on the same cluster; they exhibit the same behavior.

Just so you can grasp the strangeness of this, imagine virtual servers VM1 and VM2 on the cluster, and NONVM1 outside the cluster.  VM1 can access NONVM1, shares, RDP, ping, etc, as expected.  But, even though NONVM1 knows VM1 exists due to DNS resolution only, it cannot access it at all, no shares, no RDP, no ping.  Also, VM1 CANNOT access VM2 on the same cluster and vice versa.

All drivers and firmware are up to date. I assumed the VMQ settings were to blame and tried changing them, but that does not appear to be the cause. Cluster Validation runs and passes everything.

I'm at a total loss to explain this one.

Any ideas, and thank you in advance.
Asked by: Michael Jackson
 
David Johnson, CD, MVP (Owner) commented:
How have you set up the virtual network switch that is connected to the VM?
It should be External if you want to access the VM from another physical machine.
  • External vSwitch will link a physical NIC of the Hyper-V host with a virtual one and then give your VMs access outside of the host, meaning your physical network and internet (if your physical network is connected to internet).
  • Internal vSwitch should be used for building an independent virtual network when you need to connect VMs to each other and to a hypervisor as well.
  • Private vSwitch will create a virtual network where all connected VMs will see each other, but not the Hyper-V host. This will completely isolate the VMs in that sandbox.
 
Stose Trese commented:
You might have a problem with the host OS. Are VM1 and VM2 hosted on the same server?
 
Philip Elder (Technical Architect - HA/Compute/Storage) commented:
From an elevated PowerShell prompt on both nodes:
Get-NetAdapter
Get-NetLbfoTeam
Get-VMSwitch
Get-VM | FL

Please post the results into a CODE window or into a TXT file and post that.
 
Michael Jackson (President, Author) commented:
David Johnson, yes, there are virtual switches configured, three per server, each with the same name as the corresponding virtual switch on the other node: VL1, VL2, and VL3.

Stose Trese, this happens to all VMs on both of the failover hosts; the VMs can Live Migrate to the other node and experience the same behavior.

Philip Elder:  Here you go.

This is VS1

PS C:\Windows\system32> Get-NetAdapter

Name                      InterfaceDescription                    ifIndex Status       MacAddress             LinkSpeed
----                      --------------------                    ------- ------       ----------             ---------
vEthernet (VL3)           Hyper-V Virtual Ethernet Adapter #3          10 Up           94-18-82-00-E0-52         1 Gbps
vEthernet (VL2)           Hyper-V Virtual Ethernet Adapter #2          25 Up           94-18-82-00-E0-50         1 Gbps
vEthernet (VL1)           Hyper-V Virtual Ethernet Adapter             21 Up           94-18-82-00-E0-51         1 Gbps
Embedded FlexibleLOM ...1 HPE FlexFabric 10Gb 2-port 533FLR-...#2      18 Not Present  EC-B1-D7-B2-E4-F8          0 bps
Embedded FlexibleLOM ...2 HPE FlexFabric 10Gb 2-port 533FLR-T ...       3 Not Present  EC-B1-D7-B2-E4-FC          0 bps
Heartbeat                 HPE Ethernet 1Gb 4-port 331i Adapter #2       5 Up           94-18-82-00-E0-53         1 Gbps
LAN3                      HPE Ethernet 1Gb 4-port 331i Adapter #4      16 Up           94-18-82-00-E0-52         1 Gbps
LAN2                      HPE Ethernet 1Gb 4-port 331i Adapter          2 Up           94-18-82-00-E0-51         1 Gbps
LAN1                      HPE Ethernet 1Gb 4-port 331i Adapter #3       6 Up           94-18-82-00-E0-50         1 Gbps


PS C:\Windows\system32> Get-NetLbfoTeam
PS C:\Windows\system32> Get-VMSwitch

Name SwitchType NetAdapterInterfaceDescription
---- ---------- ------------------------------
VL1  External   HPE Ethernet 1Gb 4-port 331i Adapter
VL2  External   HPE Ethernet 1Gb 4-port 331i Adapter #3
VL3  External   HPE Ethernet 1Gb 4-port 331i Adapter #4

PS C:\Windows\system32> Get-VM | FL
 
Name             : removed
State            : Running
CpuUsage         : 0
MemoryAssigned   : 8401190912
MemoryDemand     : 756023296
MemoryStatus     :
Uptime           : 17:59:46.6940000
Status           : Operating normally
ReplicationState : Disabled
Generation       : 2

Name             : removed
State            : Running
CpuUsage         : 0
MemoryAssigned   : 33554432000
MemoryDemand     : 1342177280
MemoryStatus     :
Uptime           : 17:59:44.2010000
Status           : Operating normally
ReplicationState : Disabled
Generation       : 2

This is VS2

PS C:\Windows\system32> Get-NetAdapter

Name                      InterfaceDescription                    ifIndex Status       MacAddress             LinkSpeed
----                      --------------------                    ------- ------       ----------             ---------
vEthernet (VL3)           Hyper-V Virtual Ethernet Adapter #3          19 Up           1C-98-EC-17-45-DC         1 Gbps
vEthernet (VL2)           Hyper-V Virtual Ethernet Adapter #2          20 Up           1C-98-EC-17-45-DE         1 Gbps
vEthernet (VL1)           Hyper-V Virtual Ethernet Adapter             25 Up           1C-98-EC-17-45-DD         1 Gbps
Heartbeat                 HPE Ethernet 1Gb 4-port 331i Adapter #3      10 Up           1C-98-EC-17-45-DF         1 Gbps
LAN3                      HPE Ethernet 1Gb 4-port 331i Adapter #2       8 Up           1C-98-EC-17-45-DE         1 Gbps
LAN2                      HPE Ethernet 1Gb 4-port 331i Adapter         17 Up           1C-98-EC-17-45-DD         1 Gbps
LAN1                      HPE Ethernet 1Gb 4-port 331i Adapter #4      21 Up           1C-98-EC-17-45-DC         1 Gbps
Embedded FlexibleLOM ...2 HPE FlexFabric 10Gb 2-port 533FLR-T ...      11 Not Present  EC-B1-D7-AF-D3-04          0 bps
Embedded FlexibleLOM ...1 HPE FlexFabric 10Gb 2-port 533FLR-...#2       7 Not Present  EC-B1-D7-AF-D3-00          0 bps


PS C:\Windows\system32> Get-NetLbfoTeam
PS C:\Windows\system32> Get-VMSwitch

Name SwitchType NetAdapterInterfaceDescription
---- ---------- ------------------------------
VL1  External   HPE Ethernet 1Gb 4-port 331i Adapter
VL2  External   HPE Ethernet 1Gb 4-port 331i Adapter #2
VL3  External   HPE Ethernet 1Gb 4-port 331i Adapter #4


PS C:\Windows\system32> Get-VM | FL


Name             :removed
State            : Running
CpuUsage         : 0
MemoryAssigned   : 8401190912
MemoryDemand     : 756023296
MemoryStatus     :
Uptime           : 18:01:15.2630000
Status           : Operating normally
ReplicationState : Disabled
Generation       : 2

Name             : removed
State            : Running
CpuUsage         : 0
MemoryAssigned   : 33554432000
MemoryDemand     : 1342177280
MemoryStatus     :
Uptime           : 18:01:13.0440000
Status           : Operating normally
ReplicationState : Disabled
Generation       : 2

Name             : removed
State            : Running
CpuUsage         : 0
MemoryAssigned   : 4294967296
MemoryDemand     : 772800512
MemoryStatus     :
Uptime           : 18:01:12.5600000
Status           : Operating normally
ReplicationState : Disabled
Generation       : 2
 
Philip Elder (Technical Architect - HA/Compute/Storage) commented:
To be blunt that's a mess. KISS is best especially in a cluster setting.

The 10GbE ports are not plugged in?

NOTE: DISABLE Virtual Machine Queues at the _port_ level in the Broadcom driver for all Broadcom Gigabit ports! Do this before running the below steps.
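
For reference, a minimal sketch of checking and disabling VMQ from PowerShell, assuming the adapter names LAN1, LAN2, and LAN3 shown in the Get-NetAdapter output earlier in the thread. It uses the Windows-level Disable-NetAdapterVmq cmdlet, which may not be identical to the port-level setting in the Broadcom driver's advanced properties that the note refers to, so treat it as a starting point rather than a guaranteed equivalent.

```powershell
# Show the current VMQ state for every adapter that supports it.
Get-NetAdapterVmq | Format-Table Name, InterfaceDescription, Enabled

# Disable VMQ on the Gigabit ports.
# The adapter names are an assumption taken from the Get-NetAdapter output above.
Disable-NetAdapterVmq -Name 'LAN1', 'LAN2', 'LAN3'
```

Running Get-NetAdapterVmq again afterwards should show whether the change took effect on each port.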

Here's my suggestion to be done on both nodes:
1: Remove all vSwitches from the VMs
2: Remove all vSwitches from the hosts
3: Plug in the 10GbE ports even if it's into a Gigabit switch
4: New-NetLbfoTeam -Name Management -TeamMembers LAN1,"Embedded*1" -Confirm:$False
5: New-NetLbfoTeam -Name vSwitch -TeamMembers * -Confirm:$False
6: New-VMSwitch -Name vSwitch -NetAdapterName "vSwitch" -AllowManagementOS $false
7: Add the new vSwitch to the VMs on both hosts.

My Hyper-V Hardware and Software Best Practices article has a lot more on how to configure the hosts.

Oh, and there is no need for a heartbeat network.

If there is no 10GbE switch present then another option is to use a good CAT6 grade patch cable to direct connect them. Make sure the low MAC ports are connected together and the high MAC ports same. Assign them a subnet and use them for Live Migration. If this is done, then do _not_ include them in a team as indicated above.
 
Michael Jackson (President, Author) commented:
Philip Elder,

I have plenty of flight time on Hyper V and HP servers with Broadcom adapters and the VMQ.

No, the 10GbE ports are not attached because I have no 10GbE infrastructure, 1 Gb only. The environment is not large, and the four on-board 1 Gb adapters will suffice.

Thanks for sharing your preferred method of vSwitch configuration. I did not think mine was a mess; it's just different from yours.

Having said that, I entertained your configuration and the results are the same:

The VMs and hosts gain a DHCP address from a server outside of the cluster.  I've tried static addresses as well, which eventually the systems will have once this is figured out.

The VMs can ping and access everything on the network and internet as expected except the other VMs in the cluster within the vSwitch.

No other device on the network can ping nor access the VMs including the hosts.

It's likely I will open an MS case, as the only thing different from my previous 10 virtual server deployments is that this one is the first on Windows 2016.
 
Philip Elder (Technical Architect - HA/Compute/Storage) commented:
No worries. Having a single NIC port as a virtual switch is a single point of failure. It's best to team at least two ports so that redundancy is built-in.

Check the switch setup. It sounds like something is blocking at the switch. Perhaps an errant LAG setting or VLAN setup that has not been trunked correctly?
 
Michael Jackson (President, Author) commented:
Thanks again.

We don't appear to be having switch issues. We have two different brands of switches: the two older ones are HP ProCurve, the newer ones are Cisco Small Business. There is no complexity in the network configuration: no VLANs, only native VLAN 1. No errors are registering. We tried connecting to both switches.

I'm telling you, it's elusive.  Please assume I have done all of the "normal" troubleshooting, and think instead of the absurd reason this might be happening.  LOL
 
Philip Elder (Technical Architect - HA/Compute/Storage) commented:
Any network QoS rules set?
Windows Firewall on the inbound VMs set to Log and the log checked for dropped packets when an attempt is made to connect to them?
Network Gateway Virtualization enabled and blocking inbound packets?
Windows Azure Pack NVGRE in play?
Virtual Machine Manager in play with broken rules?
 
Michael Jackson (President, Author) commented:
F_I_R_E_W_A_L_L

When this first happened, I assumed maybe the HOST firewall was blocking, so I shut it down.  To no avail.

Well, I just went in and dropped the firewall on the VMs and now all is working.

So this is a behavior of VMs running Windows 2016 that I did not expect: the Windows 2016 firewall apparently blocks ALL inbound traffic until I do something about it. That is, the standard Windows exceptions are not enabled by default as they are in every single other version of Windows, so ICMP, file sharing, and I imagine everything else is blocked.
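
As an alternative to dropping the firewall entirely, here is a rough sketch of enabling the built-in inbound exceptions on a VM from an elevated PowerShell prompt. The rule group names are the English display names and are an assumption about the install language; localized systems use different names. This was not part of the original fix, which simply turned the firewall off.

```powershell
# Are the firewall profiles enabled, and what is the default inbound action?
Get-NetFirewallProfile | Format-Table Name, Enabled, DefaultInboundAction

# List the File and Printer Sharing rules and whether each one is enabled.
Get-NetFirewallRule -DisplayGroup 'File and Printer Sharing' |
    Format-Table DisplayName, Enabled, Profile, Direction

# Enable the built-in rule groups instead of turning the firewall off.
# 'File and Printer Sharing' covers SMB shares and ICMP echo requests (ping).
Enable-NetFirewallRule -DisplayGroup 'File and Printer Sharing'
Enable-NetFirewallRule -DisplayGroup 'Remote Desktop'
```

Enabling the Remote Desktop rule group only opens the firewall; Remote Desktop itself still has to be turned on in the VM's system settings.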

Thanks for all your input.

I'm just going to go sit in the corner and shake my bowed head in my hands for a few hours.... mumbling incoherently under my breath.

Thanks to Philip Elder on this and an unrelated aspect, I am going to leave the Team in place and try it out.
 
Philip Elder (Technical Architect - HA/Compute/Storage) commented:
Okay, suggestion 1:

For all affected VMs run the following in an elevated CMD:

sc config "NlaSvc"  start= delayed-auto

That sets the Network Location Awareness to a delayed start. It sometimes runs its network polls before the network stack has finished initializing which results in the firewall profile set to PUBLIC. That means full lockdown. I suggest doing the same for the host(s). We do.
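
To see whether this is what happened on a given VM, a quick check along these lines may help. It is only a sketch: the interface alias 'Ethernet' is a placeholder, and the last command applies only to machines that are not domain-joined, since the DomainAuthenticated category cannot be set by hand.

```powershell
# Confirm the configured start type of the Network Location Awareness service.
# (sc.exe, not sc: in Windows PowerShell, sc is an alias for Set-Content.)
sc.exe qc NlaSvc

# Show which network category each connection was assigned.
# Public is the locked-down profile described above.
Get-NetConnectionProfile | Format-Table InterfaceAlias, NetworkCategory

# On a non-domain machine, a connection misdetected as Public can be switched to Private.
# 'Ethernet' is a hypothetical alias; use the one reported by the previous command.
Set-NetConnectionProfile -InterfaceAlias 'Ethernet' -NetworkCategory Private
```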

Suggestion 2:

Set up a Group Policy Object that sets the Windows Firewall profiles to ENABLED, new outbound protocol Pop-Up to ENABLED, and Logging to ENABLED. A quick glance at the Windows Firewall Log would give an immediate "Ah, this is the culprit" or "Nope, not here. Need to look elsewhere".
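
Short of a full GPO, the logging part can be tried locally on a single VM. The sketch below assumes the default log location and that no Group Policy is already overriding the local firewall settings; the path is the documented default, not something confirmed in this thread.

```powershell
# Log dropped packets for all three profiles (local setting; a GPO would override it).
Set-NetFirewallProfile -Profile Domain, Private, Public -LogBlocked True

# Confirm where the log is written and that logging of blocked packets is on.
Get-NetFirewallProfile | Format-Table Name, LogFileName, LogBlocked

# After a failed connection attempt, show the most recent dropped packets.
# The path below assumes the default log file name was not changed.
$log = Join-Path $env:SystemRoot 'System32\LogFiles\Firewall\pfirewall.log'
Select-String -Path $log -Pattern ' DROP ' | Select-Object -Last 20
```

If the ping or RDP attempts from NONVM1 show up as DROP entries, the firewall is the culprit; if nothing is logged, the packets are being lost before they reach the VM.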

Disabling the Windows Firewall places it in a form of Limp Mode. It never gets turned off.