Eric_Price
asked on
Access Denied error when attempting to validate a 2012 R2 HyperV Server cluster
I am unable to get my freshly installed servers to validate for the purposes of creating a cluster in Hyper-V 2012 R2 Server.
They are 3 HP Gen9 DL385 servers with fresh installs of 2012 R2 Hyper-V Server (i.e., NOT the regular 2012 R2 Standard or Datacenter), installed from the same media and having identical hardware. No patches have been installed. I got this same error when attempting to validate the cluster with 2012 R2 Datacenter with the GUI loaded and all available patches installed.
When I run the validation wizard from Failover Cluster Manager, I get an error on the "List software updates" section. It also fails in a couple of other places, but always with the same error, which is:
An error occurred while executing the test.
An error occurred while getting information about the software updates installed on the nodes.
One or more errors occurred.
Creating an instance of the COM component with CLSID {4142DD5D-3472-4370-8641-DE7856431FB0} from the IClassFactory failed due to the following error: 80070005 Access is denied. (Exception from HRESULT: 0x80070005 (E_ACCESSDENIED)).
The full report is attached to this question as a text file, with only a find/replace for my domain name.
Failover-Cluster-Validation-Report.txt
The servers all joined the domain correctly, and all indications are that they can ping the internet, the domain controllers, each other, the WSUS server, and anything else I'd care to try. A few of the NICs are teamed and trunked, but a dedicated management NIC is in place.
I found some stuff online about the error, but it appears to apply to situations where machines are cloned, which is not the case here.
I've tried variations on the wizard, including testing only 2 machines at a time (in each of the various combinations), but the result is always the same. I've tried turning off the firewalls. The machines are currently in a Hyper-V server OU in Active Directory. Group policy is blocked for that OU. The only two group policies currently in place are 1) setting the list of accounts which have local admin rights on the hyper-v servers and 2) disabling any reference to the intranet WSUS server (put in place as a troubleshooting step only; normally the group policy here sets the intranet WSUS server).
So, thoughts on what this might be, or how I might troubleshoot this further?
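For anyone reading the error code above: 0x80070005 is a standard HRESULT, and splitting it into its bit fields shows that it is just the Win32 "access denied" error (code 5) wrapped in the Win32 facility (7). A minimal Python sketch of that decoding follows; the function name `decode_hresult` is made up for illustration, and the sketch only explains the number, it does not diagnose which permission is missing.

```python
FACILITY_WIN32 = 7        # facility value used for wrapped Win32 errors
ERROR_ACCESS_DENIED = 5   # Win32 error code for "Access is denied"


def decode_hresult(hresult: int) -> dict:
    """Split a 32-bit HRESULT into failure flag, facility, and code."""
    value = hresult & 0xFFFFFFFF
    return {
        "failure": bool(value >> 31),       # bit 31 set means failure
        "facility": (value >> 16) & 0x7FF,  # bits 16-26
        "code": value & 0xFFFF,             # bits 0-15
    }


fields = decode_hresult(0x80070005)
print(fields)
print(fields["facility"] == FACILITY_WIN32, fields["code"] == ERROR_ACCESS_DENIED)
```

So the validation test is reaching the remote COM component and being refused, which points at permissions rather than at basic connectivity.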
"1) setting the list of accounts which have local admin rights on the hyper-v servers"
That is *LIKELY* your issue. The validation wizard will remotely check all nodes and thus needs local admin privileges on the other machines as well. That means that there are various service groups that get granted various permissions to properly remotely scan. But if you are forcibly changing those permissions via group policy, you may be missing some important required groups and so the validation wizard is failing.
ASKER
I am logged in with a DA (Domain Admin) account. The problem existed even when the machines were still in the Computers container. Yes, regrettably my predecessor saw fit to change the Default Domain Policy. No, I don't think there is anything in there that would be the cause of this.
Still, I'll create the new OU and block inheritance and give it a whirl. Stay tuned.
ASKER
OK, I am running the Failover Cluster Manager tool from a fully patched 2012 R2 Standard guest VM on another 2012 R2 Hyper-V Server host. The guest is in the 192.168.0.0/22 subnet. DNS and AD are in that same subnet.
New hosts are in 192.168.158.0/24 subnet. No ACLs on switch. Full pingability. No troubles getting new hosts to join domain. No reason to suspect a network problem.
New hosts have one 2-port 10 Gb NIC and one 4-port 1 Gb NIC. The 10 Gb NIC is teamed with LACP on the switch. 2 of the 4 1 Gb ports are teamed with LACP on the switch.
Virtual external switches for two VLANs (LAN and DMZ) are set up and configured NOT to share the interface with the host.
As a result, each host has just two IPs AT THE MOMENT: one for management (port 3 of 4 on the embedded NIC) on the 192.168.158.0/24 network, with the core switch as its default gateway, and one for heartbeat (port 4 of 4 on the embedded NIC) on the 192.168.159.0/24 network (no default gateway).
As a side note, these hosts are all configured with a FC HBA and loaded with the MPIO feature to connect to a 3PAR 7200 series storage unit.
The 3 hosts are now in their own OU with inheritance blocked and no group policies applied. Error remains. Descending into madness.
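As a quick sanity check on the addressing described above, the three subnets can be compared with Python's standard `ipaddress` module. This is only a sketch of the layout as stated in the post; the variable names are made up for illustration. It shows that the guest running the wizard (192.168.0.0/22) and the hosts' management network (192.168.158.0/24) do not overlap, so validation traffic between them has to be routed rather than staying on one segment.

```python
import ipaddress

# Subnets as described in the post (names are illustrative only).
guest_net = ipaddress.ip_network("192.168.0.0/22")        # guest VM, DNS, AD
mgmt_net = ipaddress.ip_network("192.168.158.0/24")       # host management
heartbeat_net = ipaddress.ip_network("192.168.159.0/24")  # host heartbeat


def needs_routing(a, b) -> bool:
    """Two networks that do not overlap can only talk through a router."""
    return not a.overlaps(b)


print("Guest range:", guest_net[0], "to", guest_net[-1])
print("Guest to management routed:", needs_routing(guest_net, mgmt_net))
print("Management to heartbeat routed:", needs_routing(mgmt_net, heartbeat_net))
```

Since the heartbeat network has no default gateway, it is reachable only from hosts attached to it directly, which matches the description above.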
ASKER
Also, I found this article by adjusting my Google search terms a bit:
https://techontip.wordpress.com/2011/05/08/cluster-validation-error-an-error-occurred-while-executing-the-test-there-was-an-error-initializing-the-network-tests-there-was-an-error-creating-the-server-side-agent-cprepsrv-creating-an-inst/
While it MAY work, I feel like it a) shouldn't be necessary and b) doesn't fully explain the effects of what you are accepting in making this change. As a guy who does a fair bit of security, I get nervous when people want to impersonate things...
I also found this
http://capitalhead.com/articles/failover-cluster-validation-error-80070005-on-windows-server-2008-r2-x64.aspx
It suggested this may be a COM permission issue that needs to be reset, but it didn't say a) why it would be that way or (more importantly) b) how one would actually go about resetting it. He fixed his problem by adding an old 2003 DC. As I have a 2012 R2 functional-level AD now, that isn't an option (and wouldn't be one I would accept anyway).
ASKER
Hmmm, the plot thickens.
I went back and reran the cluster validation wizard against an existing 2012 R2 cluster I have in production, and it now fails with the same error. Something has changed. The move to the 2012 R2 AD schema? The disabling of NetBIOS over TCP/IP? Who knows. Time to start looking back through the change log to see if there are any OBVIOUS culprits.
Seems pretty clear though the issue is NOT with the current hardware / software config. *sigh*
Thoughts?
FULL GUI installed? Run Process Monitor/Explorer to figure out where specifically the problem lies.
ASKER
At the moment I am doing this with the 2012 R2 Hyper-V Server (standalone), which has no GUI to speak of. FWIW, I got this same error when I had 2012 R2 Datacenter loaded on these with the same setup.
ASKER
So there's something tightened too tightly here. Even GPResult fails with a permission-related error. I'm closing this ticket and going back to do some basic troubleshooting.
Thanks for your thoughts.
Has the Default Domain Policy been modified? Are you logged in with a proper domain administrator account?