UCS_Staff
asked on
Microsoft Cluster Validation: Failed to connect to service manager on 'ServerName'
I'm trying to add a node to a SQL Server failover cluster, but the installer reports that the MSCS Cluster Service verification report has failures. When I try to validate the cluster from the second (new) node, it says "Failed to connect to the service manager on 'node1'" (the full error message from Event Viewer is below). The validation works perfectly when run on node 1.
Log Name: Microsoft-Windows-FailoverClustering-Manager/Admin
Source: Microsoft-Windows-FailoverClustering-Manager
Date: 11/6/2013 4:46:04 PM
Event ID: 4681
Task Category: Failover Clusters Manager MMC Snapin
Level: Error
Keywords:
User: AREA52\$svc.tyfq.cairs
Computer: 52TYFQ-DB-CRD2P.area52.afnoapps.usaf.mil
Description:
Failover Cluster Manager could not contact node '52TYFQ-DB-CRD1P'.
System.ApplicationException: Failed to open the event session. ---> System.ComponentModel.Win32Exception: The RPC server is unavailable
--- End of inner exception stack trace ---
Server stack trace:
at MS.Internal.ServerClusters.EventLogSession.OpenSession(String server, Int32 timeout)
at MS.Internal.ServerClusters.EventLogSession..ctor(String serverName)
at MS.Internal.ServerClusters.Management.EventLogQuerySet.<>c__DisplayClass3.<QueryWorker>b__0()
at System.Runtime.Remoting.Messaging.StackBuilderSink._PrivateProcessMessage(IntPtr md, Object[] args, Object server, Int32 methodPtr, Boolean fExecuteInContext, Object[]& outArgs)
at System.Runtime.Remoting.Messaging.StackBuilderSink.AsyncProcessMessage(IMessage msg, IMessageSink replySink)
Exception rethrown at [0]:
at System.Runtime.Remoting.Proxies.RealProxy.EndInvokeHelper(Message reqMsg, Boolean bProxyCase)
at System.Runtime.Remoting.Proxies.RemotingProxy.Invoke(Object NotUsed, MessageData& msgData)
at MS.Internal.ServerClusters.Management.EventLogQuerySet.AsyncCallDelegate`1.EndInvoke(IAsyncResult result)
at MS.Internal.ServerClusters.Management.EventLogQuerySet.ExecuteAsyncCall[T](AsyncCallDelegate`1 asyncCall)
at MS.Internal.ServerClusters.Management.EventLogQuerySet.QueryWorker(Object a)
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="Microsoft-Windows-FailoverClustering-Manager" Guid="{11B3C6B7-E06F-4191-BBB9-7099FFF55614}" />
<EventID>4681</EventID>
<Version>0</Version>
<Level>2</Level>
<Task>1</Task>
<Opcode>0</Opcode>
<Keywords>0x8000000000000000</Keywords>
<TimeCreated SystemTime="2013-11-06T15:46:04.469528900Z" />
<EventRecordID>12</EventRecordID>
<Correlation />
<Execution ProcessID="3260" ThreadID="2360" />
<Channel>Microsoft-Windows-FailoverClustering-Manager/Admin</Channel>
<Computer>52TYFQ-DB-CRD2P.area52.afnoapps.usaf.mil</Computer>
<Security UserID="S-1-5-21-1271409858-1095883707-2794662393-6733646" />
</System>
<EventData>
<Data Name="Parameter1">52TYFQ-DB-CRD1P</Data>
<Data Name="Parameter2">System.ApplicationException: Failed to open the event session. ---&gt; System.ComponentModel.Win32Exception: The RPC server is unavailable
--- End of inner exception stack trace ---
Server stack trace:
at MS.Internal.ServerClusters.EventLogSession.OpenSession(String server, Int32 timeout)
at MS.Internal.ServerClusters.EventLogSession..ctor(String serverName)
at MS.Internal.ServerClusters.Management.EventLogQuerySet.&lt;&gt;c__DisplayClass3.&lt;QueryWorker&gt;b__0()
at System.Runtime.Remoting.Messaging.StackBuilderSink._PrivateProcessMessage(IntPtr md, Object[] args, Object server, Int32 methodPtr, Boolean fExecuteInContext, Object[]&amp; outArgs)
at System.Runtime.Remoting.Messaging.StackBuilderSink.AsyncProcessMessage(IMessage msg, IMessageSink replySink)
Exception rethrown at [0]:
at System.Runtime.Remoting.Proxies.RealProxy.EndInvokeHelper(Message reqMsg, Boolean bProxyCase)
at System.Runtime.Remoting.Proxies.RemotingProxy.Invoke(Object NotUsed, MessageData&amp; msgData)
at MS.Internal.ServerClusters.Management.EventLogQuerySet.AsyncCallDelegate`1.EndInvoke(IAsyncResult result)
at MS.Internal.ServerClusters.Management.EventLogQuerySet.ExecuteAsyncCall[T](AsyncCallDelegate`1 asyncCall)
at MS.Internal.ServerClusters.Management.EventLogQuerySet.QueryWorker(Object a)</Data>
</EventData>
</Event>
Here is some additional info:
Both servers are identical with respect to hardware configuration
Both servers use a 2-NIC 802.3ad (LACP) team and sit on the same network switch, with no network or personal firewalls between them
The cluster was built successfully using these two servers on a test/staging network and domain before being rebuilt on the production network/domain
Both servers have been joined to the 2-node cluster
The SQL Server cluster was installed/created on node 1 successfully
All Active Directory objects were pre-built in the production domain before the servers were added to the production domain and the cluster was created
The same domain user account is being used on both servers, and it has the same rights on the local machines as well as on the OU where the server and cluster objects are in AD
Any suggestions? What can I do?
Is Windows Firewall enabled (especially on the cluster network)? If so, disable it. From each server, can you connect to the \\servername\admin$ share? I presume this is a LAN cluster using networks in the same subnet.
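A quick way to test basic reachability from the new node is to try opening TCP connections to the ports these checks depend on. The following is only a rough sketch in Python, assuming Python is available on the node; the function name check_ports and the host name "node1" are placeholders, not anything from your environment. It tries port 135 (the RPC endpoint mapper) and port 445 (SMB, which the admin$ share uses). An open port 135 does not prove that the dynamic RPC ports are reachable as well, so treat a pass as necessary rather than sufficient.

```python
import socket

# Ports chosen for illustration: 135 is the RPC endpoint mapper,
# 445 is SMB (used when connecting to an administrative share such as admin$).
DEFAULT_PORTS = (135, 445)


def check_ports(host, ports=DEFAULT_PORTS, timeout=3.0):
    """Return {port: True/False} for whether a TCP connection could be opened."""
    results = {}
    for port in ports:
        try:
            # create_connection resolves the name and attempts a TCP connect.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers refused connections, timeouts and name-resolution failures.
            results[port] = False
    return results
```

For example, running check_ports("node1") on the second node (with the real node name substituted) returns a dictionary showing which of the two ports accepted a connection. Running it in both directions helps show whether the problem is one-way.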
ASKER
Radweld,
Windows Firewall is disabled. I'll have my colleagues check to see if we can connect to the share you specify. I do know that we can connect to \\servername\c$.
I'm interested in how they were rebuilt: were they sysprepped or fully reinstalled? Did you run winrm quickconfig to quickly configure Windows Remote Management?
ASKER
I fully re-installed them. I did not run WINRM Quickconfig. I've never heard of that.
ASKER
Radweld,
I'm no longer at the site where the servers are located. I will be able to go back there the first week of Dec.
However, I will try to get my colleagues who are there to follow the steps you provided.
Thanks.
Also make sure IPv6 is still enabled (technically, bound to the adapters).
ASKER
Why does IPv6 need to be enabled? Since there is no IPv6 DHCP server on the network, we've had issues caused by the additional traffic from machines searching for an IPv6 DHCP server.
Windows Server 2008 R2 and later use IPv6 by default; the servers will prioritise IPv6 over IPv4 unless you have hacked the registry to promote IPv4. Usually people's botched efforts to disable IPv6 result in issues like this. I've seen similar issues when configuring Exchange 2010 DAGs.
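To see which address families a node name resolves to from a given server, a small check along these lines may help. It is a sketch in Python, assuming Python is installed on the server; the function name resolved_addresses is made up for illustration. It lists addresses in the order the operating system's resolver returns them, which usually reflects the system's preference between IPv6 and IPv4.

```python
import socket


def resolved_addresses(host):
    """Return [(family_label, address), ...] in the order the resolver returns them."""
    seen = []
    for family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(host, None):
        if family == socket.AF_INET6:
            label = "IPv6"
        elif family == socket.AF_INET:
            label = "IPv4"
        else:
            continue  # ignore any other address families
        entry = (label, sockaddr[0])
        if entry not in seen:  # getaddrinfo can repeat an address once per socket type
            seen.append(entry)
    return seen
```

Calling resolved_addresses with the other node's name on each server, and comparing the two results, shows whether one side sees only IPv4 addresses while the other is offered IPv6 first.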
ASKER
So, all I need to do is re-enable the IPv6 protocol on the network adapter(s)?
I'm not saying that re-binding IPv6 to the adapters will resolve your issue, but having it unbound might be contributing. IPv6 should never be unbound unless you have a pretty good reason.
http://blogs.technet.com/b/jlosey/archive/2011/02/02/why-you-should-leave-ipv6-alone.aspx
ASKER CERTIFIED SOLUTION
This solution is only available to members.
ASKER
Re-imaging a machine is usually a last resort option.
ASKER
What else can I try?