Solved

Microsoft Cluster Validation: Failed to connect to service manager on 'ServerName'

Posted on 2013-11-06
Last Modified: 2014-03-02
I'm trying to add a node to a SQL Server failover cluster, but the installer reports that the MSCS Cluster Service verification report has failures.  When I try to validate the cluster from the 2nd (new) node, it says "Failed to connect to the service manager on 'node1'" (the full error message from Event Viewer is below).  The validation works perfectly when run on node 1.

Here is some additional info:
Both servers are identical with respect to hardware configuration
Both servers are using a 2 NIC 802.3ad (aka LACP) team and are sitting on the same network switch with no network or personal firewalls in between them
The cluster was built successfully using these 2 servers on a test/staging network and domain before being rebuilt on the production network/domain
Both servers have been joined to the 2 node cluster
The SQL Server cluster was installed/created on node 1 successfully
All active directory objects were pre-built in the production domain before the servers were added to the production domain and the cluster created
The same domain user account is being used on both servers, and it has the same rights on the local machines as well as to the OU where the servers and cluster objects are in AD


Any suggestions?  What can I do?

Log Name:      Microsoft-Windows-FailoverClustering-Manager/Admin
Source:        Microsoft-Windows-FailoverClustering-Manager
Date:          11/6/2013 4:46:04 PM
Event ID:      4681
Task Category: Failover Clusters Manager MMC Snapin
Level:         Error
Keywords:      
User:          AREA52\$svc.tyfq.cairs
Computer:      52TYFQ-DB-CRD2P.area52.afnoapps.usaf.mil
Description:
Failover Cluster Manager could not contact node '52TYFQ-DB-CRD1P'.

System.ApplicationException: Failed to open the event session. ---> System.ComponentModel.Win32Exception: The RPC server is unavailable
   --- End of inner exception stack trace ---

Server stack trace:
   at MS.Internal.ServerClusters.EventLogSession.OpenSession(String server, Int32 timeout)
   at MS.Internal.ServerClusters.EventLogSession..ctor(String serverName)
   at MS.Internal.ServerClusters.Management.EventLogQuerySet.<>c__DisplayClass3.<QueryWorker>b__0()
   at System.Runtime.Remoting.Messaging.StackBuilderSink._PrivateProcessMessage(IntPtr md, Object[] args, Object server, Int32 methodPtr, Boolean fExecuteInContext, Object[]& outArgs)
   at System.Runtime.Remoting.Messaging.StackBuilderSink.AsyncProcessMessage(IMessage msg, IMessageSink replySink)

Exception rethrown at [0]:
   at System.Runtime.Remoting.Proxies.RealProxy.EndInvokeHelper(Message reqMsg, Boolean bProxyCase)
   at System.Runtime.Remoting.Proxies.RemotingProxy.Invoke(Object NotUsed, MessageData& msgData)
   at MS.Internal.ServerClusters.Management.EventLogQuerySet.AsyncCallDelegate`1.EndInvoke(IAsyncResult result)
   at MS.Internal.ServerClusters.Management.EventLogQuerySet.ExecuteAsyncCall[T](AsyncCallDelegate`1 asyncCall)
   at MS.Internal.ServerClusters.Management.EventLogQuerySet.QueryWorker(Object a)
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Microsoft-Windows-FailoverClustering-Manager" Guid="{11B3C6B7-E06F-4191-BBB9-7099FFF55614}" />
    <EventID>4681</EventID>
    <Version>0</Version>
    <Level>2</Level>
    <Task>1</Task>
    <Opcode>0</Opcode>
    <Keywords>0x8000000000000000</Keywords>
    <TimeCreated SystemTime="2013-11-06T15:46:04.469528900Z" />
    <EventRecordID>12</EventRecordID>
    <Correlation />
    <Execution ProcessID="3260" ThreadID="2360" />
    <Channel>Microsoft-Windows-FailoverClustering-Manager/Admin</Channel>
    <Computer>52TYFQ-DB-CRD2P.area52.afnoapps.usaf.mil</Computer>
    <Security UserID="S-1-5-21-1271409858-1095883707-2794662393-6733646" />
  </System>
  <EventData>
    <Data Name="Parameter1">52TYFQ-DB-CRD1P</Data>
    <Data Name="Parameter2">System.ApplicationException: Failed to open the event session. ---&gt; System.ComponentModel.Win32Exception: The RPC server is unavailable
   --- End of inner exception stack trace ---

Server stack trace:
   at MS.Internal.ServerClusters.EventLogSession.OpenSession(String server, Int32 timeout)
   at MS.Internal.ServerClusters.EventLogSession..ctor(String serverName)
   at MS.Internal.ServerClusters.Management.EventLogQuerySet.&lt;&gt;c__DisplayClass3.&lt;QueryWorker&gt;b__0()
   at System.Runtime.Remoting.Messaging.StackBuilderSink._PrivateProcessMessage(IntPtr md, Object[] args, Object server, Int32 methodPtr, Boolean fExecuteInContext, Object[]&amp; outArgs)
   at System.Runtime.Remoting.Messaging.StackBuilderSink.AsyncProcessMessage(IMessage msg, IMessageSink replySink)

Exception rethrown at [0]:
   at System.Runtime.Remoting.Proxies.RealProxy.EndInvokeHelper(Message reqMsg, Boolean bProxyCase)
   at System.Runtime.Remoting.Proxies.RemotingProxy.Invoke(Object NotUsed, MessageData&amp; msgData)
   at MS.Internal.ServerClusters.Management.EventLogQuerySet.AsyncCallDelegate`1.EndInvoke(IAsyncResult result)
   at MS.Internal.ServerClusters.Management.EventLogQuerySet.ExecuteAsyncCall[T](AsyncCallDelegate`1 asyncCall)
   at MS.Internal.ServerClusters.Management.EventLogQuerySet.QueryWorker(Object a)</Data>
  </EventData>
</Event>
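
For anyone skimming a long event like the one above, here is a minimal Python sketch that pulls out the fields that matter (event ID, reporting computer, the node that could not be contacted, and the first line of the exception). The function name summarize_event and the trimmed SAMPLE are made up for illustration; the namespace URI and the Parameter1/Parameter2 names are taken from the Event Xml in this post. It only parses XML text that has already been copied out of Event Viewer.

```python
import xml.etree.ElementTree as ET

# Namespace used by the Event Xml pasted in the question.
EVENT_NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}


def summarize_event(event_xml):
    """Return the key fields of a Failover Cluster Manager event as a dict.

    Hypothetical helper for illustration; it assumes the XML has the same
    shape as the event shown in this thread.
    """
    root = ET.fromstring(event_xml)
    system = root.find("e:System", EVENT_NS)
    # Map each <Data Name="..."> element to its text content.
    data = {
        d.get("Name"): (d.text or "")
        for d in root.iterfind("e:EventData/e:Data", EVENT_NS)
    }
    detail = data.get("Parameter2", "")
    return {
        "event_id": int(system.findtext("e:EventID", namespaces=EVENT_NS)),
        "reporting_computer": system.findtext("e:Computer", namespaces=EVENT_NS),
        "unreachable_node": data.get("Parameter1"),
        "first_error_line": detail.splitlines()[0] if detail else "",
    }


# Trimmed copy of the event from the question, used as sample input.
SAMPLE = """<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <EventID>4681</EventID>
    <Computer>52TYFQ-DB-CRD2P.area52.afnoapps.usaf.mil</Computer>
  </System>
  <EventData>
    <Data Name="Parameter1">52TYFQ-DB-CRD1P</Data>
    <Data Name="Parameter2">System.ApplicationException: Failed to open the event session. ---&gt; System.ComponentModel.Win32Exception: The RPC server is unavailable
   --- End of inner exception stack trace ---</Data>
  </EventData>
</Event>"""

summary = summarize_event(SAMPLE)
```

The useful part of the event is the inner exception: the snap-in on node 2 could not open a remote event log session on node 1 because the RPC call failed.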
Question by:Uniqueinc
16 Comments
 

Author Comment

by:Uniqueinc
ID: 39633352
I've tried the following with no success:

Remove the LACP link aggregation
Rename the server that the service cannot be connected on

What else can I try?
 
LVL 14

Expert Comment

by:Radweld
ID: 39638238
Is Windows Firewall enabled (especially on the cluster network)? If so, disable it. From each server, can you connect to the \\servername\admin$ share? I presume this is a LAN cluster using networks in the same subnet.
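
A quick way to run this kind of reachability check from the new node is a small Python sketch like the one below. It is only a rough first test under stated assumptions: it tries a TCP connection to port 135 (the RPC endpoint mapper) and port 445 (SMB, which the admin$ and c$ shares use). The function name check_ports is made up for illustration, and 'node1' in the usage comment is the placeholder name from the question. An open port 135 does not prove that RPC works end to end, because RPC then moves to a dynamic port (49152-65535 by default on Windows Server 2008 and later), which can still be blocked.

```python
import socket

# Well-known ports: 135 = RPC endpoint mapper, 445 = SMB (admin$ / c$ shares).
DEFAULT_PORTS = {135: "RPC endpoint mapper", 445: "SMB"}


def check_ports(host, ports=DEFAULT_PORTS, timeout=3.0):
    """Return {port: True/False} depending on whether a TCP connect succeeds.

    Hypothetical helper for illustration. A failed name lookup, a refused
    connection and a timeout are all reported as False.
    """
    results = {}
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            results[port] = False
    return results


# Example usage, run on node 2 against the node that cannot be contacted:
#     for port, is_open in check_ports("node1").items():
#         print(port, DEFAULT_PORTS[port], "open" if is_open else "unreachable")
```

If 445 is reachable but 135 is not, that would be consistent with c$ working while the validation fails with "The RPC server is unavailable".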
 

Author Comment

by:Uniqueinc
ID: 39638461
Radweld,

Windows Firewall is disabled.  I'll have my colleagues check to see if we can connect to the share you specify.  I do know that we can connect to \\servername\c$.
 
LVL 14

Expert Comment

by:Radweld
ID: 39638518
I'm interested in how they were rebuilt: were they sysprepped or fully re-installed? Did you run winrm quickconfig to quickly configure Windows Remote Management?
 

Author Comment

by:Uniqueinc
ID: 39638542
I fully re-installed them.  I did not run WINRM Quickconfig.  I've never heard of that.
 
LVL 14

Expert Comment

by:Radweld
ID: 39638553
 

Author Comment

by:Uniqueinc
ID: 39645610
Radweld,

I'm no longer at the site where the servers are located.  I will be able to go back there the first week of Dec.

However, I will try to get my colleagues who are there to follow the steps you provided.

Thanks.
 
LVL 14

Expert Comment

by:Radweld
ID: 39647293
Also make sure IPV6 is still enabled (technically bound)
 

Author Comment

by:Uniqueinc
ID: 39647679
Why does IPV6 need to be enabled?  Since there is no IPv6 DHCP server on the network we've had issues caused by the additional traffic from machines searching for an IPv6 DHCP server.
 
LVL 14

Expert Comment

by:Radweld
ID: 39647830
Windows 2008 R2 and later use IPv6 by default; the servers will prioritise IPv6 over IPv4 unless you have hacked the registry to promote IPv4. Usually people's botched efforts to disable IPv6 result in issues like this; I've seen similar issues configuring Exchange 2010 DAGs.
 

Author Comment

by:Uniqueinc
ID: 39647839
So, is all I need to do re-enable the IPv6 protocol on the network adapter(s)?
 
LVL 14

Expert Comment

by:Radweld
ID: 39647848
I'm not saying that IPv6 being unbound from the adapters is the cause of your issue, but it might be contributing. IPv6 should never be unbound unless you have a pretty good reason.

http://blogs.technet.com/b/jlosey/archive/2011/02/02/why-you-should-leave-ipv6-alone.aspx
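
To see whether someone has already "hacked the registry" on a node, the setting usually involved is the DisabledComponents value under HKLM\SYSTEM\CurrentControlSet\Services\Tcpip6\Parameters. That key is not named anywhere in this thread; it is the one described in Microsoft's IPv6 configuration guidance, so treat it as an assumption here. The Python sketch below decodes only the values I'm sure of, and the function names are made up for illustration. Note that it does not detect IPv6 being unticked (unbound) in an adapter's properties, which is a separate per-adapter setting.

```python
# Registry location assumed from Microsoft's IPv6 guidance, not from this thread.
TCPIP6_PARAMETERS_KEY = r"SYSTEM\CurrentControlSet\Services\Tcpip6\Parameters"

# Individual DisabledComponents bits this sketch knows how to explain.
FLAGS = {
    0x01: "IPv6 disabled on all tunnel interfaces",
    0x10: "IPv6 disabled on all non-tunnel interfaces",
    0x20: "IPv4 preferred over IPv6",
}


def describe_disabled_components(value):
    """Translate a DisabledComponents value into plain-English notes."""
    value &= 0xFF  # only the low byte is considered here
    if value == 0:
        return ["IPv6 enabled with default behaviour"]
    if value == 0xFF:
        return ["IPv6 disabled"]
    notes = [text for bit, text in FLAGS.items() if value & bit]
    return notes or ["value 0x%02X not covered by this sketch" % value]


def read_disabled_components():
    """Read the value from the local registry (Windows only).

    Returns 0 when the value does not exist, which is the default state.
    """
    import winreg  # imported here so the rest of the sketch runs anywhere

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, TCPIP6_PARAMETERS_KEY) as key:
        try:
            value, _value_type = winreg.QueryValueEx(key, "DisabledComponents")
        except FileNotFoundError:
            return 0
    return value


# Example usage on a cluster node:
#     print(describe_disabled_components(read_disabled_components()))
```

Comparing the result on both nodes is a cheap way to spot a node where IPv6 was disabled differently from the other one.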
 

Accepted Solution

by:
Uniqueinc earned 0 total points
ID: 39885671
I ended up re-imaging the server and making sure all the Windows updates were installed.  That seemed to fix the issue.
 

Author Closing Comment

by:Uniqueinc
ID: 39898325
Re-imaging a machine is usually a last resort option.