rstuemke

Cannot Add 2nd Exchange Server to EX 2010 Database Availability Group

Hello,

I have successfully created DAG1 on EX2010 SP3 running on Windows Server 2012. The 1st EX server (EX01) was added with no problem, but I cannot add a 2nd server. I tried two EX servers (EX02 and EX03) running the same level of software, and they continually fail. I deleted DAG1, recreated it, and tried again, with the same results. So I created a new, different DAG2. As its 1st server I added one of the servers (EX02) that had failed when I tried to add it to DAG1. It added OK as the 1st. Then I tried to add the 2nd EX server (EX03) and it failed the same way. (The EX names are simplified, not the real server names.)

>>>>>>>>>>>>>>>>>>>>>>>Here is where I added the 1st server.

[PS] C:\>add-databaseavailabilitygroupserver -mailboxserver 1730w107048721 -identity LocalDAG


That worked fine.  No Errors.   Shows as member of DAG, network ok, etc.

>>>>>>>>>>>>>>>>>>>>Here is trying to add the 2nd server:

[PS] C:\>add-databaseavailabilitygroupserver -mailboxserver 1730w50rxbx1 -identity LocalDAG
WARNING: The operation wasn't successful because an error was encountered. You may find more details in log file
"C:\ExchangeSetupLogs\DagTasks\dagtask_2013-06-26_18-12-56.308_add-databaseavailabiltygroupserver.log".
A database availability group administrative operation failed.
Error: The operation failed.
CreateCluster errors may result from incorrectly configured static addresses.
Error: An error occurred while attempting a cluster operation.
Error : Cluster API '"AddClusterNode() (MaxPercentage=100) failed with 0x5b4.
Error: This operation returned because the timeout period expired"' failed.
   [Server: 1730W107048721.calvaryspringfield.org]
    + CategoryInfo          : InvalidArgument: (:)
                 [Add-DatabaseAvailabilityGroupServer], DagTaskOperationFailedException
    + FullyQualifiedErrorId : 55801767,Microsoft.Exchange.Management.SystemConfigurationTasks.AddDatabase
AvailabilityGroupServer
    + PSComputerName        : 1730w107048721.calvaryspringfield.org
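
For readers unfamiliar with the hex codes in this output, here is a minimal sketch that maps them to their Win32 names. The lookup table is hand-written for illustration and covers only the two codes quoted in this thread (0x5b4 in the error above, 0x6d9 in the DAG log); the function name `describe` is hypothetical.

```python
# Win32 error codes quoted in this thread, with their symbolic names.
# The table is hand-written for illustration; it is not read from Windows.
WIN32_ERRORS = {
    0x5B4: ("ERROR_TIMEOUT",
            "This operation returned because the timeout period expired"),
    0x6D9: ("EPT_S_NOT_REGISTERED",
            "There are no more endpoints available from the endpoint mapper"),
}


def describe(code_text: str) -> str:
    """Turn a hex string such as '0x5b4' into 'hex = decimal NAME: message'."""
    code = int(code_text, 16)
    name, message = WIN32_ERRORS.get(code, ("UNKNOWN", "not in this table"))
    return f"{code_text} = {code} {name}: {message}"


if __name__ == "__main__":
    print(describe("0x5b4"))
    print(describe("0x6d9"))
```

The point is only that 0x5b4 is a generic timeout: the join did not get rejected, it stopped waiting.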



>>>>>>>>>>>>>>>Here is the DAG log info:

add-databaseavailabiltygroupserver started on machine 1730W107048721.
[2013-06-26T18:12:56] add-dagserver started
[2013-06-26T18:12:56] commandline:         $scriptCmd = {& $wrappedCmd @PSBoundParameters }
[2013-06-26T18:12:56] Option 'Identity' = 'LocalDAG'.
[2013-06-26T18:12:56] Option 'MailboxServer' = '1730w50rxbx1'.
[2013-06-26T18:12:56] Option 'DatabaseAvailabilityGroupIpAddresses' = ''.
[2013-06-26T18:12:56] Option 'WhatIf' = ''.
[2013-06-26T18:12:56] Process: w3wp w3wp.exe:2140.
[2013-06-26T18:12:56] User context = 'NT AUTHORITY\SYSTEM'.
[2013-06-26T18:12:56]   Member of group 'Everyone'.
[2013-06-26T18:12:56]   Member of group 'BUILTIN\Users'.
[2013-06-26T18:12:56]   Member of group 'NT AUTHORITY\SERVICE'.
[2013-06-26T18:12:56]   Member of group 'CONSOLE LOGON'.
[2013-06-26T18:12:56]   Member of group 'NT AUTHORITY\Authenticated Users'.
[2013-06-26T18:12:56]   Member of group 'NT AUTHORITY\This Organization'.
[2013-06-26T18:12:56]   Member of group 'BUILTIN\IIS_IUSRS'.
[2013-06-26T18:12:56]   Member of group 'LOCAL'.
[2013-06-26T18:12:56]   Member of group 'IIS APPPOOL\MSExchangePowerShellAppPool'.
[2013-06-26T18:12:56]   Member of group 'BUILTIN\Administrators'.
[2013-06-26T18:12:56] Updated Progress 'Validating the parameters.' 2%.
[2013-06-26T18:12:56] Working
[2013-06-26T18:12:57] Mailbox server: value passed in = 1730w50rxbx1, mailboxServer.Name = 1730W50RXBX1, mailboxServer.Fqdn = 1730W50RXBX1.calvaryspringfield.org
[2013-06-26T18:12:57] LogClussvcState: clussvc is Stopped on 1730W50RXBX1.calvaryspringfield.org. Exception (if any) = none
[2013-06-26T18:12:57] The IP addresses for the DAG are (blank means DHCP): 172.16.1.166
[2013-06-26T18:12:57] Looking up IP addresses for LocalDAG.
[2013-06-26T18:12:57]   LocalDAG = [ 172.16.1.166 ].
[2013-06-26T18:12:57] Looking up IP addresses for 1730w50rxbx1.
[2013-06-26T18:12:57]   1730w50rxbx1 = [ 172.16.1.66 ].
[2013-06-26T18:12:57] Looking up IP addresses for 1730W50RXBX1.calvaryspringfield.org.
[2013-06-26T18:12:57]   1730W50RXBX1.calvaryspringfield.org = [ 172.16.1.66 ].
[2013-06-26T18:12:57] DAG LocalDAG has 1 servers:
[2013-06-26T18:12:57] DAG LocalDAG contains server 1730W107048721.
[2013-06-26T18:12:57] Updated Progress 'Checking if Mailbox server '1730W50RXBX1' is in a database availability group.' 4%.
[2013-06-26T18:12:57] Working
[2013-06-26T18:12:57] GetRemoteCluster() for the mailbox server failed with exception = An Active Manager operation failed. Error An error occurred while attempting a cluster operation. Error: Cluster API '"OpenCluster(1730W50RXBX1.calvaryspringfield.org) failed with 0x6d9. Error: There are no more endpoints available from the endpoint mapper"' failed... This is OK.
[2013-06-26T18:12:57] Ignoring previous error, as it is acceptable if the cluster does not exist yet.
[2013-06-26T18:12:57] DumpClusterTopology: Opening remote cluster LocalDAG.
[2013-06-26T18:12:57] Dumping the cluster by connecting to: LocalDAG.
[2013-06-26T18:12:57] The cluster's name is: LocalDAG.
[2013-06-26T18:12:57] Groups
[2013-06-26T18:12:57]     group: Available Storage [not a CMS]
[2013-06-26T18:12:57]         OwnerNode: 1730W107048721.calvaryspringfield.org
[2013-06-26T18:12:57]         State: Offline
[2013-06-26T18:12:57]     group: Cluster Group [Cluster Main Group]
[2013-06-26T18:12:57]         OwnerNode: 1730W107048721.calvaryspringfield.org
[2013-06-26T18:12:57]         State: Online
[2013-06-26T18:12:57]             Resource: Cluster IP Address [Online, type = IP Address, PossibleOwners = 1730W107048721 ]
[2013-06-26T18:12:57]                 Address = [172.16.1.166]
[2013-06-26T18:12:57]                     EnableDhcp = [0]
[2013-06-26T18:12:57]                     Network = [Cluster Network 1]
[2013-06-26T18:12:57]             Resource: Cluster Name [Online, type = Network Name, PossibleOwners = 1730W107048721 ]
[2013-06-26T18:12:57]                 NetName = [LOCALDAG]
[2013-06-26T18:12:57] Nodes
[2013-06-26T18:12:57]     node: 1730W107048721.calvaryspringfield.org [ state = Up ]
[2013-06-26T18:12:57] Subnets
[2013-06-26T18:12:57]     Name(Cluster Network 1), Mask(172.16.1.0/24), Role(ClusterNetworkRoleInternalAndClient)
[2013-06-26T18:12:57]         NIC 172.16.1.135 on Node 1730W107048721 in State=Up
[2013-06-26T18:12:57]     Name(Cluster Network 2), Mask(172.16.2.0/24), Role(ClusterNetworkRoleInternalUse)
[2013-06-26T18:12:57]         NIC 172.16.2.135 on Node 1730W107048721 in State=Up
[2013-06-26T18:12:57] Opening the cluster on nodes [1730w107048721].
[2013-06-26T18:12:57] Other mailbox servers in the DAG are already members of cluster 'LocalDAG'
[2013-06-26T18:12:57] The server 1730W50RXBX1 does not belong to a cluster, and the other servers belong to LocalDAG.
[2013-06-26T18:12:57] Successfully resolved the servers based on the stopped servers list.
[2013-06-26T18:12:57] The following servers are in the StartedServers list (The list is the StartedServers property of the DAG in AD):
[2013-06-26T18:12:57] The following servers are in the StoppedServers list:
[2013-06-26T18:12:57] Verifiying that the members of database availability group 'LocalDAG' are also members of the cluster.
[2013-06-26T18:12:57] Verifying that the members of cluster 'LocalDAG' are also members of the database availability group.
[2013-06-26T18:12:57] According to GetNodeClusterState(), the server 1730W50RXBX1 is NotConfigured.
[2013-06-26T18:12:57] The CNO is currently Online.
[2013-06-26T18:12:57] InternalValidate() done.
[2013-06-26T18:12:57] Updated Progress 'Adding server '1730W50RXBX1' to database availability group 'LocalDAG'.' 6%.
[2013-06-26T18:12:57] Working
[2013-06-26T18:12:57] Updated Progress 'Adding server '1730W50RXBX1' to the cluster.' 8%.
[2013-06-26T18:12:57] Working
[2013-06-26T18:19:14] The following log entry comes from a different process that's running on machine '1730W107048721.calvaryspringfield.org'. BEGIN
[2013-06-26T18:19:14] [2013-06-26T18:12:57] Opening a local AmCluster handle.
[2013-06-26T18:12:57] Updated Progress 'Adding server '1730w50rxbx1' to database availability group 'LocalDAG'.' 2%.
[2013-06-26T18:12:57] Working
[2013-06-26T18:12:57] ClusterSetupProgressCallback( eSetupPhase = ClusterSetupPhaseValidateNodeState, ePhaseType = ClusterSetupPhaseStart, ePhaseSeverity = ClusterSetupPhaseInformational, dwPercentComplete = 12, szObjectName = 1730W50RXBX1, dwStatus = 0x0 )
[2013-06-26T18:12:57] ClusterSetupProgressCallback( eSetupPhase = ClusterSetupPhaseValidateNodeState, ePhaseType = ClusterSetupPhaseEnd, ePhaseSeverity = ClusterSetupPhaseInformational, dwPercentComplete = 12, szObjectName = , dwStatus = 0x0 )
[2013-06-26T18:12:57] ClusterSetupProgressCallback( eSetupPhase = ClusterSetupPhaseGettingCurrentMembership, ePhaseType = ClusterSetupPhaseStart, ePhaseSeverity = ClusterSetupPhaseInformational, dwPercentComplete = 25, szObjectName = LocalDAG, dwStatus = 0x0 )
[2013-06-26T18:12:57] ClusterSetupProgressCallback( eSetupPhase = ClusterSetupPhaseGettingCurrentMembership, ePhaseType = ClusterSetupPhaseEnd, ePhaseSeverity = ClusterSetupPhaseInformational, dwPercentComplete = 25, szObjectName = , dwStatus = 0x0 )
[2013-06-26T18:12:57] ClusterSetupProgressCallback( eSetupPhase = ClusterSetupPhaseAddNodeToCluster, ePhaseType = ClusterSetupPhaseStart, ePhaseSeverity = ClusterSetupPhaseInformational, dwPercentComplete = 37, szObjectName = 1730W50RXBX1, dwStatus = 0x0 )
[2013-06-26T18:12:57] ClusterSetupProgressCallback( eSetupPhase = ClusterSetupPhaseAddNodeToCluster, ePhaseType = ClusterSetupPhaseEnd, ePhaseSeverity = ClusterSetupPhaseInformational, dwPercentComplete = 37, szObjectName = , dwStatus = 0x0 )
[2013-06-26T18:12:57] ClusterSetupProgressCallback( eSetupPhase = ClusterSetupPhaseValidateNetft, ePhaseType = ClusterSetupPhaseStart, ePhaseSeverity = ClusterSetupPhaseInformational, dwPercentComplete = 50, szObjectName = 1730W50RXBX1, dwStatus = 0x0 )
[2013-06-26T18:12:57] ClusterSetupProgressCallback( eSetupPhase = ClusterSetupPhaseValidateNetft, ePhaseType = ClusterSetupPhaseEnd, ePhaseSeverity = ClusterSetupPhaseInformational, dwPercentComplete = 50, szObjectName = , dwStatus = 0x0 )
[2013-06-26T18:12:57] ClusterSetupProgressCallback( eSetupPhase = ClusterSetupPhaseValidateClusDisk, ePhaseType = ClusterSetupPhaseStart, ePhaseSeverity = ClusterSetupPhaseInformational, dwPercentComplete = 62, szObjectName = 1730W50RXBX1, dwStatus = 0x0 )
[2013-06-26T18:12:57] ClusterSetupProgressCallback( eSetupPhase = ClusterSetupPhaseValidateClusDisk, ePhaseType = ClusterSetupPhaseEnd, ePhaseSeverity = ClusterSetupPhaseInformational, dwPercentComplete = 62, szObjectName = , dwStatus = 0x0 )
[2013-06-26T18:12:57] ClusterSetupProgressCallback( eSetupPhase = ClusterSetupPhaseConfigureClusSvc, ePhaseType = ClusterSetupPhaseStart, ePhaseSeverity = ClusterSetupPhaseInformational, dwPercentComplete = 75, szObjectName = 1730W50RXBX1, dwStatus = 0x0 )
[2013-06-26T18:12:57] ClusterSetupProgressCallback( eSetupPhase = ClusterSetupPhaseConfigureClusSvc, ePhaseType = ClusterSetupPhaseEnd, ePhaseSeverity = ClusterSetupPhaseInformational, dwPercentComplete = 75, szObjectName = , dwStatus = 0x0 )
[2013-06-26T18:12:57] ClusterSetupProgressCallback( eSetupPhase = ClusterSetupPhaseStartingClusSvc, ePhaseType = ClusterSetupPhaseStart, ePhaseSeverity = ClusterSetupPhaseInformational, dwPercentComplete = 87, szObjectName = 1730W50RXBX1.calvaryspringfield.org, dwStatus = 0x0 )
[2013-06-26T18:12:57] ClusterSetupProgressCallback( eSetupPhase = ClusterSetupPhaseStartingClusSvc, ePhaseType = ClusterSetupPhaseEnd, ePhaseSeverity = ClusterSetupPhaseInformational, dwPercentComplete = 87, szObjectName = , dwStatus = 0x0 )
[2013-06-26T18:12:57] ClusterSetupProgressCallback( eSetupPhase = ClusterSetupPhaseNodeUp, ePhaseType = ClusterSetupPhaseStart, ePhaseSeverity = ClusterSetupPhaseInformational, dwPercentComplete = 100, szObjectName = 1730W50RXBX1, dwStatus = 0x0 )
[2013-06-26T18:15:57] ClusterSetupProgressCallback( eSetupPhase = ClusterSetupPhaseNodeUp, ePhaseType = ClusterSetupPhaseEnd, ePhaseSeverity = ClusterSetupPhaseFatal, dwPercentComplete = 100, szObjectName = 1730W50RXBX1, dwStatus = 0x5b4 )
[2013-06-26T18:15:57] ClusterSetupProgressCallback( eSetupPhase = ClusterSetupPhaseFailureCleanup, ePhaseType = ClusterSetupPhaseStart, ePhaseSeverity = ClusterSetupPhaseInformational, dwPercentComplete = 100, szObjectName = 1730W50RXBX1, dwStatus = 0x0 )
[2013-06-26T18:19:14] ClusterSetupProgressCallback( eSetupPhase = ClusterSetupPhaseFailureCleanup, ePhaseType = ClusterSetupPhaseEnd, ePhaseSeverity = ClusterSetupPhaseInformational, dwPercentComplete = 100, szObjectName = , dwStatus = 0x0 )

[2013-06-26T18:19:14] The preceding log entry comes from a different process running on computer '1730W107048721.calvaryspringfield.org'. END
[2013-06-26T18:19:14] The operation wasn't successful because an error was encountered. You may find more details in log file "C:\ExchangeSetupLogs\DagTasks\dagtask_2013-06-26_18-12-56.308_add-databaseavailabiltygroupserver.log".
[2013-06-26T18:19:14] WriteError! Exception = Microsoft.Exchange.Cluster.Replay.DagTaskOperationFailedException: A database availability group administrative operation failed. Error: The operation failed. CreateCluster errors may result from incorrectly configured static addresses. Error: An error occurred while attempting a cluster operation. Error: Cluster API '"AddClusterNode() (MaxPercentage=100) failed with 0x5b4. Error: This operation returned because the timeout period expired"' failed. ---> Microsoft.Exchange.Cluster.Replay.AmClusterApiException: An Active Manager operation failed. Error An error occurred while attempting a cluster operation. Error: Cluster API '"AddClusterNode() (MaxPercentage=100) failed with 0x5b4. Error: This operation returned because the timeout period expired"' failed.. ---> System.ComponentModel.Win32Exception: This operation returned because the timeout period expired
   --- End of inner exception stack trace ---
   at Microsoft.Exchange.Cluster.ClusApi.AmCluster.AddNodeToCluster(AmServerName nodeName, IClusterSetupProgress setupProgress, IntPtr context, Exception& errorException, Boolean throwExceptionOnFailure)
   at Microsoft.Exchange.Cluster.Replay.DagHelper.AddDagClusterNode(AmServerName mailboxServerName, String& verboseLog)
   --- End of inner exception stack trace (Microsoft.Exchange.Cluster.Replay.AmClusterApiException) ---
   at Microsoft.Exchange.Cluster.Replay.DagHelper.ThrowDagTaskOperationWrapper(Exception exception)
   at Microsoft.Exchange.Cluster.Replay.DagHelper.AddDagClusterNode(AmServerName mailboxServerName, String& verboseLog)
   at Microsoft.Exchange.Cluster.ReplayService.ReplayRpcServer.<>c__DisplayClass34.<RpcsAddNodeToCluster>b__33()
   at Microsoft.Exchange.Data.Storage.Cluster.HaRpcExceptionWrapperBase`2.RunRpcServerOperation(String databaseName, RpcServerOperation rpcOperation)
   --- End of stack trace on server (1730W107048721.calvaryspringfield.org) ---
   at Microsoft.Exchange.Data.Storage.Cluster.HaRpcExceptionWrapperBase`2.ClientRethrowIfFailed(String databaseName, String serverName, RpcErrorExceptionInfo errorInfo)
   at Microsoft.Exchange.Cluster.Replay.ReplayRpcClientWrapper.RunRpcOperationDbName(AmServerName serverName, String databaseName, Int32 timeoutMs, IHaRpcExceptionWrapper rpcExceptionWrapperInstance, InternalRpcOperation rpcOperation)
   at Microsoft.Exchange.Cluster.Replay.ReplayRpcClientWrapper.RunRpcOperation(AmServerName serverName, Nullable`1 dbGuid, Int32 timeoutMs, IHaRpcExceptionWrapper rpcExceptionWrapperInstance, InternalRpcOperation rpcOperation)
   at Microsoft.Exchange.Cluster.Replay.ReplayRpcClientWrapper.RunAddNodeToCluster(AmServerName serverName, AmServerName newNode, String& verboseLog)
   at Microsoft.Exchange.Management.SystemConfigurationTasks.AddDatabaseAvailabilityGroupServer.JoinNodeToCluster()
[2013-06-26T18:19:14] Updated Progress 'Done!' 100%.
[2013-06-26T18:19:14] COMPLETED
add-databaseavailabiltygroupserver explicitly called CloseTempLogFile().
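
A log this long is easier to read if the failing phase is pulled out automatically. The following is a rough sketch, not an Exchange tool: it scans `ClusterSetupProgressCallback` lines, remembers when each phase started, and reports any phase that ended with severity `ClusterSetupPhaseFatal`. The three sample lines are copied from the log above; the regular expression and the function name `fatal_phases` are assumptions made for this example.

```python
import re
from datetime import datetime

# Three callback lines copied verbatim from the DAG log in this thread.
SAMPLE = """\
[2013-06-26T18:12:57] ClusterSetupProgressCallback( eSetupPhase = ClusterSetupPhaseNodeUp, ePhaseType = ClusterSetupPhaseStart, ePhaseSeverity = ClusterSetupPhaseInformational, dwPercentComplete = 100, szObjectName = 1730W50RXBX1, dwStatus = 0x0 )
[2013-06-26T18:15:57] ClusterSetupProgressCallback( eSetupPhase = ClusterSetupPhaseNodeUp, ePhaseType = ClusterSetupPhaseEnd, ePhaseSeverity = ClusterSetupPhaseFatal, dwPercentComplete = 100, szObjectName = 1730W50RXBX1, dwStatus = 0x5b4 )
[2013-06-26T18:15:57] ClusterSetupProgressCallback( eSetupPhase = ClusterSetupPhaseFailureCleanup, ePhaseType = ClusterSetupPhaseStart, ePhaseSeverity = ClusterSetupPhaseInformational, dwPercentComplete = 100, szObjectName = 1730W50RXBX1, dwStatus = 0x0 )
"""

# Pattern written for the line layout seen in this log; other builds may differ.
CALLBACK = re.compile(
    r"^\[(?P<ts>[^\]]+)\] ClusterSetupProgressCallback\( "
    r"eSetupPhase = (?P<phase>\w+), ePhaseType = (?P<type>\w+), "
    r"ePhaseSeverity = (?P<sev>\w+), .*dwStatus = (?P<status>0x[0-9a-fA-F]+) \)",
    re.MULTILINE,
)


def fatal_phases(log_text):
    """Return (phase, status, seconds since that phase started) for fatal phases."""
    started = {}
    results = []
    for m in CALLBACK.finditer(log_text):
        ts = datetime.strptime(m["ts"], "%Y-%m-%dT%H:%M:%S")
        phase = m["phase"].replace("ClusterSetupPhase", "", 1)
        if m["type"] == "ClusterSetupPhaseStart":
            started[phase] = ts
        elif m["sev"] == "ClusterSetupPhaseFatal":
            waited = (ts - started[phase]).total_seconds() if phase in started else None
            results.append((phase, m["status"], waited))
    return results


if __name__ == "__main__":
    print(fatal_phases(SAMPLE))
```

On the sample lines this reports the NodeUp phase failing with 0x5b4 after 180 seconds, which matches the gap between 18:12:57 and 18:15:57 in the log: every earlier phase completed, and the join timed out waiting for the new node to come up in the cluster.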
rstuemke

ASKER

I have been researching this for a couple of days. I see many others with the same problem, but no real resolution. Thanks for any help you can give me.
Amit
"CreateCluster errors may result from incorrectly configured static addresses" means you have a subnet mismatch between your DAG member servers. Make sure all member servers use the same subnet.
Hi, is the Windows Server version the same on all the servers? And did you pre-stage the cluster name object?
Subnets are as follows:

EX01 - MAPI         172.16.1.141  mask 255.255.255.0
       Replication  172.16.2.141  mask 255.255.255.0  (no gateway, no DNS or DNS registration)

EX02 - MAPI         172.16.1.135  mask 255.255.255.0
       Replication  172.16.2.135  mask 255.255.255.0  (no gateway, no DNS or DNS registration)

EX03 - MAPI         172.16.1.66   mask 255.255.255.0
       Replication  172.16.2.66   mask 255.255.255.0  (no gateway, no DNS or DNS registration)

DAG1 (CTCCDAG)  IP 172.16.1.159
DAG2 (LocalDAG) IP 172.16.1.166

Computer objects CTCCDAG and LocalDAG were set up as per the instructions and disabled.

The DAG Networks are all up.  Failover Cluster Manager shows clusters UP and Networks UP.   Cannot find any problems with the setup or the IP addresses.
That is what is so frustrating.
Also, FSW is online.
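
The addresses listed above can be checked mechanically against the "same subnet" advice. This is a minimal sketch using Python's standard `ipaddress` module; the dictionaries simply restate the figures from this thread, and the helper names `network_of` and `check` are hypothetical. It only verifies the arithmetic (addresses and masks), not routing, firewalls, or cluster network settings.

```python
import ipaddress

# MAPI interfaces as listed in the thread: (address, mask).
# EX01..EX03 are the simplified server names used by the asker.
MAPI = {
    "EX01": ("172.16.1.141", "255.255.255.0"),
    "EX02": ("172.16.1.135", "255.255.255.0"),
    "EX03": ("172.16.1.66", "255.255.255.0"),
}

# Static DAG addresses as listed in the thread.
DAG_IPS = {"CTCCDAG": "172.16.1.159", "LocalDAG": "172.16.1.166"}


def network_of(address, mask):
    """Return the IP network that an address/mask pair belongs to."""
    return ipaddress.ip_interface(f"{address}/{mask}").network


def check(mapi, dag_ips):
    """Report whether all MAPI NICs share one subnet and each DAG IP falls in it."""
    networks = {name: network_of(a, m) for name, (a, m) in mapi.items()}
    same_subnet = len(set(networks.values())) == 1
    covered = {
        dag: any(ipaddress.ip_address(ip) in net for net in networks.values())
        for dag, ip in dag_ips.items()
    }
    return same_subnet, covered


if __name__ == "__main__":
    same, covered = check(MAPI, DAG_IPS)
    print("All MAPI NICs on one subnet:", same)
    for dag, ok in covered.items():
        print(f"{dag} address inside a MAPI subnet:", ok)
```

With the figures given, all three MAPI NICs sit in 172.16.1.0/24 and both DAG addresses fall inside it, which agrees with the asker's statement that no addressing problem is visible.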
ASKER CERTIFIED SOLUTION
Simon Butler (Sembee)

This solution is only available to members.
OK, Simon, will give that a try. Will update on Monday.
Simon,

I did what you suggested and IT WORKED!! THANK YOU!! I have a working DAG with 2 member servers. I could not get the 3rd server to join, but will work on that some more.

I disabled the replication networks, which seemed to be the key. A couple of quick follow-on questions:

1) Should I try to re-establish the replication network on a different subnet?
2) Or should I configure them as MAPI NICs and just use them as normal additional NICs?

Please advise. Thanks.
Depends on your traffic flow.
If you find the NIC is being heavily utilised then use separate NICs. However, if the utilisation is low then leave it.
It also depends on whether the replication network is completely separate. If it is going through the same switch then I don't see much point.

Simon.
Nice simple clear answer that worked.  Thanks.