Exchange 2013 DAG switching over for unknown reason

billFmurray
I'm having some issues with our DAG.  

I have 2 mailbox nodes in one site and 2 in another (DR). All are VMware VMs.

In the primary site ("site1"), I had 10 of our databases active on one mailbox server (site1mbx1) and 10 on the other (site1mbx2).

Everything seems to work, except that a few times I've found all 20 databases active on site1mbx1. I'm not sure what's causing the switchover.

I get these in the event log:

event 4401
Microsoft Exchange Server Locator Service failed to find active server for database '6b96ab02-8736-46dd-be89-7719c0e1077a'. Error: An Active Manager operation failed. Error: Invalid Active Manager configuration. Error: The Cluster service is not running.

event 2153
The log copier was unable to communicate with server 'site1MBX1.walkerdunlop.com'. The copy of database 'ARC01\site1EXMBX2' is in a disconnected state. The communication error was: An error occurred while communicating with server 'WDASHEXMBX1'. Error: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host. The copier will automatically retry after a short delay.

cluster event log
event 1135
Cluster node 'site1mbx1' was removed from the active failover cluster membership. The Cluster service on this node may have stopped. This could also be due to the node having lost communication with other active nodes in the failover cluster. Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapters on this node. Also check for failures in any other network components to which the node is

I'm not sure if these are cause or effect (it looks like the latter).


I ran Test-ReplicationHealth and all checks passed.

I also ran Get-DatabaseAvailabilityGroup and noticed it reports "WitnessShareInUse : None". Is that normal?

[PS] C:\Windows\system32>Get-DatabaseAvailabilityGroup -Identity DAG1 -Status | fl


RunspaceId                             : 604443bd-f55f-4b52-a232-c03b1e1d9e3b
Name                                   : DAG1
Servers                                : {SITE2EXMBX2, SITE2EXMBX1, SITE1EXMBX2, SITE1EXMBX1}
WitnessServer                          : site1excas2.CONTOSO.com
WitnessDirectory                       : c:\fsw
AlternateWitnessServer                 :
AlternateWitnessDirectory              :
NetworkCompression                     : InterSubnetOnly
NetworkEncryption                      : InterSubnetOnly
ManualDagNetworkConfiguration          : True
DatacenterActivationMode               : Off
StoppedMailboxServers                  : {}
StartedMailboxServers                  : {}
DatabaseAvailabilityGroupIpv4Addresses : {192.168.11.179, 192.168.1.179}
DatabaseAvailabilityGroupIpAddresses   : {192.168.11.179, 192.168.1.179}
AllowCrossSiteRpcClientAccess          : False
OperationalServers                     : {SITE1EXMBX1, SITE1EXMBX2, SITE2EXMBX1, SITE2EXMBX2}
PrimaryActiveManager                   : SITE1EXMBX1
ServersInMaintenance                   : {}
ServersInDeferredRecovery              : {}
ThirdPartyReplication                  : Disabled
ReplicationPort                        : 64327
NetworkNames                           : {MapiDagNetwork, ReplicationDagNetwork01, ReplicationDagNetwork02}
WitnessShareInUse                      : None
DatabaseAvailabilityGroupConfiguration :
AutoDagSchemaVersion                   : 1.0
AutoDagDatabaseCopiesPerDatabase       : 1
AutoDagDatabaseCopiesPerVolume         : 1
AutoDagTotalNumberOfDatabases          : 0
AutoDagTotalNumberOfServers            : 0
AutoDagDatabasesRootFolderPath         : C:\ExchangeDatabases
AutoDagVolumesRootFolderPath           : C:\ExchangeVolumes
AutoDagAllServersInstalled             : False
AutoDagAutoReseedEnabled               : True
AutoDagDiskReclaimerEnabled            : True
ReplayLagManagerEnabled                : False
AdminDisplayName                       :
ExchangeVersion                        : 0.10 (14.0.100.0)
DistinguishedName                      : CN=DAG1,CN=Database Availability Groups,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups,CN=CONTOSO,CN=Microsoft
                                         Exchange,CN=Services,CN=Configuration,DC=CONTOSO,DC=com
Identity                               : DAG1
Guid                                   : 08153f9b-a97d-412c-8ccc-aceba6136087
ObjectCategory                         : CONTOSO.com/Configuration/Schema/ms-Exch-MDB-Availability-Group
ObjectClass                            : {top, msExchMDBAvailabilityGroup}
WhenChanged                            : 4/10/2014 11:21:33 AM
WhenCreated                            : 3/28/2014 7:12:29 PM
WhenChangedUTC                         : 4/10/2014 3:21:33 PM
WhenCreatedUTC                         : 3/28/2014 11:12:29 PM
OrganizationId                         :
OriginatingServer                      : SITE1DC01.CONTOSO.com
IsValid                                : True
ObjectState                            : Unchanged

Any ideas?

Solution Architect
Most Valuable Expert 2014
Top Expert 2014
Commented:
To clarify, this is a single DAG with 4 mailbox servers, split across a WAN connection? What are the speed and latency on the link?

Which server hosts your witness share? And what site is it in?

The fact that it does not see a witness share is concerning. Without a witness, a DAG with an even number of servers has an even number of votes, so it cannot break a tie when the nodes split evenly.
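
To make the vote arithmetic concrete, here is a minimal PowerShell sketch. The function name `Get-QuorumMath` is hypothetical (it is not an Exchange or cluster cmdlet). It only does plain majority arithmetic, assuming one vote per DAG member plus one for the witness, and it ignores any dynamic quorum adjustments the cluster may make.

```powershell
# Hypothetical helper, not a built-in cmdlet: simple majority arithmetic.
# Assumption: each DAG member holds one vote and the witness (if used) adds one.
function Get-QuorumMath {
    param(
        [int]$Nodes,
        [switch]$Witness
    )
    $votes    = $Nodes + [int]$Witness.IsPresent
    # A majority is more than half of all votes.
    $majority = [math]::Floor($votes / 2) + 1
    [pscustomobject]@{
        Votes             = $votes
        Majority          = $majority
        TolerableFailures = $votes - $majority
    }
}

Get-QuorumMath -Nodes 4            # 4 votes, majority of 3
Get-QuorumMath -Nodes 4 -Witness   # 5 votes, majority of 3
```

With 4 votes a 2/2 split leaves neither side with a majority; the fifth vote from the witness is what lets one side win.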

Here are instructions on how to add a witness share if you don't have one:
http://exchangeserverpro.com/using-a-non-exchange-server-as-an-exchange-2013-dag-file-share-witness/
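
For reference, the witness can also be set and checked from the Exchange Management Shell. This is only a sketch: the DAG name, witness server, and directory are copied from the Get-DatabaseAvailabilityGroup output earlier in this thread and should be treated as placeholders for any other environment.

```powershell
# Sketch: (re)assign the file share witness for the DAG.
# DAG1, site1excas2.CONTOSO.com and C:\fsw come from the output posted in the
# question; substitute your own values.
Set-DatabaseAvailabilityGroup -Identity DAG1 `
    -WitnessServer site1excas2.CONTOSO.com `
    -WitnessDirectory C:\fsw

# Confirm what the DAG reports afterwards.
Get-DatabaseAvailabilityGroup -Identity DAG1 -Status |
    Format-List Name, WitnessServer, WitnessDirectory, WitnessShareInUse

# On a DAG member, ask the underlying failover cluster which quorum model it uses.
Import-Module FailoverClusters
Get-ClusterQuorum
```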

Author

Commented:
It's a pretty fast connection (100GB) and the sites are not too far apart geographically.

The witness share is in the primary site (site 1).

But even without a witness share, it shouldn't fail over automatically, should it?

Author

Commented:
And yes, it's one DAG.

Author

Commented:
It turned out to be a bug that was resolved in CU5.
Gareth Gudger, Solution Architect

Commented:
Glad you got it fixed. Not sure why I lost sight of this one. Pesky notifications system!
