EricDaRed


Failover cluster on remote Exchange 2010 DAG node fails often

Alright here we go.

We have an Exchange 2010 environment with four servers: a CAS, two local Mailbox servers, and a remote Mailbox server.

The three mailbox servers are in a DAG. One of the two local mailbox servers had a mailbox database copy, but that's in a failed and suspended state. Long story short, each server has its own database that it uses for storage. The Failover Clustering node at the remote site keeps failing about once a week. I think it's from latency and other miscellaneous network problems.

My first option is to juggle the servers to get that remote server back online, restarting services all over until the remote node finally comes up. There is no rhyme or reason to when it starts, other than that the Exchange and cluster services have been restarted, but that usually restores it.

My second option is to use "Manage Database Availability Group" and remove this remote server from the DAG. (Also, can I do this while the node is down?)

What would you recommend I do in this situation? There are no database copies to speak of so that shouldn't be an issue.

Thank you
ASKER CERTIFIED SOLUTION
Systech Admin
EricDaRed

ASKER

Thanks for the feedback. I had already found most of what you suggested; I was hoping someone had experience with this particular issue. I believe the quorum is configured in a majority mode, and it does use a witness server. Snooping through the log files, I am finding errors. I suspect that the latency between the two sites gets too high on some nights (backups and such) and causes a problem that is only really resolved by restarting all of the clustering and Exchange services on the upstream servers. I also suspect that these servers have not been patched in a long time and that there are other issues there.
In your case it seems that you have a second Active Directory site with the same namespace. If so, make sure that Datacenter Activation Coordination (DAC) mode is set to DagOnly, because you may be running into a split-brain condition. For more information on how DAC works, review the link below:

http://exchangeserverpro.com/datacenter-activation-coordination-mode/

To configure DAC, run the following command in the Exchange Management Shell:

Set-DatabaseAvailabilityGroup -Identity "DAG-Name" -DatacenterActivationMode DagOnly
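
After running the command, it may be worth confirming that the setting took effect and looking at the cluster heartbeat settings, since latency between the sites is the suspected cause. The following is only a sketch: "DAG-Name" and "RemoteMBX" are placeholder names, and the FailoverClusters module is assumed to be available, which is only the case when the DAG members run Windows Server 2008 R2. All of these commands are read-only.

```powershell
# Run in the Exchange Management Shell on a DAG member.
# "DAG-Name" and "RemoteMBX" are placeholders, not names from this thread.

# Confirm the DAC mode, the witness server, and which members are currently operational.
Get-DatabaseAvailabilityGroup -Identity "DAG-Name" -Status |
    Format-List Name, Servers, DatacenterActivationMode, WitnessServer, OperationalServers

# Check cluster, quorum, and replication health as seen from the remote node.
Test-ReplicationHealth -Identity "RemoteMBX"

# Show the cluster heartbeat delay and threshold values (same-subnet and cross-subnet).
# Assumes Windows Server 2008 R2, where the FailoverClusters module exists.
Import-Module FailoverClusters
Get-Cluster | Format-List *Subnet*
```

If the cross-subnet values turn out to be at their defaults and the WAN link saturates during backups, that would be consistent with the remote node being dropped from the cluster, but the cluster log is what would confirm it.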