Solved

Exchange 2010 mailbox server crashes when other MB server reboots

Posted on 2014-03-05
Medium Priority
2,736 Views
Last Modified: 2014-03-10
Issue: I have two servers in a DAG. When I move all active database copies to server B & reboot server A, all's fine.  When I move all active database copies to server A and reboot server B, all mailbox databases dismount.  They come back online as soon as Server B is back online.  

Environment:
- 2 Exchange 2010 SP3 mailbox servers in a DAG, 2 CAS/HT servers in NLB cluster
- Windows 2008 R2 Enterprise servers
- Running as VMs on two separate Windows 2012 Hyper-V Hosts
- Primary Witness Server is one CAS/HT server, Secondary Witness Server is the other CAS/HT server.

All health checks make it look like everything's in good working order (server health, replication, etc.)
---------------------------------------------
Errors:
Insight Manager (HP utility to monitor server health): [DAG] System is unreachable.
---------------------------------------------
CAS/HT server:

Warning 1022: MSExchange Transport
"The connection between the Client Access server and Mailbox server "[ServerB]" failed...

Microsoft.Exchange.Data.Storage.ConnectionFailedTransientException: Cannot open mailbox [mailboxname]. ---> Microsoft.Mapi.MapiExceptionLogonFailed: MapiExceptionLogonFailed: Unable to make connection to the server. (hr=0x80040111, ec=-2147221231)
Diagnostic context:"
---------------------------------------------
Critical Error 1016: MSExchange ActiveSync

Exchange ActiveSync has encountered repeated failures when it tries to access data on Mailbox server [ServerB]. It will temporarily stop making requests to the Mailbox server for [60] seconds to reduce load on that server. This delay may occur if the Mailbox server is overloaded. If this event is logged frequently, review the Application log on this server and the Mailbox server noted above for other events that could indicate the root cause of performance problems.
---------------------------------------------
Errors on ServerB:

Critical Error 4066: MSExchangeRepl

An error occurred while trying to write to the cluster database. Error: ClusterRegBatchClose failed with error 1726.

---------------------------------------------
Critical error 4082: MSExchangeRepl

The replication network manager encountered an error while monitoring events. Error: Microsoft.Exchange.Cluster.Replay.AmClusterApiException: An Active Manager operation failed. Error An error occurred while attempting a cluster operation. Error: Cluster API '"OpenCluster(ServerB) failed with 0x6d9. Error: There are no more endpoints available from the endpoint mapper"' failed.. ---> System.ComponentModel.Win32Exception: There are no more endpoints available from the endpoint mapper
   --- End of inner exception stack trace ---
   at Microsoft.Exchange.Cluster.Replay.NetworkManager.DriveMapRefresh()
   at Microsoft.Exchange.Cluster.Replay.NetworkManager.TryDriveMapRefresh()
---------------------------------------------

The DAG was created without issue, although it pre-existed on two physical servers.  We added ServerA to the DAG, retired a physical, added ServerB, retired 2nd physical.

The DAG has a static IP address which pings from both nodes.

Anyone have any ideas?  I'm quite concerned that if ServerA goes down I'm going to be dead in the water.
Question by:CHR3800
4 Comments
 
LVL 42

Expert Comment

by:Adam Brown
ID: 39907891
1. With two nodes, you should have only one witness server in the configuration. Having four voters results in an even number of votes, which can cause problems.
2. Before rebooting server B, make sure that all of the databases are in a healthy state. Run Get-MailboxDatabase | Get-MailboxDatabaseCopyStatus to view the status of all copies. If any database copy on the surviving server is in a state other than Healthy or Mounted, the database will enter a failed state when the server holding the healthy copy goes down.
3. Check Cluster services to make sure that each server has a vote in the quorum and that both servers are set as possible owners. This can also cause what you're seeing.
 

Accepted Solution

by:CHR3800
ID: 39908102
Thanks for the response.

My issue ended up being that “The Alternate Witness Server itself does not provide any redundancy for the Witness Server, and DAGs do not dynamically switch witness servers, nor do they automatically start using the Alternate Witness Server in the event of a problem with the Witness Server.”  

So, in the Organization Config I'd defined primary & alternate witness servers, believing that when the primary went down the alternate would take over.  Apparently it doesn't work that way.  So, because I have the primary witness server on the same VM host as one of the mailbox servers, there was no way to establish a quorum when I took both down to patch the host.  The solution for me will be to create a primary witness server on a server that's not part of the Exchange VMs in any way.
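
To make the quorum arithmetic concrete, here is a rough sketch in Python. It is not an Exchange or cluster API; the function names and the voter labels are made up for illustration. It assumes a two-node DAG where only the primary witness counts as the third vote (the alternate never votes, per the quote above) and that the cluster needs a strict majority of votes to stay up.

```python
# Hypothetical sketch of majority-quorum arithmetic for a two-node DAG
# with one file share witness. Names are illustrative, not Exchange APIs.

def has_quorum(total_votes: int, votes_online: int) -> bool:
    """Quorum holds only while a strict majority of all votes is reachable."""
    return votes_online > total_votes // 2


# Assumed voters: both mailbox servers plus the primary witness.
# The alternate witness is deliberately absent because it does not vote.
VOTERS = {"ServerA", "ServerB", "Witness"}


def quorum_after_outage(down: set) -> bool:
    """Return True if the remaining voters still form a majority."""
    online = VOTERS - down
    return has_quorum(len(VOTERS), len(online))


# ServerA alone goes down: ServerB and the witness still hold 2 of 3 votes.
print(quorum_after_outage({"ServerA"}))

# ServerB and the witness go down together (same VM host): 1 of 3 votes left.
print(quorum_after_outage({"ServerB", "Witness"}))
```

The first case keeps quorum and the second loses it, which matches what I saw: whenever the witness goes down along with a mailbox server, the survivor is left with one vote out of three and the databases dismount.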
 
LVL 42

Expert Comment

by:Adam Brown
ID: 39908200
"because I have the primary witness server on the same VM host as one of the mailbox servers" is something you should have mentioned, btw :D
 

Author Closing Comment

by:CHR3800
ID: 39917029
I'm accepting my own comment as the solution because it's the right one, which I'd found on my own before having it confirmed by another tech on another site. The one other response here wasn't helpful.
