Solved

Exchange 2007 Cluster - Outage Causing "Split Brain" - Lost Emails

Posted on 2012-03-28
Last Modified: 2012-03-28
We have a two-node Exchange 2007 cluster (CCR) that experienced an uncontrolled outage on one of the nodes today (at 1pm). We have five Information Stores, and four of them came up on the passive node with no problem. The fifth appeared to work at first, but then we realized that it was missing emails from the last week.

We then found that Node B (originally passive, now active) showed that Information Store as "Initializing", and that its logs had not been replicated from Node A for over a week.

We dismounted the broken Information Store, suspended replication, and moved it back to Node A (using the -IgnoreDismount switch). That worked, but we then had all emails from before the outage and none from between 1pm and 5pm, when we took the server down again for maintenance. We then realized that Node A was creating new log files whose names conflicted with the log files on Node B. At this point I think the two copies of the Information Store have diverged into a "split brain" state. We have a backup and snapshot of the .edb and log files from both Node A and Node B, taken before we started troubleshooting, so we can roll back.


Are there any options? Is the best option to get Node A working with data up to 1pm, re-seed Node B, then restore the backup of Node B's .edb and log files into a DR environment and export the mail changed since 1pm to a PST (to give to the users)? Are there better options? Could we use a Recovery Storage Group?
Question by:rsp_it
2 Comments
 

Author Comment

by:rsp_it
ID: 37776213
We don't think an eseutil /r will work on Node A, since its log file names are the same as those on Node B. It feels like a split brain, and our only option is to choose one version of the DB and then recover as much data as we can from between 1pm and 5pm.
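
For anyone in a similar spot, one way to confirm that two nodes really hold same-named but different log files is to compare the two log folders by content. Below is a rough Python sketch, not an Exchange tool: the function names, the "*.log" pattern, and the folder layout are all assumptions for illustration, and it only compares raw bytes (it does not read ESE log headers). Run it against the copies from the backup or snapshot, never the live folders.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with Path(path).open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_log_sets(dir_a, dir_b, pattern="*.log"):
    """Classify log files found in two folders (hypothetical helper).

    Returns a dict with four sorted lists of file names:
      only_a      - present only in dir_a
      only_b      - present only in dir_b
      identical   - same name and same content in both
      conflicting - same name but different content (the divergence)
    """
    logs_a = {p.name: p for p in Path(dir_a).glob(pattern)}
    logs_b = {p.name: p for p in Path(dir_b).glob(pattern)}

    result = {
        "only_a": sorted(logs_a.keys() - logs_b.keys()),
        "only_b": sorted(logs_b.keys() - logs_a.keys()),
        "identical": [],
        "conflicting": [],
    }
    # Names present on both sides: compare content hashes.
    for name in sorted(logs_a.keys() & logs_b.keys()):
        same = file_digest(logs_a[name]) == file_digest(logs_b[name])
        result["identical" if same else "conflicting"].append(name)
    return result
```

Calling compare_log_sets on the Node A and Node B backup folders and looking at the "conflicting" list would show where the two log streams split; any name in that list exists on both nodes with different contents, so neither node's logs can simply be replayed against the other's database.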
 

Accepted Solution

by:rsp_it
ID: 37780198
We ended up resolving this ourselves by moving back to Node A, which lost the mail from between 1pm and 5pm. We then mounted Node B's version of the database in a Recovery Storage Group and used the "merge" functionality to bring the emails from between 1pm and 5pm back.