serg2626

Recover from Failed Exchange 2010 Server with DAG

Hello,
I have two Exchange 2010 servers in a DAG, each in a different AD site. The DAG replicates from the primary location to the DR location normally. The primary location always holds the mounted database, and the DR copy is healthy.

My primary Exchange server failed completely and is unrecoverable, so I need to reinstall the OS and Exchange. I found procedures for re-adding the failed server by reinstalling the OS and then reinstalling Exchange with the recover command. My concern is this: when I bring it back up, will the DR site database that's active at the moment automatically copy its contents to the primary site Exchange server once it's up and I have completed the recover procedure? I will be bringing it back up with the same server name and IP.

I would really appreciate the help.

Thank you.
Manpreet SIngh Khatra
"Will the DR site database that's active at the moment automatically copy its contents to the primary site Exchange server once it's up and I complete the recover procedure?" No, this will not happen automatically.

Once the server is back up with the same details, you need to add it as a DAG member, then add a database copy on it and seed that copy for the first time.
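
Those three steps can be sketched in the Exchange Management Shell roughly as follows. `DAG1` is a hypothetical DAG name (the thread never states it); the server and database names are the ones that appear in the asker's status output later on. Treat this as an outline under those assumptions, not a tested runbook.

```powershell
# Assumptions: DAG1 is a placeholder DAG name. MAIL01 is the rebuilt server and
# the database name is taken from the status output later in the thread.
$dag    = 'DAG1'
$server = 'MAIL01'
$db     = 'Mailbox Database 0520255'

# 1. Join the rebuilt server to the DAG.
Add-DatabaseAvailabilityGroupServer -Identity $dag -MailboxServer $server

# 2. Create a passive copy on the rebuilt server. Seeding starts automatically
#    and always flows from the active copy to the new passive copy.
Add-MailboxDatabaseCopy -Identity $db -MailboxServer $server -ActivationPreference 2

# 3. Watch seeding and log replay progress.
Get-MailboxDatabaseCopyStatus -Identity "$db\$server" |
    Format-List Name, Status, CopyQueueLength, ReplayQueueLength, ContentIndexState
```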

-Rancy
serg2626

ASKER

Do you have a clear and easy procedure I can follow, from installing the Exchange server again through to making it work as you described, so that it is a DAG member again and receives the database copy, etc.?

Thank you
I can work on it, but explaining everything here would be very difficult, so I will share some articles instead. If you have any specific queries, I can answer them.

Adding a DAG member works just like the first time you did it, nothing different. And it is not really a database copy; it is database replication, or reseeding, that's what it's called.

Manage Database Availability Group Membership
http://technet.microsoft.com/en-us/library/dd351278.aspx

Configuring Exchange 2010 Database Availability Groups
http://howtoexchange.wordpress.com/2009/12/06/configuring-exchange-2010-database-availability-groups/

Configuring Database Availability Group In Exchange 2010… (Screenshots)
http://www.howexchangeworks.com/2009/07/configuring-database-availability-group.html


Uncovering Exchange 2010 Database Availability Groups (DAGs) (Part 1) (Total 4 Parts)
http://www.msexchange.org/articles_tutorials/exchange-server-2010/high-availability-recovery/uncovering-exchange-2010-database-availability-groups-dags-part1.html


Database Availability Group (DAG)
http://www.exchange-genie.com/2009/04/database-availability-group-dag-exchange-2010/

- Rancy
I appreciate the articles and the explanations, but at this point I just need a straight-through procedure for bringing the failed server back and functioning as it was. The articles will take me hours to go through.

For example, I'm going to follow these procedures:
http://exchangeserverpro.com/exchange-recovery-failed-dag-member-exchange-server-2010

I guess I'm just trying to make sure that these procedures will work in my situation as I described it above. I want to make sure that once I do this, the database will resync back to the Exchange server I'm going to recover. Or are there other procedures I need to follow to make this happen the way I need it?

I appreciate the help.

Thank you
Awesome!! You're the Google man :)

The article explains it well and covers almost everything that's needed. I think it should take care of the issue, and if everything goes smoothly you shouldn't face any problems.
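
For orientation, the overall sequence for rebuilding a lost DAG member can be sketched as below. This is my reading of the standard recover-server flow, not a copy of the linked article; `DAG1` is a hypothetical DAG name, and the server and database names come from the status output posted later in the thread.

```powershell
# Run steps 1-2 from a surviving Exchange server BEFORE rebuilding MAIL01.
# DAG1 is a placeholder; the thread does not give the real DAG name.
$dag    = 'DAG1'
$server = 'MAIL01'
$db     = 'Mailbox Database 0520255'

# 1. Remove the dead server's database copy from the configuration.
Remove-MailboxDatabaseCopy -Identity "$db\$server" -Confirm:$false

# 2. Remove the dead server from the DAG. -ConfigurationOnly is used because
#    the server is offline. The stale cluster node may also need evicting.
Remove-DatabaseAvailabilityGroupServer -Identity $dag -MailboxServer $server -ConfigurationOnly

# 3. Reset the MAIL01 computer account in Active Directory, rebuild the OS with
#    the same name, join the domain, then on the rebuilt server run Exchange
#    setup from media matching the original build:
#        Setup.com /m:RecoverServer

# 4. Put the server back into the DAG and give it a database copy again.
Add-DatabaseAvailabilityGroupServer -Identity $dag -MailboxServer $server
Add-MailboxDatabaseCopy -Identity $db -MailboxServer $server
```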

If anything comes up, though, I am here :)

- Rancy
lol.. ok, so if I follow that article, the database from the DR site exchange server should resync back to the recovered exchange server, correct? If so, do I have to wait for the copy queue to go down or something before I switch users back to the recovered mailbox server?

Thanks
Yes, it will resync, but it will take time depending on the DB size. If it doesn't, you can use the Suspend and then Update (reseed) commands available in the EMC.

You will have to wait until the status goes to Healthy :)
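
The shell equivalent of Suspend followed by Update, plus a simple wait loop, might look like this. It is a sketch that assumes the copy identity shown in the thread's status output; the five-minute polling interval is arbitrary.

```powershell
$copy = 'Mailbox Database 0520255\MAIL01'   # passive copy on the recovered server

# Manual reseed: suspend the passive copy, then reseed it from the active copy.
# -DeleteExistingFiles clears whatever is left in the passive copy's folders.
Suspend-MailboxDatabaseCopy -Identity $copy -SuspendComment 'Manual reseed' -Confirm:$false
Update-MailboxDatabaseCopy -Identity $copy -DeleteExistingFiles

# Poll until the copy reports Healthy, or stop early if it fails.
$done = 'Healthy', 'Failed', 'FailedAndSuspended'
do {
    Start-Sleep -Seconds 300
    $status = Get-MailboxDatabaseCopyStatus -Identity $copy
    '{0}  Status={1}  CopyQueue={2}  ReplayQueue={3}' -f (Get-Date), $status.Status,
        $status.CopyQueueLength, $status.ReplayQueueLength
} until ($done -contains "$($status.Status)")
```

The loop uses `-contains` rather than `-in` so that it also runs on the PowerShell 2.0 that Exchange 2010 ships with.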

- Rancy
Ok, great. I'm going to start recovering it today and i'll keep you posted on the outcome.

Thank you..
You're welcome, and I'll await your update :)
Rancy,
I have a question for you. When I add the recovered Exchange server to the DAG, will users lose connection or access to the active Exchange server at any point?

I was able to complete all of the procedures except for installing Exchange with the recover option, because I didn't have the correct build of the Exchange Server CD available, so I'll have to wait until Monday to do it. But I have this concern and wanted to make sure before I add it on Monday when all the users are connected.

Thank you again!
Hello,
I have a question. I added the recovered exchange server to the DAG and am ready to enable database copy, but have a question.

All our users are currently connected to the DR Exchange server, and that is the active one. When I look at its current activation preference, it is set to 1. If I add the recovered Exchange server as a database copy and set its activation preference to 1, will it start copying the blank database to the DR Exchange server? I'm afraid it might overwrite the good database if I do this. I need the recovered Exchange server to have a preference of 1; that's how it was before it crashed.

Thank you.
"When I add the recovered Exchange server to the DAG, will users lose connection or access to the active Exchange server at any point?" NO :)

For now, don't change any preferences until the database copies are synced on both servers and we are good :)
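
On the overwrite worry: replication always flows from the active copy to the passive copies, and the activation preference number only influences which copy is preferred for activation, not the seeding direction. Once every copy is healthy, restoring the old preference order could look like the sketch below, which assumes the two server names from the thread's status output (MAIL01 and mail03).

```powershell
$db = 'Mailbox Database 0520255'

# Only touch preferences once every copy of the database is in a good state
# (the active copy reports Mounted, passive copies report Healthy).
$copies    = @(Get-MailboxDatabaseCopyStatus -Identity $db)
$unhealthy = @($copies | Where-Object { 'Healthy', 'Mounted' -notcontains "$($_.Status)" })

if ($unhealthy.Count -eq 0) {
    # Recovered primary first, DR server second.
    Set-MailboxDatabaseCopy -Identity "$db\MAIL01" -ActivationPreference 1
    Set-MailboxDatabaseCopy -Identity "$db\mail03" -ActivationPreference 2
} else {
    # Show what is still catching up instead of changing anything.
    $unhealthy | Format-Table Name, Status, CopyQueueLength -AutoSize
}
```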

- Rancy
Hello Rancy,
We started the database copy using the command in the article and it began successfully, but when we arrived this morning there was an error message saying the copy was failed and suspended. Can you please help with what we should do next? I generated the output below using the Get-MailboxDatabaseCopyStatus | fl command.

[PS] C:\>Get-MailboxDatabaseCopyStatus | fl


RunspaceId                       : db88449c-8a8a-449d-bf34-9d9afbbd91bf
Identity                         : Mailbox Database 0520255\MAIL01
Name                             : Mailbox Database 0520255\MAIL01
DatabaseName                     : Mailbox Database 0520255
Status                           : FailedAndSuspended
MailboxServer                    : MAIL01
ActiveDatabaseCopy               : mail03
ActivationSuspended              : True
ActionInitiator                  : Service
ErrorMessage                     : The Microsoft Exchange Replication service encountered an error while inspecting the
                                    logs and database for Mailbox Database 0520255886\MAIL01 on startup. Error: File
                                    check failed : Database file 'D:\MAILBOXDB\Mailbox Database 0520255.edb' was not
                                    found.

ErrorEventId                     : 2070
ExtendedErrorInfo                :
SuspendComment                   : The database copy was automatically suspended due to failure item processing. At '8/
                                   14/2012 9:58:26 AM' the copy of 'Mailbox Database 0520255' on this server experie
                                   nced an error that requires it be reseeded. For more detail about this failure, cons
                                   ult the Event log on the server for other storage and "ExchangeStoreDb" events. The
                                   passive database copy has been suspended.

SinglePageRestore                : 0
ContentIndexState                : Failed
ContentIndexErrorMessage         : Catalog is dismounted externally for database {806ec30f-6a7c-4794-8e9a-f9b8e7c74651}
                                   .
CopyQueueLength                  : 20599
ReplayQueueLength                : 1
LatestAvailableLogTime           : 8/14/2012 9:53:40 AM
LastCopyNotificationedLogTime    : 8/14/2012 9:53:40 AM
LastCopiedLogTime                : 8/8/2012 12:36:57 PM
LastInspectedLogTime             : 8/8/2012 12:36:57 PM
LastReplayedLogTime              :
LastLogGenerated                 : 2585399
LastLogCopyNotified              : 2585223
LastLogCopied                    : 2564800
LastLogInspected                 : 2564800
LastLogReplayed                  : 2564799
LogsReplayedSinceInstanceStart   : 0
LogsCopiedSinceInstanceStart     : 0
LatestFullBackupTime             :
LatestIncrementalBackupTime      :
LatestDifferentialBackupTime     :
LatestCopyBackupTime             :
SnapshotBackup                   :
SnapshotLatestFullBackup         :
SnapshotLatestIncrementalBackup  :
SnapshotLatestDifferentialBackup :
SnapshotLatestCopyBackup         :
LogReplayQueueIncreasing         : False
LogCopyQueueIncreasing           : False
OutstandingDumpsterRequests      : {}
OutgoingConnections              :
IncomingLogCopyingNetwork        :
SeedingNetwork                   :
ActiveCopy                       : False
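
As a quick sanity check on the numbers in this output, the reported CopyQueueLength is simply the gap between the newest log generated on the active copy and the newest log copied to MAIL01. The sketch below redoes that arithmetic; the only outside fact it relies on is that Exchange 2010 transaction log files are 1 MB each.

```powershell
# Values copied from the Get-MailboxDatabaseCopyStatus output above.
$lastLogGenerated = 2585399   # newest log generation on the active copy (mail03)
$lastLogCopied    = 2564800   # newest log generation that reached MAIL01

# Logs still waiting to be shipped to the passive copy.
$copyQueue = $lastLogGenerated - $lastLogCopied   # 20599, matches CopyQueueLength

# At 1 MB per log file this is also the backlog size in MB; convert to GB.
$backlogGB = [math]::Round($copyQueue * 1MB / 1GB, 1)

"Copy queue: $copyQueue logs, roughly $backlogGB GB of log data"
```

The backlog is only part of the picture here, since the error message says the EDB file itself is missing on MAIL01, which is why the copy has to be reseeded rather than just resumed.
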
ASKER CERTIFIED SOLUTION
Manpreet SIngh Khatra
How would I know how much space is required in the log drive in order for the process to complete correctly?

We aren't running backups on the Exchange server.
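You need to check the size of the logs and the DB on the server where the database is mounted, and make sure the same location with at least that much free space is available on the other server. Remember, the location is very sensitive: the paths must match!!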
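
One way to collect those numbers is sketched below. It assumes it is run in the Exchange Management Shell on the server hosting the active copy (the log folder is a local path there) and that WMI access to MAIL01 is allowed; neither detail comes from the thread.

```powershell
$db = 'Mailbox Database 0520255'

# Database size and paths as Exchange reports them. -Status is needed for
# DatabaseSize to be populated.
$info = Get-MailboxDatabase -Identity $db -Status
$info | Format-List Name, Server, DatabaseSize, EdbFilePath, LogFolderPath

# Total size of the transaction logs in the log folder (local path, so run
# this on the server that hosts the active copy).
$logBytes = (Get-ChildItem -Path "$($info.LogFolderPath)" -Filter '*.log' |
    Measure-Object -Property Length -Sum).Sum
'Log folder: {0:N1} GB' -f ($logBytes / 1GB)

# Free space per fixed disk on the target server (MAIL01 in this thread).
Get-WmiObject Win32_LogicalDisk -ComputerName 'MAIL01' -Filter 'DriveType=3' |
    Select-Object DeviceID, @{ n = 'FreeGB'; e = { [math]::Round($_.FreeSpace / 1GB, 1) } }
```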

Do you have circular logging enabled? If not, how do the log files get removed? I'm also not sure why Exchange database backups aren't being done.

- Rancy
Yes, we use circular logging, but we had to disable it in order to add the server to the DAG. The path location is the same on both servers, and both have enough space for the DB. I think the problem might be that our Exchange server's EDB is 625 GB in size.

I found this article to manually do the seeding process.
http://geekswithblogs.net/marcde/archive/2011/11/03/dags-and-reseeding.aspx

The instructions are towards the bottom of the article. Can you tell me if it's OK to do this while my passive server has a copy queue length of 2593413 at the moment and it's suspended?

Thank you!
The database ... hmm, it's very large, and judging by the log files it seems none are being deleted on the active server, which isn't good!!

Could you paste the specific details you want to ask about? Otherwise you may want to understand one thing while I pick up on something else and explain that :)

- Rancy
OK, so the database copy on the recovered Exchange server is now in Healthy status. This is what I did:

1. Suspend the database copy
2. Go to the passive node and remove all the database and log files (fun yet?)
3. Dismount the database from Exchange
4. Go to the log files folder on the active node and move them all to a different folder
5. Now copy the EDB file from the active node to the passive node
6. Mount the database once this is completed
7. Resume the database copy
8. Drink cocktails on the beach as your sync is healthy (not required but highly recommended)
All in all this should get your copy back in order. Not exactly the way you'd want to (aka without downtime) but it gets the job done.
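
The Exchange-side parts of that manual (offline) seed can be sketched as below. The EDB path comes from the error message earlier in the thread; the `D$` admin share as the transfer route is an assumption, and steps 2 and 4 (clearing the passive folders and moving the active logs aside) are file operations that are deliberately left out.

```powershell
# Run on the server hosting the active copy, so that $edb is a local path.
$db   = 'Mailbox Database 0520255'
$copy = "$db\MAIL01"                                   # passive copy on the recovered server
$edb  = 'D:\MAILBOXDB\Mailbox Database 0520255.edb'    # path from the error message

# Step 1: stop replication to the passive copy.
Suspend-MailboxDatabaseCopy -Identity $copy -Confirm:$false

# Step 3: dismount the active database. Users lose access from here on.
Dismount-Database -Identity $db -Confirm:$false

# Step 5: copy the dismounted EDB to the same path on the passive server.
# The D$ admin share is an assumed transfer route, not stated in the thread.
Copy-Item -Path $edb -Destination '\\MAIL01\D$\MAILBOXDB\'

# Steps 6-7: bring the database back online and resume replication.
Mount-Database -Identity $db
Resume-MailboxDatabaseCopy -Identity $copy
```

The trade-off is downtime for the whole copy window; an online reseed with Update-MailboxDatabaseCopy keeps the database mounted but sends the same data over the replication network.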

Now, when I try to make the recovered Exchange server the active mailbox server, I get a Content Index State of Crawling, and it does not allow me to move the active store.

Is this a normal condition? How can I fix this in order to be able to move the active store?

Thank you!
Did you remove the database catalog file/folder as well?
Is the copy in Healthy status?
8. Liked it!! :)

Also check that all Microsoft Exchange services are running on the secondary server.

- Rancy
No, I didn't remove the catalog folder from the good server. When I check the status on the source server, it's Healthy. The recovered server is the one showing Crawling status.

I did a reseed using the Update-MailboxDatabaseCopy -CatalogOnly option and it completed fine, but the status on the recovered server still shows Crawling.

Yes, all MS services are running on both servers correctly.
Try to restart the Microsoft Search service.
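
A hedged sketch of checking the index state and restarting search from the shell follows. `MSExchangeSearch` is, as far as I know, the service name of the Exchange 2010 search indexer; list the services first and confirm the name before restarting anything.

```powershell
# Content index state of every database copy hosted on the recovered server.
Get-MailboxDatabaseCopyStatus -Server 'MAIL01' |
    Format-Table Name, Status, ContentIndexState -AutoSize

# Run locally on MAIL01: list search-related services to confirm their names.
Get-Service -Name '*search*' | Format-Table Name, DisplayName, Status -AutoSize

# Restart the search indexer (service name is an assumption, see above).
Restart-Service -Name 'MSExchangeSearch'
```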

- Rancy
Actually, I left it alone for an hour or two and it went Healthy on its own. I guess I just needed patience... I really appreciate the support during this time.

Thank you..
Look, ideally the indexing is done on the server that holds the catalog folder, so it probably just needed time to re-index after you manually removed those files from the active server, and then to bring the passive node up to date as well :)
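
With both the copy status and the content index showing Healthy, moving the active database back to the recovered server can be sketched like this. Server and database names are the ones from the thread; the checks are one reasonable gate, not the only possible one.

```powershell
$db     = 'Mailbox Database 0520255'
$status = Get-MailboxDatabaseCopyStatus -Identity "$db\MAIL01"

# Switch over only when the passive copy is caught up and its index is ready.
if ("$($status.Status)" -eq 'Healthy' -and
    "$($status.ContentIndexState)" -eq 'Healthy' -and
    $status.CopyQueueLength -eq 0) {
    Move-ActiveMailboxDatabase -Identity $db -ActivateOnServer 'MAIL01'
} else {
    # Not ready yet: show what is still lagging.
    $status | Format-List Name, Status, ContentIndexState, CopyQueueLength, ReplayQueueLength
}
```

Move-ActiveMailboxDatabase also has a -SkipClientExperienceChecks switch that allows the move while the index is still Crawling, at the cost of degraded search until crawling finishes; waiting, as happened here, is the gentler option.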

- Rancy