Solved

Main Domain Controller down in a single domain environment

Posted on 2009-04-09
Last Modified: 2012-05-06
Two days ago our main domain controller crashed with no chance of recovery.

We have the following configuration:
Site A (Corp Office)
Site B
Site C

We have a single forest / single domain configuration

Site A: Has two domain controllers.  
The main domain controller (which crashed) held the following roles:
FSMO: Schema Owner, Domain Role Owner, PDC Role, RID Pool Master
Global Catalog

Site B: Had one domain controller which had no FSMO roles but is a Global Catalog

Site C: Had one domain controller which had the Infrastructure Owner FSMO role

Since we cannot get the main domain controller up, we plan on seizing the FSMO Roles using the following steps:
A)      Seize FSMO Roles using: http://support.microsoft.com/kb/255504 and http://www.petri.co.il/seizing_fsmo_roles.htm
B)      Remove Data/Metadata in AD using: http://support.microsoft.com/kb/216498 and
http://www.petri.co.il/delete_failed_dcs_from_ad.htm

There are two questions:
1)      After performing the steps above, will we need to manually delete any DNS records that reference the old main DC? (For example, in Forward Lookup Zones, under the Domain.local subfolders.)
2)      Also, we have an Exchange 2003 server in the domain.  The only error message we see in the event log since the main domain controller crashed is:
Source: MSExchangeAL
Category: LDAP Operation
Event: 8026
LDAP Bind was unsuccessful on directory maindc.local for distinguished name . Directory returned error [0x51] Server down
It appears this is related to the Recipient Update Service (RUS) pointing to the crashed DC. We implemented KB272552 and pointed it to a working DC. Are there any other steps we should take to ensure our Exchange server remains operational?

Any advice, suggestions, or feedback would be greatly appreciated.  Thanks

Question by:meade470
7 Comments
 
LVL 58

Accepted Solution

by:
tigermatt earned 250 total points
ID: 24107718

>> After performing the steps above, will we need to manually delete any records which reference the old main DC in DNS? (Example, Forward Lookup Zones, Domain.local  Subfolders)

The cleanup process should handle all the Active Directory related records (in the _msdcs subdomain). At most, you may have to remove the host record named after the server from the main Domain.local zone.

>> Also, we have an Exchange 2003 server in the domain.  The only error message we see in the event log since the main domain controller crashed is:

Exchange should be configured to locate another DC automatically, and will most definitely recover as soon as the FSMO roles are seized and the failed DC object is removed by a metadata cleanup. In Exchange System Manager, open the Properties of the Exchange server and look at the 'Directory Access' tab. Verify that the option to Automatically Discover Servers is checked. Also verify that the Exchange server has the IP of one of the working DCs as its preferred DNS server.

The RUS may need re-locating to use another DC. Which site is the Exchange Server in?

-Matt
 
LVL 2

Author Comment

by:meade470
ID: 24108014
Matt:

Thank you for your feedback

1) Is there a certain amount of time we should wait after seizing the FSMO roles?  Also, how long should we wait before cleaning up the AD data/metadata?  Can all this be done in a short period of time?

2) I did verify that Automatically Discover Servers is checked and that the TCP/IP settings for preferred DNS have been updated to a new (local) DNS server.  We pointed the RUS to the second DC at our corp. site. Luckily we had two DCs up and running (now just one) at our corp. site.  The Exchange server is also located at the corp. site.
 
LVL 27

Expert Comment

by:bluntTony
ID: 24108034
1) In my experience, after a metadata cleanup you may still have to manually remove the server object from AD Sites and Services, and a few SRV records relating to the DC are sometimes left behind. I can't recall exactly where, but it would pay to browse the _msdcs zone and your forward lookup zone and check for any remaining SRV records that relate to the failed DC (also check for the CNAME record).

2) Apart from what Matt has mentioned: I don't think your question said which site your Exchange server is in. Just ensure that Exchange has good access to a Global Catalog (i.e. one in the same site), as Exchange makes heavy use of it.
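
If browsing the zones by hand gets tedious, the check in 1) can be roughed out in a script. This is only a sketch: it assumes you already have the zone contents as plain text lines (for example from a zone file export), and the host name `maindc`, the other names, and the sample records are all made-up placeholders, not taken from the real environment.

```python
import re

def find_stale_records(zone_lines, failed_host):
    """Return (line_number, text) pairs whose text mentions the failed DC.

    `failed_host` may be a short name or an FQDN; only its first label is
    matched, so both "maindc" and "maindc.domain.local." are found.
    """
    short_name = failed_host.split(".")[0]
    # Match the label as a whole word so "maindc" does not hit "maindc2".
    pattern = re.compile(
        r"(?<![\w-])" + re.escape(short_name) + r"(?![\w-])", re.IGNORECASE
    )
    hits = []
    for number, line in enumerate(zone_lines, start=1):
        text = line.strip()
        if not text or text.startswith(";"):  # skip blank lines and comments
            continue
        if pattern.search(text):
            hits.append((number, text))
    return hits

# Hypothetical sample data, for illustration only.
sample_zone = [
    "; export of a made-up domain.local zone",
    "_ldap._tcp.dc._msdcs  600 IN SRV 0 100 389 maindc.domain.local.",
    "_ldap._tcp.dc._msdcs  600 IN SRV 0 100 389 dc2.domain.local.",
    "maindc                3600 IN A 10.0.0.10",
    "maindc2               3600 IN A 10.0.0.12",
]

if __name__ == "__main__":
    for number, text in find_stale_records(sample_zone, "maindc.domain.local"):
        print(f"line {number}: {text}")
```

Anything it flags still needs to be reviewed and deleted by hand in the DNS console; the script only points at candidates.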

Tony
 
LVL 27

Expert Comment

by:bluntTony
ID: 24108043
Sorry - didn't refresh - my answers were to the first two questions...
 
LVL 27

Assisted Solution

by:bluntTony
bluntTony earned 250 total points
ID: 24108150
In answer to your second set of questions, I would say...

1) When you make changes to AD using ntdsutil, you are changing the database on one particular replica, i.e. the DC you connected to during the process. The other servers only become aware of the changes once replication has carried them across all three sites. Intrasite replication should take no more than a few minutes, but inter-site replication will, I think, be governed by your site links. However, you can force replication after making the changes, using AD Sites and Services, repadmin, or replmon (the last two are part of the Support Tools).

2) Ensure the remaining DC in this site is a GC.
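
For the repadmin route in 1), something along these lines could wrap the call. Treat it as a sketch under assumptions: `dc2` is a placeholder DC name, the function names are mine, and the `/AdeP` flag combination (all naming contexts, names shown as DNs, cross site boundaries, push from the named DC) is worth checking against `repadmin /syncall /?` on your own build first. By default it only returns the command line rather than running it.

```python
import subprocess

def syncall_command(dc_name):
    """Build a `repadmin /syncall` command for `dc_name`.

    /A = all naming contexts, /d = identify servers by distinguished name,
    /e = include other sites, /P = push changes outward from dc_name.
    """
    return ["repadmin", "/syncall", dc_name, "/AdeP"]

def force_replication(dc_name, dry_run=True):
    """Return the command line (dry run) or run it and return its output."""
    cmd = syncall_command(dc_name)
    if dry_run:
        return " ".join(cmd)
    completed = subprocess.run(cmd, capture_output=True, text=True)
    return completed.stdout

if __name__ == "__main__":
    # "dc2" is a hypothetical name for the DC the changes were made on.
    print(force_replication("dc2"))
```

Run it with `dry_run=False` only on a machine that has the Support Tools installed, and from an account with rights to trigger replication.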
 
LVL 58

Expert Comment

by:tigermatt
ID: 24108191

Yep - you'll just need to make sure you allow the seizures of the 4 roles held by the failed DC to replicate around the network. You don't need to wait prior to seizing the roles or doing the metadata cleanup, though.
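
For reference, the seizure of those four roles can be written down as a single ntdsutil command line. The snippet below is just a sketch that assembles it; `dc2.domain.local` is a placeholder for the surviving DC, the function name is hypothetical, and the role spellings are the Windows Server 2003-era ones (newer ntdsutil builds use `seize naming master` instead of `seize domain naming master`), so compare them with the KB article before use.

```python
import subprocess

# Roles held by the failed DC, spelled as ntdsutil "seize" sub-commands.
# Assumption: Windows Server 2003-era spelling for the domain naming master.
ROLES_TO_SEIZE = [
    "schema master",
    "domain naming master",
    "PDC",
    "RID master",
]

def build_seize_args(target_dc, roles=ROLES_TO_SEIZE):
    """Return the ntdsutil argument list that seizes `roles` onto `target_dc`."""
    args = ["ntdsutil", "roles", "connections",
            f"connect to server {target_dc}", "quit"]
    args += [f"seize {role}" for role in roles]
    args += ["quit", "quit"]  # leave "fsmo maintenance", then ntdsutil itself
    return args

if __name__ == "__main__":
    # Print the command with multi-word arguments quoted; nothing is executed.
    print(subprocess.list2cmdline(build_seize_args("dc2.domain.local")))
```

It only prints the command. ntdsutil normally asks for confirmation on each seizure, so this is better used as a checklist than as an unattended script.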

Regarding Exchange, if it's still playing up, do a restart of the Information Store and System Attendant services. If it's going to find a new DC, you can force it to do so by restarting those two.

-Matt
 
LVL 2

Author Comment

by:meade470
ID: 24118782
Thanks for all your help, guys.  Yesterday afternoon we transferred the FSMO roles and today we cleaned up the metadata.  So far everything appears to be going smoothly.  We are going to monitor the Event Logs on the DCs and pray.
