  • Status: Solved
  • Priority: Medium
  • Security: Public

Suggestions for addressing failed DC

Hello all, I need some immediate help.  I recently ran a Windows Update that hosed a number of my servers, including my second-in-line DC, which also happens to be my RADIUS server.  It was so bad this morning that I had to restore the most recent image of this DC, which is 27 days old.  Now, when I try to replicate between the DCs using AD Sites and Services, I receive the error noted below.

Can I simply demote this DC, then bring it back by promoting it?
Any guidance is appreciated.
Thank you.

- Larry
The following error occurred during the attempt to synchronize naming context from domain controller to domain controller: The replication operation encountered a database error. This operation will not continue.

Asked by: LarrySND
 
Jon Winterburn Commented:
Yes, you can dcpromo it to demote it and then promote it. But before you do, you must ensure any FSMO roles that are assigned to the DC are moved across to a healthy DC. The same applies for the Global Catalog. If you need help with doing this, let me know.

LarrySND (Author) Commented:
Jon,
Thank you.  Yes, I will need help with this.  I do not believe the affected DC carries any FSMO roles, but how can I check?  The first DC built is, I believe, healthy and carries the FSMO roles.  I don't know whether or how you would push FSMO roles out to other DCs, unless they inherit them automatically.  FYI, since this error occurred, I've taken the affected DC offline and powered it down.

Jon Winterburn Commented:
With regards to checking and transferring (if required) the FSMO roles, look at these links - all 3 have proved useful to me in the past:

http://support.microsoft.com/kb/234790

http://www.petri.co.il/determining_fsmo_role_holders.htm

http://www.computerperformance.co.uk/w2k3/W2K3_FSMO_transfer.htm

If the healthy DC was the first in the domain and is the PDC emulator, then it most likely holds all the FSMO roles. It most likely also hosts the Global Catalog. The fact that the dodgy DC is powered down and you have no problems suggests that it is simply an additional DC (the equivalent of a BDC) holding no roles.

I would suggest powering it up, checking that there are no FSMO roles assigned to it, and demoting it using dcpromo. Then leave it off for a few days and check the event logs on the good DC to ensure no errors occur. If all is well, reinstall Windows, rejoin it to the domain, and promote it using dcpromo. This is what I have had to do twice in the past due to hard drive failures (once with the BDC and once with the PDC; the latter was a bit scary as it held all the FSMO roles, but it all went well).
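
For the "check that no FSMO roles are assigned to it" step, one option is to run `netdom query fsmo` (part of the Windows Support Tools) and read off the holder of each role. The Python sketch below shows one way to scan that output for a given DC. The sample text and the host names `dc1.domain.int` and `dc2.domain.int` are invented for illustration, and the parser assumes each line is a role name followed by two or more spaces and then the holder; treat it as a rough aid, not a replacement for the methods in the links above.

```python
import re

# Hypothetical sample of `netdom query fsmo` output; host names are invented.
SAMPLE_OUTPUT = """\
Schema master               dc1.domain.int
Domain naming master        dc1.domain.int
PDC                         dc1.domain.int
RID pool manager            dc1.domain.int
Infrastructure master       dc1.domain.int
The command completed successfully.
"""


def parse_fsmo_holders(text):
    """Return {role name: holder} from 'role    holder' lines."""
    holders = {}
    for line in text.splitlines():
        # Assumption: two or more spaces separate the role name from the holder.
        parts = re.split(r"\s{2,}", line.strip())
        if len(parts) == 2:
            role, holder = parts
            holders[role] = holder.lower()
    return holders


def roles_held_by(text, dc_name):
    """List the roles whose holder matches dc_name (case-insensitive)."""
    target = dc_name.lower()
    return [
        role
        for role, holder in parse_fsmo_holders(text).items()
        if holder == target
    ]


if __name__ == "__main__":
    # An empty list means the DC in question holds no FSMO roles.
    print(roles_held_by(SAMPLE_OUTPUT, "dc2.domain.int"))
```

If the list for the bad DC comes back empty, there is nothing to transfer before demoting it; if not, move the listed roles to a healthy DC first.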
 
LarrySND (Author) Commented:
Will do.  I will wait a few days after I get started to bring it back.

Just to clarify, when you say "reinstall Windows", do you mean run a repair, or flatten the HDDs and reinstall from scratch, basically rebuilding the machine and introducing it as a new machine to the environment?

If that is the case, then I will have to create a new WSUS server as well.

Thank you so very much for your help Jon.


Jon Winterburn Commented:
No worries - I know how daunting it can be when it comes to DC's and I've learnt from my experiences, and the help I've received on Experts Exchange. And you're right to be cautious because my experience has been that when you rush into these things without first considering everything, you only end up with headaches! Active Directory is wonderful when all is working fine, but can be a real nightmare when things go wrong!

You don't have to rebuild or repair if it's going to cause a problem. I just rebuild whenever I've had issues, out of habit and because I use RIS images it only takes about half an hour.

Providing the DC demote is completely successful (checking event logs plus running netdiag on the dodgy DC and dcdiag and netdiag on the good DC will show if it has been successful), then there's no real need to blat and rebuild - especially if it's going to cause you hassle like setting up WSUS again.

For me, it made sense as I had no other services on the box except DHCP which was simple to move.
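
When checking dcdiag on the good DC after the demotion, the lines to look for are the ones that say "failed test". A small Python sketch for pulling those out of saved dcdiag output is below. The sample excerpt and the server name `GOODDC` are invented, and the pattern assumes the usual result-line layout of a row of dots, then the server name, then "passed test" or "failed test" and the test name.

```python
import re

# Hypothetical excerpt of dcdiag output; the server name GOODDC is invented.
SAMPLE_DCDIAG = """\
      Starting test: Connectivity
         ......................... GOODDC passed test Connectivity
      Starting test: Replications
         ......................... GOODDC failed test Replications
      Starting test: NetLogons
         ......................... GOODDC passed test NetLogons
"""

# Assumption: result lines look like "..... SERVER passed|failed test NAME".
RESULT_LINE = re.compile(r"\.{5,}\s+(\S+)\s+(passed|failed)\s+test\s+(\S+)")


def failed_tests(text):
    """Return (server, test name) pairs for every 'failed test' line."""
    return [
        (match.group(1), match.group(3))
        for match in RESULT_LINE.finditer(text)
        if match.group(2) == "failed"
    ]


if __name__ == "__main__":
    for server, test in failed_tests(SAMPLE_DCDIAG):
        print(f"{server}: {test} failed")
```

An empty result only means no test reported a failure in the text supplied; the event logs still need checking as described above.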

Jon Winterburn Commented:
By the way, if it helps, these are the questions I asked the experts when I had problems on my primary DC and had to move the FSMO roles, etc. All good advice; it got me through the demotion/promotion and taught me a lot about AD, FSMO, etc. I hope they help.

http://www.experts-exchange.com/Software/Server_Software/File_Servers/Active_Directory/Q_23952878.html#a23085280

http://www.experts-exchange.com/Software/Server_Software/File_Servers/Active_Directory/Q_23953441.html


LarrySND (Author) Commented:
Awesome, thank you for all the info Jon.  I will keep you posted as to our progress.

LarrySND (Author) Commented:
Jon, real quick: after demoting the affected DC, is there an AD cleanup I should perform, or is that the purpose of leaving it off for a couple of days?  Thank you.

LarrySND (Author) Commented:
Jon, 2 things:
Ran Netdiag and it failed on the Trust Relationship test, the Kerberos test and the WAN configuration test.

Ran DCPROMO on the bad DC and it will not allow me to authenticate to a third, healthy DC in a remote location.  I receive: "The operation failed because: Active Directory could not transfer the remaining data in the directory partition CN,etc... to domain controller C.domain.int. "Access is denied""  It then keeps prompting me to supply a username with Enterprise Admin privileges to the forest.

Can I redirect this authentication to the local, healthy DC that holds the FSMO roles, rather than the remote DC that doesn't?  I'm assuming that is the issue.

LarrySND (Author) Commented:
FYI, on the Trust Relationship failure, it reads: "FATAL Secure channel to domain "domain" is broken. [ERROR_ACCESS_DENIED]"

Jon Winterburn Commented:
Okay, it looks like because of the problem your bad DC has had, it cannot demote gracefully. You can use the switch /forceremoval (as in dcpromo /forceremoval) which will forcibly remove AD from the machine so it's no longer a DC. As no FSMO roles are on the bad DC and I assume the Global Catalog is on the good DC (use this link to check: http://www.petri.co.il/configure_a_new_global_catalog.htm), then I don't see any reason why forceremoval shouldn't solve the problem. As for clearing up - that is what dcpromo is supposed to do, but as we have seen, this is not always the case.

Check out the workaround in this kb for non-graceful demotions:

http://support.microsoft.com/kb/332199

LarrySND (Author) Commented:
Thank you Jon.

After the force demotion, would you suggest this as well?  http://support.microsoft.com/?kbid=216498
If so, before or after leaving it off for a few days?

Thanks again Jon.

Jon Winterburn Commented:
That looks like a good suggestion, yes. I would suggest doing it 24 hours after the forced demotion (leaving the bad DC running all the while), then after doing that, leave the bad DC running for another 48-72 hours. After this time, switch it off and leave it off for another 48 hours. If no errors occur in the event logs of the good DC and no weirdness occurs, then I would say it's good to go.

LarrySND (Author) Commented:
lol well kind of funny you should mention that...  Actually, I kind of did it the other way around.. After the forced DC Demotion, I shut it off completely, letting it sit over the weekend.  There were connection errors on my healthy DC's trying to find it, so I performed the AD/DNS clean up to the best of my ability.  All logs look great now, no record of the DC ever existing anywhere and no errors so far.

My plan is to let it go for one more day, bring the affected machine back up while unplugged from the network, rename it, reboot, join it to the domain, reboot, then promote it under its new ID.

Any thoughts?

Jon Winterburn Commented:
Hehe, this is it: there are no hard and fast rules for this kind of thing. Often it's best to go with your gut instinct once you have all the information you need.

Sounds good - no suggestions really - it should all be fine.

LarrySND (Author) Commented:
Okay, thanks Jon.  Hopefully in a day or so, we'll complete this project.  I'll keep you posted as to our progress here.

Thank you so much again for pointing me in the right direction and keeping me on track.

LarrySND (Author) Commented:
Jon, thanks again for your detailed help.  I'm not an AD expert, so I truly do appreciate the guidance.  Thanks again and have a good Cinco de Mayo!

- Larry