DC replication and failure question

Hello Experts,

I want to test DC site failure between 2 sites connected via a leased line. Both sites have 2 DCs each, running Windows Server 2008. The sites can communicate with each other through our leased line.
Both sites have their own DNS and DHCP servers.

The plan is to unplug both DCs at one site and see if those at the other site take over, and vice versa. This will simulate a complete DC site failure.

As I understand it, we will have to move our FSMO roles over to the active DCs when the others are taken down, along with DHCP and DNS. If we did have a complete DC failure at site1, can I set up DHCP and DNS in a disabled state at site2, ready to take over in case of an emergency?
wcgplcAsked:

LearnctxEngineerCommented:
You don't really have to move FSMO roles unless the partner will be down for some time. Whatever you do, don't turn off the DCs in a site and seize their FSMO roles while they're offline, unless you plan on performing a metadata cleanup and rebuilding the OS. To be honest, don't put DHCP, a CA, or any other services on a DC. DCs are disposable assets in the AD world; you don't want something critical and hard to replace running on a server designed to be replaced on a whim.

Hopefully you're running DHCP separately from your DCs? DCs are DCs; they should not be used for other purposes. For DHCP, look at DHCP high availability (either load-balanced or failover mode) and set up 1 DHCP server at each site. This will prevent any DHCP downtime.

AD isn't going to care if a DC can't talk to its partners even for long periods of time (weeks and even months), as long as that time does not exceed the tombstone lifetime. As long as you don't leave it offline too long, when you power it back up or reconnect it to the network the DCs will catch up like old friends and replicate everything they missed.
wcgplcAuthor Commented:
Thanks for the info @learnctx. The main purpose of the exercise is to prepare for a complete DC failure at one site, where the failed DCs would have to be rebuilt from scratch. Yes, I have DHCP set up on one DC at each site. I'll definitely be looking at "DHCP high availability".
Should I also have DNS not on a DC?

Below are the site configs:

Site 1:
1 x DC(A) with AD services, DNS, DHCP. DHCP only allocates addresses to our Windows 10 machines from the subnet 12.13.2.x.
1 x DC(B) with AD services and DNS

Site 2:
1 x DC(C) with AD services, DNS, DHCP. DHCP only allocates addresses to our Windows 10 machines from the subnet 12.14.2.x.
1 x DC(D) with AD services and DNS
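
To make the single points of failure in this layout easier to see, here is a rough Python sketch that models the four DCs listed above and reports which services remain available locally at each site when some DCs are down. The class, the function name, and the role labels are made up for illustration; this does not query AD, it only reasons over the list as posted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainController:
    name: str
    site: str
    roles: frozenset  # services hosted on this DC (labels are illustrative)


# Inventory transcribed from the site configs in the post.
INVENTORY = [
    DomainController("A", "Site 1", frozenset({"AD", "DNS", "DHCP"})),
    DomainController("B", "Site 1", frozenset({"AD", "DNS"})),
    DomainController("C", "Site 2", frozenset({"AD", "DNS", "DHCP"})),
    DomainController("D", "Site 2", frozenset({"AD", "DNS"})),
]


def local_services(inventory, failed):
    """Map each site to the set of services its surviving DCs still host."""
    remaining = {dc.site: set() for dc in inventory}
    for dc in inventory:
        if dc.name not in failed:
            remaining[dc.site] |= dc.roles
    return remaining


if __name__ == "__main__":
    # Losing only DC(A) already removes DHCP from Site 1.
    print(local_services(INVENTORY, {"A"}))
    # Losing both Site 1 DCs leaves Site 1 with no local services at all.
    print(local_services(INVENTORY, {"A", "B"}))
```

With this layout, DHCP at each site depends on a single DC, while AD and DNS survive the loss of any one DC.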
footechCommented:
It's nice not to have DHCP on your DCs, but it's not that uncommon and won't cause a problem.  DHCP HA since Server 2012 is pretty nice.  DNS is almost always on your DCs as it's the only way to take advantage of AD-integrated DNS zones.

A site in AD isn't really a functional unit, meaning a site itself can't fail (at least, I can't think how it could).  You could have a network connection, VM host, or what-have-you fail which results in all DCs in a site being offline, but the site itself didn't fail.  The reason I mention this is I'm wondering what kind of failure(s) are you trying to protect against?  You mentioned both the DCs, but why would this happen?  Are both DCs VMs on the same host?  If so, you could split them up on different hosts.  That's the easiest solution.

If both DCs (w/ DNS) are gone, would you reconfigure all clients to use different DNS servers?
wcgplcAuthor Commented:
Sorry, by "site" I meant physical offices. Thanks for the info!
footechCommented:
So something like:
 - a meteor crashes into the city where site 2 is, taking out the office including the DCs and the client machines (so you don't have to worry about maintaining function for the clients).

In this case everything in site1 should continue humming along.  If any FSMO roles were running on one of the now-destroyed DCs in site2 then the roles would have to be seized by a DC in site1.  Metadata cleanup should be performed for the site2 DCs.


Alternative scenario:
 - DCs at site2 are kept in a room all by themselves.  A hole in the roof over that room allows water through during a storm and the hardware running the DCs is destroyed, however all other equipment in the office is fine, leaving you with a bunch of clients at site2 that need to still do their work.

If they need to make use of resources on AD, all site2 clients would need to be reconfigured to use a different DNS server that has your AD records until a new DC is stood up in site2 (and then reconfigured again to use the new DC in site2 once it's available).  If any FSMO roles were running on one of the now-destroyed DCs in site2 then the roles would have to be seized by a DC in site1.  Metadata cleanup should be performed for the site2 DCs.


Alternative scenario 2:
 - DCs at site2 are VMs running on the same hardware.  The motherboard on that machine goes kaput.  A replacement is sent but won't be there for a few days, but as soon as it arrives (in this scenario), the machine will be able to be powered on and all VMs will function fine.  All other equipment in the office is fine, leaving you with a bunch of clients at site2 that need to still do their work.

If they need to make use of resources on AD, all site2 clients would need to be reconfigured to use a different DNS server that has your AD records until the DCs in site2 come back online (and then reconfigured again to use the DCs in site2).  When the DCs come back online they happily resume their relationship (as long as it hasn't been longer than the tombstone period).


Alternative scenario 3:
 - the leased line becomes non-functional for a period of time

Except for direct communication between site1 and site2, everything keeps on working.  Perhaps you would set up a site-to-site VPN if an internet connection is still available until the private line is restored.  When communication between sites is restored, DCs happily resume their relationship (as long as it hasn't been longer than the tombstone period).
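
The four scenarios above boil down to a small decision table, sketched here in Python. The enum, the function name, and the action strings are hypothetical labels for illustration only; the logic just restates the steps described for each scenario and is not a substitute for an actual recovery runbook.

```python
from enum import Enum


class DcState(Enum):
    ONLINE = "online"        # DCs are fine, e.g. only the leased line is down
    OFFLINE = "offline"      # DCs are down but will come back intact
    DESTROYED = "destroyed"  # DCs are gone and must be rebuilt


REPOINT_DNS = "repoint site2 clients to a DNS server that has the AD records"
SEIZE_FSMO = "seize FSMO roles on a site1 DC"
METADATA_CLEANUP = "perform metadata cleanup for the site2 DCs"
RESUME = "let replication resume (within the tombstone lifetime)"


def recovery_actions(dc_state, clients_survive, fsmo_on_lost_dcs):
    """Return the ordered list of actions for a failure at site2."""
    actions = []
    # Clients only need a different DNS server if their local DCs are unavailable.
    if dc_state is not DcState.ONLINE and clients_survive:
        actions.append(REPOINT_DNS)
    if dc_state is DcState.DESTROYED:
        # Roles are seized only when the DCs are never coming back.
        if fsmo_on_lost_dcs:
            actions.append(SEIZE_FSMO)
        actions.append(METADATA_CLEANUP)
    else:
        actions.append(RESUME)
    return actions


if __name__ == "__main__":
    # Meteor: DCs and clients both gone, FSMO roles were held at site2.
    print(recovery_actions(DcState.DESTROYED, False, True))
    # Dead motherboard: DCs return in a few days, clients keep working.
    print(recovery_actions(DcState.OFFLINE, True, False))
```

Note that FSMO seizure only appears in the branches where the DCs are destroyed, which matches the earlier advice not to seize roles from a DC that will come back.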

BTW - the default tombstone lifetime is 60 days if your domain was built with Server 2003 (pre-SP1) or earlier, and 180 days if it was built later. If in any doubt, check the tombstoneLifetime attribute in AD - if it's not set then the value is 60 days.
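
As a rough illustration of that rule, the Python sketch below works out the effective tombstone lifetime from an attribute value (treating "not set" as 60 days) and checks whether a DC that went offline on a given date is still inside that window. The function names are made up for this example, and the attribute value is assumed to have been read from AD by some other means; nothing here talks to a directory.

```python
from datetime import datetime, timedelta

DEFAULT_TOMBSTONE_DAYS = 60  # applies when the attribute is not set


def effective_tombstone_days(attribute_value):
    """Return the tombstone lifetime in days; None means the attribute is not set."""
    if attribute_value is None:
        return DEFAULT_TOMBSTONE_DAYS
    return int(attribute_value)


def within_tombstone_lifetime(offline_since, now, attribute_value):
    """True if the DC has been offline for less than the tombstone lifetime."""
    limit = timedelta(days=effective_tombstone_days(attribute_value))
    return now - offline_since < limit


if __name__ == "__main__":
    went_down = datetime(2024, 1, 1)
    today = datetime(2024, 3, 15)  # 74 days later
    print(within_tombstone_lifetime(went_down, today, None))  # attribute not set
    print(within_tombstone_lifetime(went_down, today, 180))   # attribute set to 180
```

The same 74-day outage is too long for a domain on the 60-day default but fine for one set to 180 days, which is why it is worth checking the actual value before a long test.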

P.S.  Learnctx should also be awarded points.