Disaster Recovery test - Question about domain controllers

Hi all,
In a couple of weeks we'll be testing our disaster recovery plan, and I have a question about domain controllers.

First, some info on how we're set up:
-3 sites, each with 2 DCs, one domain across all 3 sites.
-The 3 sites are linked with VPN tunnels.
-IPs are set up so that one office uses 10.1.x.x, another 10.2.x.x, and another 10.3.x.x.
-Each office has an EqualLogic SAN in iSCSI mode, connected to vSphere.
-The SANs replicate nightly. Site#1 and site#3 replicate to site#2; site#2 replicates to site#3.
-Each office has a 3-server vSphere Essentials Plus farm.
-All three offices have virtual servers doing file/print, WSUS, antivirus, and a few others.
-The main site, let's call it site#1, also hosts a Citrix farm and all our main shared apps, like Microsoft Dynamics and others.
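
For anyone following along, the layout above can be written down as a small lookup table. This is only a sketch: it assumes each office uses a full /16 (the post only says 10.1.x.x and so on), and the function names are made up for illustration.

```python
import ipaddress

# Office subnets as described in the post (one /16 per site is an assumption).
SITE_SUBNETS = {
    1: ipaddress.ip_network("10.1.0.0/16"),
    2: ipaddress.ip_network("10.2.0.0/16"),
    3: ipaddress.ip_network("10.3.0.0/16"),
}

# Nightly SAN replication targets from the post: source site -> replica site.
REPLICA_TARGET = {1: 2, 3: 2, 2: 3}


def site_of(ip):
    """Return the site number whose subnet contains ip, or None."""
    addr = ipaddress.ip_address(ip)
    for site, network in SITE_SUBNETS.items():
        if addr in network:
            return site
    return None


def recovery_site(failed_site):
    """Return the site that holds the replicated volumes of failed_site."""
    return REPLICA_TARGET[failed_site]
```

With this table, losing site#1 means recovering at site#2, since that is where site#1's volumes replicate to.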

So our plan is that we'll basically just unplug the firewall in site#1 on a Saturday morning, simulating that office burning down (to the outside world). We'll update the public DNS entries of the things we want to test to point to site#2. In site#2, we'll then promote the replicated volumes on the EqualLogics to full volumes (in test mode, so we can revert when we're done without changing them). On the firewall in site#2, we'll remove the reference to site#1 in the VPN settings so it's no longer trying to route traffic for site#1's internal IPs over the VPN.

Then we'll connect those EqualLogic volumes to the vSphere environment in site#2 and register the VMs. I've already created vSwitches for the subnets in use (and the VLANs on the ProCurve switches), which we'll connect the VMs to.

Next, we'll update the firewall in site#3 so that the VPN route for site#1 now points to site#2. The firewall in site#2 already has all the NAT rules and the like set up, just disabled until needed.
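
The routing changes described above can be sanity-checked on paper before the test. The sketch below models each firewall's routes as a simple table and looks up where traffic for a site#1 address would go. The next-hop labels ("vpn-to-site2" and so on) and the /16 masks are assumptions for illustration, not the real firewall configuration.

```python
import ipaddress


def net(cidr):
    """Shorthand for building a network object from CIDR notation."""
    return ipaddress.ip_network(cidr)


def next_hop(routes, ip):
    """Pick the most specific (longest-prefix) route that contains ip."""
    addr = ipaddress.ip_address(ip)
    matches = [(network, hop) for network, hop in routes.items() if addr in network]
    if not matches:
        return None
    return max(matches, key=lambda match: match[0].prefixlen)[1]


# Hypothetical route table for the site#3 firewall before the test.
site3_normal = {
    net("10.1.0.0/16"): "vpn-to-site1",
    net("10.2.0.0/16"): "vpn-to-site2",
    net("10.3.0.0/16"): "local",
}

# During the test, site#1's range is reached through site#2 instead.
site3_failover = dict(site3_normal)
site3_failover[net("10.1.0.0/16")] = "vpn-to-site2"

# At site#2, site#1's range becomes local because the recovered VMs run there.
site2_failover = {
    net("10.1.0.0/16"): "local",
    net("10.2.0.0/16"): "local",
    net("10.3.0.0/16"): "vpn-to-site3",
}
```

Checking a few representative server addresses against each table is a cheap way to catch a route that still points at the dead site.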

At that point, I think we'll be ready to fire up the servers. The main things we want to test are whether our core apps come up, and whether staff can get at them from Citrix with basically no idea we are running from a different site. All their shared drives should be present and, other than physical things like printing or phones, they should be able to do just about any of their normal work.

I have a couple questions though.
1) Anything glaring that I am missing?
2) For domain controllers, should we start the virtual DCs from site#1 when we do this, or just let the machines from site#1 connect to site#2's DCs? I am concerned that if I do start up the DCs from site#1, then when we shut down the test, the "real" DCs back in site#1 will be out of sync.

Harsem Commented:

Sounds like a pretty good plan.

In regards to 2), I would not worry, as your test will be over within 1-2 days. What you are doing, in essence, is restoring a domain controller from a backup that is 1 or 2 days out of sync. Microsoft has a default value (the tombstone lifetime) of 60 days for how long a domain controller can be disconnected from the network; forests created on Windows Server 2003 SP1 or later default to 180 days. Either way, 1 or 2 days would not be an issue.

To check that value above please go to:
to check what this value is for your AD Forest.
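
As a rough illustration of the reasoning above, the check comes down to comparing the time a DC has been out of contact with the forest's tombstone lifetime (stored in the `tombstoneLifetime` attribute). This is only a sketch: the function name is made up, and the 60-day default should be replaced with whatever value your forest actually uses.

```python
from datetime import date


def within_tombstone_lifetime(last_replicated, reconnect_on, tombstone_days=60):
    """True if the DC reconnects before the tombstone lifetime has elapsed.

    tombstone_days defaults to 60 as quoted above; pass your forest's real value.
    """
    days_offline = (reconnect_on - last_replicated).days
    return 0 <= days_offline < tombstone_days
```

For example, `within_tombstone_lifetime(date(2024, 1, 1), date(2024, 1, 3))` covers a two-day test and returns True, while a gap of several months with the 60-day default returns False.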

GreenEnvy (Author) Commented:
As a followup, this test went well.

We shut down the firewall in site#1, leaving all the servers running there but inaccessible to the outside world.

In site#2, we then promoted the EqualLogic replicated volumes to full volumes. We imported the machines we wanted to test into our vSphere.
I had to manually add some IPs to our ProCurve switches in site#2 so they would know how to route the traffic for the servers from site#1. I also modified our WatchGuards so they would not try to route traffic for site#1's IP ranges over the VPN.

We turned on the servers from site#1, other than the DCs from site#1. We just let them connect to the site#2 DCs. We had to manually change the DNS server IPs on the servers from site#1 (we could also have added a second virtual NIC with the corresponding IP to the site#2 DCs, but didn't want to mess with the "production" equipment).

Those servers got connectivity right away. We had changed the public DNS for our Citrix servers the night before, so we tested that from external computers and it was up and running right away. We tested our file servers, SQL, and finance apps, and all were OK.

All in all, it only took a couple of hours to fail over. We had 2 staff go out to site#2 for this, but next time we test, or in a real situation, we can do it remotely and it should take less than an hour. Most of the time was spent going through the 30 or so servers and updating DNS.
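
Since updating DNS on each server was the slow part, the list of changes could be worked out ahead of time. The sketch below builds that list from an inventory of servers and their current DNS settings. The server names and DC addresses are hypothetical, and it only produces a plan; it does not touch any server.

```python
import ipaddress

SITE1_NET = ipaddress.ip_network("10.1.0.0/16")  # site#1 range (assumed /16)
SITE2_DCS = ["10.2.0.10", "10.2.0.11"]           # hypothetical site#2 DC addresses


def updated_dns(current_dns, replacements=SITE2_DCS, unreachable=SITE1_NET):
    """Drop DNS servers in the unreachable range and add the replacements."""
    kept = [ip for ip in current_dns if ipaddress.ip_address(ip) not in unreachable]
    return kept + [ip for ip in replacements if ip not in kept]


def change_plan(servers):
    """Return {server: new DNS list} for servers whose settings must change."""
    plan = {}
    for name, dns in servers.items():
        new_dns = updated_dns(dns)
        if new_dns != dns:
            plan[name] = new_dns
    return plan
```

Feeding it an inventory such as `{"file01": ["10.1.0.10", "10.1.0.11"]}` gives the new DNS list for each recovered server, which could then be applied by hand or by whatever remote tooling is already in place.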

Since then I've actually done this on a smaller scale one time when our primary backups failed one night and a user desperately needed a folder back that had been created the day before. Did this type of failover for that file server only (from the replication of the volumes) and got the folder back.