Windows 2003 DC failover

I have a single-domain Windows 2003 network with redundant domain controllers. If the first DC in the domain were to die irreparably, would one of the remaining DCs fully take over, and which one would it be: the second one installed, assuming I have three, let's say? If the DC that died just did DC (no other services), would anybody notice? Would I have to do something to the other DCs? If it did have services such as DNS or DHCP, would the other servers need to have had 'secondary' versions of them, and if the 'first' one died, would the secondary ones take over so that nobody would notice? Or would I have to do some additional work? The basic question is: how transparent is DC redundancy? Is there ever a need to try to restore one (same name/same function) if I have other DCs, or can I just blow it away?

Joseph Daly Commented:
The answer to your question is maybe.

If your DC held the FSMO roles for your domain you would need to seize them to another working DC and then perform a metadata cleanup on your AD environment.

You should be replicating DNS to all DCs in your domain so that if one fails you will still be able to resolve DNS.

If the failed DC held the DHCP role, you would need to add the DHCP server role to another server before clients would be able to get IP addresses. You may have some issues between old and new DHCP leases if you did this.

MS does recommend an 80/20 DHCP scope split between two servers if you want some kind of DHCP redundancy.
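
As a rough illustration of the 80/20 split, the arithmetic can be sketched in Python. This is only a sketch: the function name `split_scope` and the example address range are made up for illustration, and on a real DHCP server the split is normally set up with exclusion ranges on each server rather than by a script.

```python
import ipaddress


def split_scope(first_ip, last_ip, primary_percent=80):
    """Split an inclusive IPv4 range into two non-overlapping lease ranges.

    Hypothetical helper: returns ((primary_first, primary_last),
    (secondary_first, secondary_last)).
    """
    start = int(ipaddress.IPv4Address(first_ip))
    end = int(ipaddress.IPv4Address(last_ip))
    if end < start:
        raise ValueError("last_ip must not be below first_ip")

    total = end - start + 1
    # Integer arithmetic avoids floating-point rounding surprises.
    primary_count = total * primary_percent // 100
    if not 0 < primary_count < total:
        raise ValueError("both servers must receive at least one address")

    boundary = start + primary_count
    primary = (ipaddress.IPv4Address(start), ipaddress.IPv4Address(boundary - 1))
    secondary = (ipaddress.IPv4Address(boundary), ipaddress.IPv4Address(end))
    return primary, secondary


if __name__ == "__main__":
    # Example range is an assumption, not taken from the question.
    primary, secondary = split_scope("192.168.1.10", "192.168.1.209")
    print("Server A leases:", primary[0], "to", primary[1])
    print("Server B leases:", secondary[0], "to", secondary[1])
```

Each server would then exclude the other server's range, so the two never hand out the same address.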

Any other questions, let me know.
Mike Kline Commented:
You would have to make sure that the clients have the IP of the other DCs/DNS servers in their configuration.

Also make sure the other boxes are global catalogs.  

You will want to read up on DC stickiness.

If your DC1 holds all the FSMO roles and it crashes hard, you will have to seize the roles and clean up the dead DC (search for "metadata cleanup").

Clients should continue to work ok


Bottom line here....

Multiple DCs are not for the purpose of transparent failover.

They are for the purpose of AD database redundancy, so that you do not lose the database. But if a DC goes down (any DC, anywhere), there is likely going to be at least some amount of disruption somewhere.

Mike Kline Commented:
I've seen a lot of environments that have overdone things and have many DCs in their hub sites. When one of those goes down, many times there is no disruption.


Leon Fester (Senior Solutions Architect) Commented:
DCs are pretty much redundant for failover purposes, except when you've got hard-coded references to your DCs.
DNS, for example, is tied to the IP address of a DC, so if that DC fails, any references to that IP will fail.
Similarly, WINS would also be affected.
DHCP, however, if built on the 80/20 or any other split-ratio system, will continue to service DHCP requests.
Authentication has a built-in algorithm to find the next available domain controller if the first DC is not responding.
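
The "try the next one" idea can be sketched in a few lines of Python. This is not the real DC locator, which Windows clients run themselves using DNS SRV records; it is a simplified illustration, and the names `first_responding` and `tcp_probe` and the server names are hypothetical.

```python
import socket


def tcp_probe(host, port=389, timeout=2.0):
    """Return True if a TCP connection to the LDAP port succeeds.

    Raises OSError (including timeouts) when the host cannot be reached.
    """
    with socket.create_connection((host, port), timeout=timeout):
        return True


def first_responding(candidates, probe=tcp_probe):
    """Return the first candidate that answers the probe, or None."""
    for dc in candidates:
        try:
            if probe(dc):
                return dc
        except OSError:
            # Unreachable or timed out: fall through to the next DC.
            continue
    return None


if __name__ == "__main__":
    # Hypothetical server names; replace with your own DCs.
    print(first_responding(["dc1.example.local", "dc2.example.local"]))
```

Anything that hard-codes a single DC name or IP skips this kind of fallback, which is why those references break when that DC dies.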

Similarly with the FSMO roles: each one is explicitly assigned to a specific DC.
If that DC falls over, then you've lost that role until you either restore the DC or seize the role.
It's not an automated process, so manual intervention is required.

Regarding restoring one with the same name/functions:
It's not too important in your environment since you've already got 3 DCs.
What might be helpful is to restore/rebuild a DC on the same IP address to avoid reconfiguration of DHCP/DNS/WINS or any static mappings in applications.

In a multi-domain-controller environment, it's often faster, cleaner and simpler to completely remove a failed DC, rebuild it, and then run DCPROMO.
All you're doing is adding a new DC to an existing domain structure.
To mkline71:

That's true.
It usually means that there wasn't anything actually using the one that went down. DNS clients always try to use the same DNS server they were able to use the "last time". Exchange will always try to use the same GC it used last time as well, even if all of the DCs are also GCs. So it means that nothing had previously been using the one that went down, and nothing "missed" it not being there.

But I have never been that lucky... things always act up if any of my DCs go down.
lineonecorp (Author) Commented:
Thanks for the 'storm' of answers. One point that keeps coming up: 'seize the FSMO role'. What is involved in doing that if the original DC is dead? And what is the time-critical aspect of it? What happens when a domain doesn't have FSMO for a few hours/days? I thought the whole point of having multiple DCs was that things chugged along when the 'primary' went down - what doesn't chug when the DC holding the FSMO role disappears? Is there any way to set up the domain so there is a 'secondary' FSMO server?
Leon Fester (Senior Solutions Architect) Commented:
There are 5 FSMO roles: two are forest-wide and three exist once per domain, so in a single-domain network there is only one holder of each. So no, you cannot set up a 'secondary' FSMO role holder.

Of the 5 roles, the PDCe role is used most on your domain.

Have a read through these two links for more understanding about FSMO roles.
Recommended from this blogs ...
Lee W, MVP (Technology and Business Process Advisor) Commented:
First, a question:

Why haven't you tested this in your environment? Or AT LEAST a test environment? It's INSANELY EASY to test - pull the network cable out of a DC - there - it's failed. Now how does the rest of the network work?

We can tell you what we THINK will happen, but we don't know the intricacies of your environment and we're not looking at them beyond what you're choosing to share and we're remembering to ask.

IN THEORY, you can lose a DC and not have any noticeable problem until and unless you try to add another DC.  It has been my experience that if there's a DC that's unreachable, at least in a site, then promoting a new DC can fail.  Admittedly, the last time I saw that was in Windows 2000's AD, but I suspect it largely holds true today.  Even if it doesn't you WANT to clean things up when a failure happens and not later*.

If you don't understand the FSMO roles then you should not be the person responsible for maintaining Active Directory (sorry to be blunt, but that's how I feel).  If you're trying to learn, great, but the way I'm reading the question, it seems you are the responsible party.  I also want to be clear - I'm not suggesting you don't have an ability to learn and excel, but if these are things you don't understand, you should be building test networks, taking classes, watching videos, and learning on a network that is NOT running a business.

The FSMO roles handle critical functions and coordinate things between all DCs. For example, every user, group and computer has a Security ID (SID). This is assigned by the DC that creates the account. These IDs MUST be unique on the network or you'd have serious problems. The RID master (Relative ID master) allocates blocks of relative IDs, the last and unique part of each SID, to each DC. This way, each DC has its own unique set of IDs to assign to accounts created on that DC that will not conflict with other DCs. When the IDs get low, the DC asks the RID master for more IDs. So what happens if the DC designated as the RID master is down for a day? Unless you're a business with thousands of employees adding dozens of new ones per day with their computers, probably nothing. But when that supply decreases to 0 (and in a large business that could be days, while in a small business that could be MONTHS or even YEARS), you won't be able to create new accounts on that DC anymore UNTIL the RID master is restored.
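
To put rough numbers on "days versus months", here is a back-of-the-envelope sketch in Python. The block size of 500 RIDs is the commonly cited default and is treated here as an assumption, as are the account-creation rates; the function name is made up for illustration.

```python
import math


def days_until_rid_pool_empty(rids_remaining, new_accounts_per_day):
    """Estimate how many days a DC can keep creating accounts
    without being able to reach the RID master.

    Each new user, group or computer account consumes one RID.
    """
    if rids_remaining < 0:
        raise ValueError("rids_remaining cannot be negative")
    if new_accounts_per_day <= 0:
        return math.inf  # nothing is being created, so the pool never drains
    return rids_remaining / new_accounts_per_day


if __name__ == "__main__":
    ASSUMED_POOL = 500  # assumed default RID block size per DC
    for rate in (0.1, 2, 40):  # hypothetical accounts created per day
        days = days_until_rid_pool_empty(ASSUMED_POOL, rate)
        print(f"{rate} accounts/day -> about {days:.1f} days")
```

In practice a DC will usually have used part of its current block already, so treat the result as an upper bound.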

If the RID master is LOST - as in the DC that it was on fails and cannot be restored easily - then you must seize the role. Seizure should ONLY be done once you are CERTAIN the role holder will never come back online. And if you're not certain, once it's seized, the policy MUST be that the failed DC will NEVER be restored, even if you suddenly figure out it was an easy fix. When a role is seized, depending on the role, things happen to ensure it doesn't mess up the network with duplicate or bad information. In the case of the RID master, it dramatically increases the RID count so the odds of actually handing out a duplicate block of RIDs are astronomically low.

I strongly recommend going to your management and demanding (as much as you can demand) a class in Active Directory and a reasonably powerful machine you can use for VMs to setup a test network to play with.  And for more information on the FSMO master roles, see:

Some services (like Exchange) may be less resilient to DC failures.  Exchange relies on a Global Catalog (GC) server and if the server it's using becomes unavailable, it's possible that Exchange will have connectivity issues for a while.  Typically, within 30 minutes Exchange will figure out there's a problem and look for another GC, but there can be a disruption.

Personally, I recommend for any site without a strongly knowledgeable AD admin, a minimum and maximum of 2 DCs, both of which should be Global Catalogs and DNS servers.

For DHCP, I'd probably do a 50/50 split scope or a 33/33/33 split scope amongst three servers, but no more than 3 servers.

Finally, once you're done, especially if you haven't before, I recommend running DCDIAG on the existing DCs (I usually use the /c /e /v switches) to verify the health of AD and then address any errors that need addressing (some may be normal or even expected in certain environments).

BTW, I'm sure once you learn it, you'll be a fine AD admin... but no one should ever learn in a production environment and learning through a few questions on AD here is not sufficient in my opinion - there are BOOKS on this stuff that aren't complete.
Suliman Abu Kharroub (IT Consultant) Commented:
For FSMO and the impact on Active Directory if one of the roles is lost, please read this article (the best one I have ever read about FSMO):

Well said, Lee! Particularly on having more than one but fewer than 3 DCs, and on the DHCP scope splitting.
lineonecorp (Author) Commented:
dvt_localboy, Sulimanw: Thanks for the links. They were quite useful.

leew: Thanks for the lengthy explanation. As to why I don't just pull out the plug - well, I think all the answers on here are the reason - seeing that things aren't working is different from knowing why they might not be working. If I pulled out the cable and things didn't work, I would just end up starting the question with 'My DC went down' and then asking why - and then pretty well getting all the answers you and the others so readily provided.

As far as reading/watching videos/etc. - you should not assume that I haven't, as it seems you are. I have done all that - but there is a difference between thinking you understand and actually understanding. I think I know what I read, but the only way I can get confirmation that my understanding is correct is by posing scenarios and asking others what they think will happen in that scenario. And of course, not everything is covered in manuals/lectures - there are all kinds of undocumented 'unexpecteds' and insider tips, e.g. your 'two DCs and not more than and not less than'. Experts is good for getting people to tell you from the trenches what the official material leaves out/what field experience dictates. This repartee here has sharpened my knowledge more than any additional reading or pulling cables could have - I can now intelligently pull the cable knowing, if I do have problems, whether they are par for the course or something just weird to my circumstances, and any stuff I read from this point on will be read with a lot more context and certainty. So all in all, as far as this question goes for my needs and style of learning, Experts worked as perfectly as always.
lineonecorp (Author) Commented:
Good lively comprehensive discussion.
Leon Fester (Senior Solutions Architect) Commented:
OMG! Finally somebody who uses EE for learning!
This makes all the effort of sharing and typing in these long discussions so much more worthwhile.