numb3rs1x asked:

Domain controller hardware failure

I wanted to put this question to those who might have done this before. I have a Windows Server 2003 install that was the initial domain controller for a given domain. The domain has a second 2003 Server that is also a domain controller. A couple of weeks ago, the first DC, we'll call it 01, had some kind of hardware failure involving the RAID controller, the drives, or possibly something else. Whatever the problem, I can't get it to boot and I'm pretty sure it is not recoverable. I put a new disk in it and I want to try re-installing Windows 2003 on it and adding it back to the domain so that I again have a failover pair of domain controllers. If the hardware will not allow me to do that, I will use different hardware. I want to know what the recommended practice is for this procedure and what the possible dangers are.
ASKER CERTIFIED SOLUTION
Mike Kline
This solution is only available to members.
ASKER

I should also mention that the purpose of these domain controllers is for a MSSQL 2000 Cluster. Is there any difference because of that?
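When the first DC in an AD domain fails, the main task is moving the FSMO roles (including the schema master) to the remaining DC as the new master.
mkline71's links point to that.
ntdsutil will be used to seize the FSMO roles and make the remaining DC the schema master. The GC is not an FSMO role; it is a separate setting (the Global Catalog checkbox under NTDS Settings in Active Directory Sites and Services), so check that the remaining DC has it enabled. When the other server is rebuilt/reinstalled you can add it into the domain as a DC.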
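For reference, here is a minimal sketch of the seizure sequence mentioned above, written as a small Python helper that only prints the sub-commands you would type at the ntdsutil prompts. The role names follow the Windows Server 2003 `roles` menu; the helper name `seize_commands` and the server name `DC02` are placeholders, not anything from this thread. Expect ntdsutil to ask for confirmation on each seize.

```python
# Hypothetical helper: lists the ntdsutil sub-commands for seizing all FSMO roles.
# Role names as used by ntdsutil on Windows Server 2003.
FSMO_ROLES = [
    "schema master",
    "domain naming master",
    "RID master",
    "PDC",
    "infrastructure master",
]


def seize_commands(surviving_dc):
    """Return, in order, the sub-commands to type after starting ntdsutil."""
    commands = [
        "roles",                                # enter the FSMO maintenance menu
        "connections",                          # enter the server connections menu
        f"connect to server {surviving_dc}",    # bind to the DC that will hold the roles
        "quit",                                 # back to the FSMO maintenance menu
    ]
    commands += [f"seize {role}" for role in FSMO_ROLES]
    commands += ["quit", "quit"]                # leave the roles menu, then ntdsutil
    return commands


if __name__ == "__main__":
    # "DC02" is a placeholder for the surviving domain controller.
    for line in seize_commands("DC02"):
        print(line)
```

Once a role has been seized, the old role holder should not be brought back online in its previous state.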
Should I seize the FSMO roles before or after I remove the failed DC from AD?
Since the DC has failed, the only way to remove it from the AD is to seize the roles first. Then go through the process of cleaning the AD up or alternatively rebuild the system and reinstall the OS and rejoin the system into the AD as another DC. Under no circumstances restore the system state on this system from backup or all hell will break loose.


http://support.microsoft.com/kb/216498

If your plan is to restore this system to a functional state anew, I would not bother with the cleanup.  When you rejoin the system into the AD as another DC, you can reuse the same account without an issue.
I will definitely have to reinstall Windows on the original DC. There are only two DCs and the primary one is out of commission. I was going to seize the FSMO roles onto the remaining DC. Is there a risk to functionality? This DC is the only server up right now that is handling the MS SQL 2000 Cluster, and I don't want to interrupt services during business hours if seizing these roles carries that risk.
I decided to wait until after hours to seize the roles. It seemed to go just fine.

I have a question about this part:

If your plan is to restore this system to a functional state anew, I would not bother with the cleanup.  When you rejoin the system into the AD as another DC, you can reuse the same account without an issue.

Just so I'm clear, if I re-install Windows 2003 onto the same hardware and call it the exact same thing as the old DC, the domain will allow me to promote it without my going through the steps to clean out the old DC?

Another possibility that I'm thinking seriously about is installing Windows 2003 on a new server entirely. Would it also be ok to name it the same thing as the DC that failed and then promote it to the domain without doing the metadata clean?
When you join a system into a domain where the same name already exists you will be prompted on whether you wish to reuse the existing record.
Yes.
A cleanup is at times needed when the roles are seized and the prior master server is never recreated.
Thank you for your comments. I have finished the initial build of the w2k3 server. Do I just need to join it to the domain now and then run dcpromo?
Did you already do the metadata cleanup? Once the remnants of the old dead box are gone, yes, you can join and promote.
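As a rough sketch of the metadata cleanup being discussed (the ntdsutil procedure described in KB 216498), the helper below prints the sub-commands in order. The function name and the server name `DC02` are placeholders, and the three index arguments are assumptions: the real numbers have to be read off the `list ...` output in your own session.

```python
# Hypothetical helper: lists the ntdsutil sub-commands for removing a dead DC's metadata.
def metadata_cleanup_commands(surviving_dc, domain_index=0, site_index=0, server_index=0):
    """Return, in order, the sub-commands to type after starting ntdsutil.

    The *_index values are placeholders; use the numbers that ntdsutil
    prints next to the domain, site and failed server in its lists.
    """
    return [
        "metadata cleanup",
        "connections",
        f"connect to server {surviving_dc}",   # a working DC, not the dead one
        "quit",
        "select operation target",
        "list domains",
        f"select domain {domain_index}",
        "list sites",
        f"select site {site_index}",
        "list servers in site",
        f"select server {server_index}",       # the failed DC
        "quit",                                # back to the metadata cleanup menu
        "remove selected server",
        "quit",
        "quit",
    ]


if __name__ == "__main__":
    for line in metadata_cleanup_commands("DC02"):
        print(line)
```

Double-check the selected server before `remove selected server`; with only two DCs in the domain, picking the wrong entry would remove the one that is still working.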
Arnold said:
Then go through the process of cleaning the AD up or alternatively rebuild the system and reinstall the OS and rejoin the system into the AD as another DC.
and:
When you join a system into a domain where the same name already exists you will be prompted on whether you wish to reuse the existing record.
Yes.
A cleanup is at times needed when the roles are seized and the prior master server is never recreated.



Does this mean that I still have to do a metadata cleanup even if I'm going to use the name again? This is what I was confused about in the first place. The way I understood it, you only have to seize the roles but not do the cleanup if you are going to replace the server.
SOLUTION
This solution is only available to members.
I did the metadata cleanup and then I was able to successfully join the VM I made with the same name as the dead server to the domain. I ran a dcpromo on it then and it installed AD without a problem. I was never prompted about keeping the existing information about the old server. How do I tell if everything is good to go now?
Good Work!  

Check the event logs, run repadmin /showreps on that box to make sure replication is ok.  You can also use dcdiag to check the health of the DC.

Thanks

Mike
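If you want to script the dcdiag part of that check, one possible sketch is below. It assumes dcdiag's text output marks each result with "passed test" or "failed test" followed by the test name; treat that pattern, and the helper names, as assumptions rather than a documented interface. `check_dc` would have to be run on the DC itself with dcdiag on the path.

```python
import re
import subprocess

# Assumption: dcdiag result lines look like "... <server> failed test <TestName>".
FAILED_TEST = re.compile(r"\bfailed test\s+(\S+)")


def failed_tests(dcdiag_output):
    """Return the names of the tests that dcdiag reported as failed."""
    return FAILED_TEST.findall(dcdiag_output)


def check_dc():
    """Run dcdiag locally and return the list of failed test names."""
    result = subprocess.run(["dcdiag"], capture_output=True, text=True)
    return failed_tests(result.stdout)
```

An empty list from `check_dc()` only means no test line matched the failure pattern; it is still worth reading the event logs and the `repadmin /showreps` output by eye.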
The prompt occurs when you add a system that already has an account. Did you delete the account for the old system from the AD during your cleanup? If so, you would not have been prompted since the computer account was deleted.
I didn't delete the account. The only two things I did were seizing the FSMO roles to the other DC and then the metadata cleanup. I then built the VM, gave it the same name as the dead one, joined it to the domain and then promoted it. I ran the replication and DC diagnostics, and they came back clean, so I think it's good. I was just curious as to why that might have happened.
My mistake. This is the same question. I did not see the replies at first.