  • Status: Solved
  • Priority: Medium
  • Security: Public

Cannot find Schema Master

I had a mixed-mode domain set up successfully with 2 NT4 BDCs and 2 W2K3 AD servers. The first 2K3 server I set up had all 5 FSMO roles as well as DNS and WINS; on the second 2K3 server I set up DNS and transferred the Infrastructure role to it. Both are set up as GCs. I will also be setting up an Exchange 2003 server. On Friday I ran all the utilities (DCDIAG, NETDIAG, domainprep, forestprep) and everything ran and passed successfully.

Over the weekend we lost power, and when I arrived this morning my 2K3 servers were at the login prompt. My NT servers were on a UPS and never shut down. For some reason, my first 2K3 server can no longer bind to itself, DCDIAG fails with "not responding to directory service requests", and I am getting NTDS replication errors in Event Viewer. I don't know what to do. What happened during the power failure to mess up my configuration? Please help.
Asked by: juziuu
 
Rant32 Commented:
- Can you provide us with some details about the NTDS replication events and errors on the DC? Please post all errors and warnings; there shouldn't be any.
- Is the Netlogon service started on the problem DC?
- Do you have a recent backup of the System State of the server? You might have to revert to that.

juziuu (Author) Commented:
The following is a copy of 2 different NTDS errors that appear in the event viewer.
Netlogon is started on problem DC
I do not have any backup in place as I just started setting up the 2003 servers.

On the 1st DC, it shows that the operations master is online, but from the 2nd DC, it shows ERROR listed where the server name should be.

(first error message)

This is the replication status for the following directory partition on the local domain
controller.
 
Directory partition:
DC=ForestDnsZones,DC=curtis-young,DC=com
 
The local domain controller has not received replication information from a number of
domain controllers within the configured latency interval.
 
Latency Interval (Hours):
24
Number of domain controllers in all sites:
1
Number of domain controllers in this site:
1

(second error message)

This server is the owner of the following FSMO role, but does not consider it valid. For
the partition which contains the FSMO, this server has not replicated successfully with any
of its partners since this server has been restarted. Replication errors are preventing
validation of this role.
 
Operations which require contacting a FSMO operation master will fail until this condition
is corrected.
 
FSMO Role: CN=Partitions,CN=Configuration,DC=curtis-young,DC=com
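
For reference, the counters in the first event can be pulled out mechanically. This is a minimal Python sketch, assuming the event description has been copied as plain text exactly as posted above. `parse_counts` and `stale_partner_count` are hypothetical helper names, not part of any Microsoft tool. The sketch reads the "Number of domain controllers" values the way the event text words them, as the number of DCs this server has not received replication from within the latency interval.

```python
import re

# Tail of the first event description, copied from the post above.
EVENT_TEXT = """\
Latency Interval (Hours):
24
Number of domain controllers in all sites:
1
Number of domain controllers in this site:
1
"""

def parse_counts(text):
    """Map each 'Label:' line to the integer found on the following line."""
    pairs = re.findall(r"^(.+):\s*\n\s*(\d+)\s*$", text, flags=re.MULTILINE)
    return {label.strip(): int(value) for label, value in pairs}

def stale_partner_count(text):
    """Number of DCs (all sites) not heard from within the latency interval."""
    return parse_counts(text).get("Number of domain controllers in all sites", 0)

print(parse_counts(EVENT_TEXT))
print("DCs not replicated from:", stale_partner_count(EVENT_TEXT))
```

Any non-zero count here means at least one replication partner has been silent for longer than the latency interval.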

Rant32 Commented:
I assume you mean events 2092 (http://www.eventid.net/display.asp?eventid=2092&eventno=5836&source=NTDS Replication&phase=1) and 1864 (http://www.eventid.net/display.asp?eventid=1864&eventno=4849&source=NTDS%20Replication&phase=1).

Make sure that both servers are pointing to the same DNS server, for now. The MS recommended configuration is that all DCs that run AD-integrated DNS zones use themselves as primary DNS server, but if you have problems with replication that won't work.

Check the time on both servers. Time synchronization (including the correct time zone) is critical in AD operation and synchronization. The W32Time service should be running at all times on all domain computers, especially on domain controllers. If the time difference is more than 4 minutes, correct the time so the difference is less than 1 minute.
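
The time check above can be sketched in a few lines of Python. This is only an illustration: the two timestamps are hypothetical readings taken at the same moment on each server, and `skew_status` is a made-up helper name. The 4-minute and 1-minute thresholds are the ones from the paragraph above. Timezone-aware values are used so that a wrong time zone shows up as skew too.

```python
from datetime import datetime, timedelta, timezone

# Thresholds from the advice above: act above 4 minutes, aim for under 1 minute.
MAX_SKEW = timedelta(minutes=4)
TARGET_SKEW = timedelta(minutes=1)

def clock_skew(time_a, time_b):
    """Absolute difference between two timezone-aware timestamps."""
    return abs(time_a - time_b)

def skew_status(time_a, time_b):
    """Classify the skew between two servers against the thresholds."""
    skew = clock_skew(time_a, time_b)
    if skew > MAX_SKEW:
        return "correct the time"
    if skew < TARGET_SKEW:
        return "ok"
    return "within tolerance, but above target"

# Hypothetical readings from DC1 and DC2 (not taken from the thread).
dc1 = datetime(2006, 1, 1, 9, 0, 0, tzinfo=timezone.utc)
dc2 = datetime(2006, 1, 1, 9, 6, 30, tzinfo=timezone.utc)
print(clock_skew(dc1, dc2), "->", skew_status(dc1, dc2))
```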

If you needed to make any changes, let the servers settle (30 minutes) and see if problems go away.

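If not, verify the Operations Masters on BOTH domain controllers. In AD Users & Computers, right-click the domain and choose Operations Masters (this shows RID, PDC and Infrastructure); the Domain Naming master is shown in AD Domains and Trusts, and the Schema master in the AD Schema snap-in. There should be five FSMO role holders (Schema, Infrastructure, RID, PDC, Domain Naming) and they should be consistent. If they aren't, don't change anything yet, just note any discrepancies and report them here, please.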

For clarity's sake, also list on which server you want the FSMO roles to reside and which one was your first DC (i.e. RID=SERVER1, Schema=SERVER1, PDC=SERVER2, etc).
 
juziuu (Author) Commented:
OK.  Both servers are pointing to DC1 for DNS.  Here are the results after checking the FSMO roles.

DC1(1st server)
RID=DC1, PDC=DC1, Infrastructure=DC1, Domain Naming operations master=DC1, Schema Master (online)=DC1

DC2(2nd Server)
RID=ERROR, PDC=ERROR, Infrastructure=ERROR, Domain Naming operations master=DC1, Can't find schema master
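
The two views listed above can be compared role by role. This is a small Python sketch with the values typed in by hand from this post. `fsmo_discrepancies` is a hypothetical helper, and `None` stands in for the "ERROR" / "can't find" entries.

```python
ROLES = ["Schema", "Domain Naming", "RID", "PDC", "Infrastructure"]

# Role holders as seen from each DC, transcribed from the post above.
# None marks a role the DC could not resolve ("ERROR" / "can't find").
view_from_dc1 = {"Schema": "DC1", "Domain Naming": "DC1", "RID": "DC1",
                 "PDC": "DC1", "Infrastructure": "DC1"}
view_from_dc2 = {"Schema": None, "Domain Naming": "DC1", "RID": None,
                 "PDC": None, "Infrastructure": None}

def fsmo_discrepancies(view_a, view_b):
    """Return roles that either view cannot resolve or that the views disagree on."""
    problems = {}
    for role in ROLES:
        a, b = view_a.get(role), view_b.get(role)
        if a is None or b is None or a.lower() != b.lower():
            problems[role] = (a, b)
    return problems

for role, (a, b) in fsmo_discrepancies(view_from_dc1, view_from_dc2).items():
    print(f"{role}: DC1 sees {a}, DC2 sees {b}")
```

With these inputs only the Domain Naming master is consistent between the two servers.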

Rant32 Commented:
Alright, thanks.

If I understand the 1st post correctly, you've moved (or it should have moved) the Infrastructure role to DC2, right? But that's not the case.

First thing, I'd make a Systemstate backup of DC1. Then, I think you're best off if you demote DC2 to a member server, use NTDSUTIL on DC1 (after the Systemstate backup) to check any leftovers of DC2, and then promote DC2 to domain controller again.

A very complete example of how to remove a failed DC using NTDSUTIL is here:
http://www.petri.co.il/delete_failed_dcs_from_ad.htm

Did I mention to do a Systemstate backup?

Any other experts agree/disagree with that?

juziuu (Author) Commented:
What could have happened during the power failure?

Rant32 Commented:
A power failure by itself doesn't usually cause this kind of trouble. The NTDS database is a self-recovering, transaction-logged database, and I assume you haven't placed it on a FAT partition, even though that's possible. It's unlikely that you have a damaged database; the server won't even start if the NTDS database is corrupt. But power failures can otherwise produce unexpected results.

I guess it depends on the state of AD at the time the error occurred and, if only one of the servers went down, the length of the shutdown. But in normal circumstances, domain controllers don't start failing unless they've been out of service for at least 60 days.
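
To put the 60-day figure in perspective, here is a tiny Python sketch comparing an outage length against it. The dates are hypothetical (a Friday-to-Monday outage, not the actual dates from this thread), `outage_exceeds_limit` is a made-up helper, and treating 60 days as the relevant limit is simply taken from the paragraph above.

```python
from datetime import date

LIMIT_DAYS = 60  # figure quoted in the paragraph above

def outage_exceeds_limit(went_down, came_back, limit_days=LIMIT_DAYS):
    """True if a DC was out of service for at least limit_days."""
    return (came_back - went_down).days >= limit_days

# Hypothetical weekend outage: Friday to Monday.
print(outage_exceeds_limit(date(2006, 3, 3), date(2006, 3, 6)))
```

A weekend outage is nowhere near that limit, which is why the length of the shutdown alone is an unlikely explanation here.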

So, it's hard to tell. Maybe other error events around the time of the unexpected shutdown can shed some light on this.

Btw, also make sure that all your NT4 BDCs still agree that DC1 is the PDC emulator.

juziuu (Author) Commented:
How can I verify that the NT4 BDC's recognize that DC1 is the primary?

Rant32 Commented:
Run the NT4 server manager and look at the list of servers.

juziuu (Author) Commented:
I thought so.  That lists my DC1 as Primary and DC2 as Backup along with 2 of my NT4 as Backups.

Rant32 Commented:
That's fine, then.

What's your thought on re-installing AD on DC2? Does it have any other functions besides domain controller?

juziuu (Author) Commented:
I don't have a problem doing that; it is only a DC and secondary DNS server. However, is that server my problem? When I run NTDSUTIL on DC1 and try to connect to itself or any other DC, I get an error message that states "Binding to NJPDC ... ldap_bind_sW failed with 0x52<82 <Local Error>".
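
For what it's worth, 0x52 is just hexadecimal for the 82 shown next to it, and in the LDAP client headers (winldap.h, as far as I know) that value is named LDAP_LOCAL_ERROR, one of the codes raised by the client library itself rather than returned by a server. A minimal Python sketch of decoding such a line follows; `decode_bind_error` is a hypothetical helper and the table is only a small subset of the client-side codes.

```python
import re

# Small subset of client-side LDAP result codes (values as in winldap.h).
LDAP_CLIENT_CODES = {
    0x51: "LDAP_SERVER_DOWN",
    0x52: "LDAP_LOCAL_ERROR",
    0x55: "LDAP_TIMEOUT",
    0x5B: "LDAP_CONNECT_ERROR",
}

def decode_bind_error(message):
    """Extract the hex code from a '... failed with 0x..' line and name it."""
    match = re.search(r"failed with 0x([0-9a-fA-F]+)", message)
    if match is None:
        return None
    code = int(match.group(1), 16)
    return code, LDAP_CLIENT_CODES.get(code, "unknown")

print(decode_bind_error("ldap_bind_sW failed with 0x52<82 <Local Error>"))
```

The name alone does not say what went wrong locally; it only confirms the failure is reported on the client side of the bind.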

Rant32 Commented:
Aww, I didn't get that.

That's not good, is it? One functional domain controller that's inconsistent, and another DC that's not listening.

Maybe this can help: http://support.microsoft.com/default.aspx?scid=kb;en-us;826902

juziuu (Author) Commented:
Unfortunately that didn't help. From the first DC, I can ping other PCs and DCs by IP address and machine name. I can connect to the file server either way as well, and from my client PC I can connect to the DCs. I am confused as to where exactly my problem is. I am assuming it is on the first server I set up as a DC, because when I tried to install the Exchange 2003 software on a member server, it failed with an error that it cannot find a schema master. This is where the confusion begins, as the first server shows that the schema master is online.

Rant32 Commented:
It is recommended that the Exchange setup /forestprep be run on the Schema master, locally.

http://www.msexchange.org/tutorials/Migrating-Exchange2000-Exchange-2003-Hardware.html

Doesn't really matter now. The point is, that one of your DCs thinks it's the ONLY domain controller in the domain.

Latency Interval (Hours):
24
Number of domain controllers in all sites:
1
Number of domain controllers in this site:
1

juziuu (Author) Commented:
Could I transfer the roles to the DC2 server from DC1 and wait a little and then transfer the roles back to DC1?

Rant32 Commented:
I'm not convinced that moving roles back and forth will bring DC1 back to life.

At this point, I can't recommend you use either DC. Both have failed at some point.

Are there any client computers attached to the domain already, is anyone using the DCs for authentication?

If they are, which DC are they using? (LOGONSERVER environment variable, I assume DC2). If that's the case, demote DC1. Seize operation masters roles on DC2 and run NTDSUTIL metadata cleanup on DC2. Promote DC1 again.
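
The LOGONSERVER check mentioned above can be done from a command prompt on a client, or scripted. Here is a minimal Python sketch, assuming it runs in the logged-on user's session on a Windows client; `logon_server` is a hypothetical helper name.

```python
import os

def logon_server(env=os.environ):
    """Return the DC name from LOGONSERVER without its leading backslashes.

    Returns None when the variable is not set (for example, outside a
    Windows logon session).
    """
    value = env.get("LOGONSERVER", "")
    return value.lstrip("\\") or None

# Result depends on the session this runs in.
print(logon_server())

# Hypothetical value, as a client authenticated by DC2 might report it.
print(logon_server({"LOGONSERVER": r"\\DC2"}))
```

Running this on a handful of clients gives a rough picture of which DC is actually handling logons.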

If your AD environment is not in production yet (as you said, you have just finished installing it), then I really recommend removing all domain controller functions from BOTH DCs and starting all over.

Making backups should be a part of the installation procedure, even if the domain is not a production environment yet. Especially make backups before running Exchange setup forest/domainprep, creating or modifying a large number of objects, etc. Waiting 5 minutes for a backup to complete is nothing compared to this kind of hassle.

Hope this helps.

juziuu (Author) Commented:
How can I determine which server provided authentication?  Both DC's have been in production for close to 2 weeks, so I rather not bring them both down and start over.  Is active directory that sensitive to get corrupted because of power failure?  Is this going to happen again if both DC's lose power at the same time?  These are brand new IBM eServers with 250GB HD, 1GB ram, and 3GHz processor.  I don't believe it is a hardware issue.

juziuu (Author) Commented:
Rant32, I appreciate all the help so far. I was able to get DC2 to take over the FSMO roles from DC1 using the MMC snap-in. This was after running DCDIAG /FIX and NETDIAG /FIX on DC1 and making DC2 the only DNS server. Now I want to use the NTDSUTIL tools, but every time I try to connect to server DC2, I receive the error:

Binding to NJBDC ...
ldap_bind_sW failed with 0x52<82 <Local Error>

Any ideas?  I also notice an error when running DCDIAG /TEST:DNS on DC2, it fails the Basic test with:

Error:  No DS RPC connectivity.

Rant32 Commented:
Hi, back again.

Have you restarted DC2 after seizing FSMO roles? By the way, do these domain controllers have any function or application other than DC/DNS/DHCP?

There isn't much to be found on the ldap_bind_sW error with ntdsutil, it's not common. One article I found points back to another EE topic:
http://www.experts-exchange.com/Operating_Systems/Windows_Server_2003/Q_21750971.html
but the solution there was to re-install AD.

<< Is active directory that sensitive to get corrupted because of power failure? >>
No, the database itself is not. But just pulling the power out of any machine can produce unpredictable results; that's not just a problem with AD.

<< Both DC's have been in production for close to 2 weeks >>
I don't want to appear confrontational, but may I remind you of your second posting: "I do not have any backup in place as I just started setting up the 2003 servers"

This appears to be mutually exclusive. Which one is it?

<< I rather not bring them both down and start over >>
You don't have to start over, at least not completely. You still have NT4 domain controllers that have received all changes you've made thus far.
One possible option is to install a fresh NT4 BDC, remove Active Directory from both 2003 Servers, promote the new BDC to PDC and run the Windows 2003 upgrade again.

You will lose the AD design and OUs, and any GPOs you've created, but no domain contents.

I understand you're not looking forward to installing AD again, but the problem is that even if you manage to fix these issues, there is no telling whether either of your domain controllers will run into problems sooner or later. If things like FSMO roles get messed up, that could also mean object inconsistencies, replication problems or maybe even problems with the Exchange organization later on.

If restarting DC2 doesn't resolve the issue, and the DCs are not running other functions, then re-upgrading seems to be the fastest, and certainly the most secure way out of this.

juziuu (Author) Commented:
<< Both DC's have been in production for close to 2 weeks >>
I don't want to appear confrontational, but may I remind you of your second posting: "I do not have any backup in place as I just started setting up the 2003 servers"
This appears to be mutually exclusive. Which one is it?
<< Even though they have been up for 2 weeks, I have been tweaking them on different days because I need to work on other problems in the company. I was waiting until I had things set up properly (DHCP, WINS, etc.) before switching the tape device. >>

<< I rather not bring them both down and start over >>
You don't have to start over, at least not completely. You still have NT4 domain controllers that have received all changes you've made thus far.
One possible option is to install a fresh NT4 BDC, remove Active Directory from both 2003 Servers, promote the new BDC to PDC and run the Windows 2003 upgrade again.
<< I know I still have the NT4 BDCs. I just didn't want to go through the headache of trying to install NT4 on the newer hardware, because it does not detect the floppy, NIC or larger HD. If this is my only option then so be it. >>

Rant32 Commented:
I didn't mention a complete re-install of the servers ;-)

By far the easiest way to do this (and I've performed 3 domain upgrades this way) is VMware. The box is completely hardware-independent and all virtual hardware is supported by default by NT4 and Windows Server 2003. No issues upgrading whatsoever.

You can download a 30-day trial version here:
http://www.vmware.com/download/ws/eval.html