Solved

Multi-site AD Replication Issues

Posted on 2013-01-28
Last Modified: 2013-02-03
Background.

The DC holding all FSMO roles failed, and the roles were seized to a new server.  Metadata cleanup was performed, with the exception of one entry.  The server object would not delete from AD Sites and Services (ADSS), failing with "Access denied" and "Insufficient privileges to delete site".  The user account was an Enterprise Admin, and the object would not delete with either ADSS or ADSIEdit.

Due to this condition, I renamed the new server to the original name of the failed DC.  I transferred the PDC role successfully.  The other four roles were on another server at the same site.

There are a total of nine sites including the main hub.  The main hub hosts the FSMO roles and the primary DNS zones.  All the other 8 sites are connected via VPN tunnels.

There are two remaining remote sites, which were never able to replicate from the main hub.  They still "think" the old servers are holding all the FSMO roles.  Therefore, they are failing the dcdiag KnowsOfRoleHolders test.  The attributes involved, fSMORoleOwner, cannot be modified on the bad DCs via ADSIEdit or LDP.
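
One way to make the disagreement concrete is to tabulate which server each DC believes holds each role and diff that against a healthy DC. The following is only a sketch in Python: the DC names, role labels, and owner values are made-up sample data, and `stale_role_views` is a hypothetical helper, not part of any Microsoft tool. It assumes the role owners have already been collected from each DC by some other means (for example, read off each DC's dcdiag output).

```python
from typing import Dict, List

# Hypothetical sample data: the owner each DC reports for each FSMO role.
views: Dict[str, Dict[str, str]] = {
    "hub-dc": {"PDC": "newdc", "RID": "dc2", "Schema": "dc2"},
    "spoke-dc1": {"PDC": "newdc", "RID": "dc2", "Schema": "dc2"},
    "spoke-dc2": {"PDC": "olddc", "RID": "olddc", "Schema": "dc2"},
}


def stale_role_views(
    views: Dict[str, Dict[str, str]], reference_dc: str
) -> Dict[str, List[str]]:
    """Return, per DC, the roles whose owner differs from the reference DC's view."""
    reference = views[reference_dc]
    stale: Dict[str, List[str]] = {}
    for dc, roles in views.items():
        if dc == reference_dc:
            continue
        # A role counts as stale if the DC reports a different owner or none at all.
        differing = sorted(
            role for role, owner in reference.items() if roles.get(role) != owner
        )
        if differing:
            stale[dc] = differing
    return stale


print(stale_role_views(views, "hub-dc"))
# {'spoke-dc2': ['PDC', 'RID']}
```

A DC that shows up in the result is one that has not yet received the role changes through replication, which matches the two stubborn sites described here.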

These servers are also not replicating the primary AD-integrated DNS zones.  They had old zone data, so I deleted the zones and rebuilt the primary DNS zone from scratch.

After that, the majority of the domain controllers replicated perfectly.  I have not figured out how to fix the two stubborn ones.


I'm pretty sure this issue is being caused by multiple factors, based on dcdiag, netdiag, and many other tools. I need to get the DNS zone to replicate to the two bad DCs.

I also believe there are KCC inconsistencies (based on event log errors), and I am seeing errors such as "kerberos client received a KRB_AP_ERR_MODIFIED error".

When I force replication from ADSS or with repadmin /replicate, I receive the error "Naming context is in the process of being removed..."

How do I get the Kerberos ticket issues resolved?  I've tried some basic things with klist, but need further guidance.  I've also reset machine accounts with netdom with some degree of success.

I'm positive this is largely DNS.  So, how do I get DNS on these servers running as a primary zone, using the current AD-integrated zone?  There could even be other copies of the same zone stored in AD; how do I purge these?

I'm also receiving some "The default SPN registration ... is missing" warnings in netdiag.

I want to resolve these issues without demoting/re-promoting.  These servers will not demote cleanly, and there has to be a better way than going that route. The 60-day tombstone lifetime has not yet been exceeded.

Just some more info to save you guys some time: the dcdiag runs on the role holder DCs come back clean.  It's just these two that are being a pain (both 2003).  The two bad DCs give me lots of KCC errors and DNS issues, but it's mostly because they have not replicated.

How do I get these servers to pull the replication data from the main hub???  I even gave the bad servers secondary copies of the primary zone, but they still will not replicate.

One last note: there are some SPN errors with netdiag, and there is the possibility of duplicate or missing SPN records in DNS.  I've dabbled with setspn to list SPNs and to attempt to add the ones reported missing, but could use further guidance here as well if necessary.
Question by:DynamicQuest
12 Comments
 
LVL 6

Expert Comment

by:FdpxAP-GJL
ID: 38829972
Have you tried setting the broken DCs to use the good master as their DNS server?

See if that gets them to recognize the new settings.
 

Author Comment

by:DynamicQuest
ID: 38829976
Yes, I just now changed the secondary zone to primary, and AD integrated.  They are both pointing to the master.
 
LVL 6

Expert Comment

by:FdpxAP-GJL
ID: 38829980
I mean the actual DNS setting on the network cards of the servers: it should be the IP address of the good DNS server rather than 127.0.0.1.
 

Author Comment

by:DynamicQuest
ID: 38829983
No, not set to loopback address.
 
LVL 6

Expert Comment

by:FdpxAP-GJL
ID: 38830001
But are they set so that the faulty machines are referencing a good DNS server?
 

Author Comment

by:DynamicQuest
ID: 38830016
Correct.  I actually backed up a copy of the primary zone, and deleted it.  I started fresh, and did ipconfig /registerdns on all the remote sites to generate the necessary records.

So as far as I can tell, the DNS server is good.  I ran dnslint and got good results.

I'm doing some more dcdiags now, let me know if you can think of any command output that can help.

By the way, the replication error I'm receiving is "The naming context is in the process of being removed..."  which can mean a lot of things.

I appreciate the quick responses.
 

Expert Comment

by:JUP_IT
ID: 38830091
Well, assuming you have good DNS data in the remote sites, I would try running the following commands on a DC in the 'bad' spoke site.

Firstly run...
'repadmin /KCC' which should force the domain controller to review the replication topology.

Secondly run...
repadmin /replicate <badspoke-DC-FQDN> <goodhub-DC-FQDN> CN=Configuration,DC=<YOURDOMAIN>,DC=<YOURTLD>

As an example
repadmin /replicate badDC1.fred.local goodDC1.fred.local CN=Configuration,DC=fred,DC=local
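
The naming context argument is just the domain's DNS name rewritten as DC= components under CN=Configuration. Here is a minimal Python sketch of that mapping. The helper names `configuration_nc` and `replicate_command` are hypothetical, and the sketch assumes a single-domain forest, since the Configuration partition is rooted at the forest root domain. It only builds the command string and does not run anything.

```python
def configuration_nc(dns_domain: str) -> str:
    """Turn a DNS domain name such as 'fred.local' into its Configuration NC DN."""
    labels = [label for label in dns_domain.strip(".").split(".") if label]
    if not labels:
        raise ValueError("empty domain name")
    return "CN=Configuration," + ",".join(f"DC={label}" for label in labels)


def replicate_command(dest_dc: str, source_dc: str, dns_domain: str) -> str:
    """Build the repadmin /replicate line: destination DC first, then source DC."""
    return f"repadmin /replicate {dest_dc} {source_dc} {configuration_nc(dns_domain)}"


print(replicate_command("badDC1.fred.local", "goodDC1.fred.local", "fred.local"))
# repadmin /replicate badDC1.fred.local goodDC1.fred.local CN=Configuration,DC=fred,DC=local
```

Note the argument order: the DC that should pull the changes (the bad spoke) comes first, and the DC it pulls from (the good hub) comes second.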

It may also be worth reading this...
http://support.microsoft.com/kb/308111
 

Accepted Solution

by:DynamicQuest
ID: 38831763
Resolved:

For the DNS issue, I created a secondary zone copy on the remote servers, and they transferred the zone from the master.  I then changed them to AD-integrated primary zones.

I was then able to track down the KRB_AP_ERR_MODIFIED errors in the event logs and reset the machine accounts with netdom resetpwd, after which replication worked.

Thanks for everyone's input!
 

Expert Comment

by:JUP_IT
ID: 38831778
You reset the failing DC computer account password?
 

Author Comment

by:DynamicQuest
ID: 38831781
Closing.
 

Author Comment

by:DynamicQuest
ID: 38831794
Yes - using these commands

net stop kdc

netdom resetpwd /server:brokendc /userd:...

net start kdc

This resolved the "The kerberos client received a KRB_AP_ERR_MODIFIED error" events I was getting in the event log after fixing DNS.
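
For anyone scripting this, the part that matters is the order: the KDC service is stopped before the machine account password is reset and started again afterwards. Below is a small Python sketch that only assembles the three command lines in that order and executes nothing. The function name and both arguments are hypothetical placeholders, because the /userd value is elided above, and `/passwordd:*` (netdom's prompt-for-password form) is an addition that does not appear in the commands posted here.

```python
from typing import List


def reset_dc_password_commands(server: str, domain_user: str) -> List[str]:
    """Return the command lines, in order, for the reset sequence described above.

    `server` and `domain_user` are placeholders supplied by the caller;
    this function only builds strings and runs nothing.
    """
    return [
        "net stop kdc",
        # /passwordd:* makes netdom prompt for the password instead of taking it inline.
        f"netdom resetpwd /server:{server} /userd:{domain_user} /passwordd:*",
        "net start kdc",
    ]


for line in reset_dc_password_commands("SERVERNAME", "DOMAIN\\adminuser"):
    print(line)
```

Check netdom's own help output for which DC /server should name in your environment before running anything like this.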
 

Author Closing Comment

by:DynamicQuest
ID: 38848282
I was able to resolve this issue, and want to share my fix with everyone else.
