cendrizzi

Domain controller is only replicating one way.

I have a remote DC that is not syncing AD both ways.  Any changes made to the directory at my main office get synchronized correctly to the remote DC.  Any changes made at the remote office never reach the main office.

In the Active Directory Sites and Services management tool there is a connection object for this replication, but when I try to replicate manually it shows the following error:

<start of dialog>
The following error occurred during the attempt to synchronize naming context "XXXXX.local" from the domain controller LINDON-DC:
The naming context is in the process of being removed or is not replicated from the specified server.

This operation will not continue.
<end of dialog>

I have no idea what this means or how long it has not worked (we often just make changes from the main office).  I tried looking for any kind of a solution and couldn't find a thing.

Thanks a lot in advance.
LauraEHunterMVP

You should run the following diagnostic tools on both DCs:

netdiag /v
dcdiag /v
repadmin /replsum

Depending on how long the "problem" DC has not been replicating, you may need to forcibly demote it and clean it out of AD before re-promoting it.  (Unless you installed a pristine AD using 2003 R2 media, the tombstone lifetime is 60 days; if you've built a brand-new R2 server the lifetime is 180 days.)

Generally speaking, 99% of replication issues can be traced back to one or both of the following:

[1] Physical connectivity - is there a firewall between the DCs that is interfering with their communications?

[2] Name resolution - can each DC successfully resolve the other's FQDN?
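The two checks above can be roughed out in a short script.  This is only a sketch under stated assumptions, not a substitute for netdiag or dcdiag: the function names are hypothetical, and the port list (DNS 53, Kerberos 88, RPC endpoint mapper 135, LDAP 389, SMB 445) covers only well-known fixed TCP ports.  Replication also uses dynamically assigned RPC ports, which this does not probe.

```python
import socket

# Well-known TCP ports a DC normally listens on (assumed list, not exhaustive).
DC_PORTS = {
    "DNS": 53,
    "Kerberos": 88,
    "RPC endpoint mapper": 135,
    "LDAP": 389,
    "SMB": 445,
}


def can_resolve(fqdn):
    """Return the sorted IP addresses the name resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(fqdn, None)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_dc(fqdn, ports=DC_PORTS):
    """Resolve fqdn, then probe each port; ports are skipped if resolution fails."""
    addresses = can_resolve(fqdn)
    report = {"addresses": addresses, "ports": {}}
    if addresses:
        for name, port in ports.items():
            report["ports"][name] = port_open(fqdn, port)
    return report
```

Run it from each DC against the other's FQDN, for example check_dc("lindon-dc.example.local") (a made-up name).  An empty address list points at name resolution; resolved addresses with closed ports point at a firewall or routing problem.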

Hope this helps.

Laura E. Hunter - Microsoft MVP: Windows Server - Networking
cendrizzi

ASKER

Yes that is helpful.  With it being able to replicate one way would the tombstone lifetime still expire?

It is definitely 60 days.  I have actually been using some of those tools and they haven't told me much.  I will run them the way you suggest.

Thanks.
Yes, the tombstone lifetime will still cause an issue if you haven't been able to replicate in one direction.  If your problem child DC has not replicated in over 60 days, you need to remove that DC from your environment.  Leaving a non-replicating DC online past this tombstone lifetime has also placed your environment at risk for USN rollback; use the following KB article to determine whether your AD environment has experienced USN rollback or not: http://support.microsoft.com/kb/875495
Ok, running dcdiag revealed that a tombstone lifetime error has occurred.  This is strange since connectivity has not been an issue.  I will check on the USN Rollback.  This all does not look good.
"Tombstone lifetime error" usually == "USN rollback" in my experience.  Follow the steps in the KB article to determine whether USN rollback has actually occurred, as well as to take the steps necessary to recover from it.
Wow oh wow.  I know I've been busy with software development and I haven't been checking things like I should, but I am pretty sure my entire AD environment is screwed up.  The following is the output of repadmin /showutdvec for each DC.

C:\Program Files\Support Tools>repadmin /showutdvec win2003dc dc=xxxxxxx,dc=local
Caching GUIDs.
..
SITE2\DC-SITE2                       @ USN    470467 @ Time 2007-04-02 12:49:34
79604ae5-9a50-4605-a1bb-4193dd260af9 @ USN     66440 @ Time 2006-05-14 22:30:52
SITE4\DC-SITE4                       @ USN  16374953 @ Time 2007-04-02 13:40:24
SITE1\PM-SERVER                      @ USN   1638391 @ Time 2007-04-02 13:52:03
SITE1\WIN2003DC                      @ USN    262834 @ Time 2007-04-02 13:55:21
SITE3\SITE3-DC                       @ USN     84715 @ Time 2005-11-14 08:59:33

C:\Program Files\Support Tools>repadmin /showutdvec pm-server dc=xxxxxxx,dc=local
Caching GUIDs.
..
SITE2\DC-SITE2                       @ USN    470467 @ Time 2007-04-02 12:49:34
79604ae5-9a50-4605-a1bb-4193dd260af9 @ USN     66440 @ Time 2006-05-14 22:30:52
SITE4\DC-SITE4                       @ USN  16374953 @ Time 2007-04-02 13:40:24
SITE1\PM-SERVER                      @ USN   1638609 @ Time 2007-04-02 14:27:06
SITE1\WIN2003DC                      @ USN    263006 @ Time 2007-04-02 14:26:57
SITE3\SITE3-DC                       @ USN     84715 @ Time 2005-11-14 08:59:33

C:\Program Files\Support Tools>repadmin /showutdvec dc-SITE4 dc=xxxxxxx,dc=local
Caching GUIDs.
..
SITE2\DC-SITE2                       @ USN    470467 @ Time 2007-04-02 12:49:34
79604ae5-9a50-4605-a1bb-4193dd260af9 @ USN     66440 @ Time 2006-05-14 22:30:52
SITE4\DC-SITE4                       @ USN  16374958 @ Time 2007-04-02 14:28:02
SITE1\PM-SERVER                      @ USN   1638053 @ Time 2007-04-02 12:59:12
SITE1\WIN2003DC                      @ USN    262481 @ Time 2007-04-02 12:59:26
SITE3\SITE3-DC                       @ USN     84715 @ Time 2005-11-14 08:59:33

C:\Program Files\Support Tools>repadmin /showutdvec dc-SITE2 dc=xxxxxxx,dc=local
Caching GUIDs.
..
SITE2\DC-SITE2                       @ USN    470555 @ Time 2007-04-02 14:28:22
79604ae5-9a50-4605-a1bb-4193dd260af9 @ USN     66440 @ Time 2006-05-14 22:30:52
SITE4\DC-SITE4                       @ USN  16374921 @ Time 2007-04-02 12:46:03
SITE1\PM-SERVER                      @ USN   1638024 @ Time 2007-04-02 12:54:42
SITE1\WIN2003DC                      @ USN    262458 @ Time 2007-04-02 12:54:41
SITE3\SITE3-DC                       @ USN     84715 @ Time 2005-11-14 08:59:33

C:\Program Files\Support Tools>repadmin /showutdvec SITE3-dc dc=xxxxxxx,dc=local
Caching GUIDs.
..
SITE2\DC-SITE2                       @ USN    470467 @ Time 2007-04-02 12:49:34
79604ae5-9a50-4605-a1bb-4193dd260af9 @ USN     66440 @ Time 2006-05-14 22:30:52
SITE4\DC-SITE4                       @ USN  16374921 @ Time 2007-04-02 12:46:03
SITE1\PM-SERVER                      @ USN   1638035 @ Time 2007-04-02 12:56:25
SITE1\WIN2003DC                      @ USN    262472 @ Time 2007-04-02 12:57:49
SITE3\SITE3-DC                       @ USN   1087310 @ Time 2007-04-02 14:28:31

Every DC shows a different number in some way, so I'm guessing that every DC needs to be redone.  Am I reading this correctly?
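One way to read output like the above is to look at the Time column rather than at small USN differences.  The sketch below parses a /showutdvec listing and flags any source whose timestamp trails the newest entry by more than the tombstone lifetime.  The function names are hypothetical, the 60-day default is the lifetime quoted earlier in this thread, and treating Time as "when changes from that source were last seen" is my reading of the output, not something taken from repadmin documentation.

```python
import re
from datetime import datetime, timedelta

# Matches lines such as:
#   SITE3\SITE3-DC    @ USN     84715 @ Time 2005-11-14 08:59:33
LINE_RE = re.compile(
    r"^\s*(?P<source>\S+)\s+@ USN\s+(?P<usn>\d+)\s+"
    r"@ Time\s+(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s*$"
)


def parse_utdvec(text):
    """Return {source: (usn, timestamp)} for every vector entry in the text."""
    entries = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            stamp = datetime.strptime(match["time"], "%Y-%m-%d %H:%M:%S")
            entries[match["source"]] = (int(match["usn"]), stamp)
    return entries


def stale_sources(entries, tombstone_days=60):
    """List sources whose timestamp trails the newest entry by more than the limit."""
    if not entries:
        return []
    newest = max(stamp for _, stamp in entries.values())
    limit = timedelta(days=tombstone_days)
    return sorted(
        source for source, (_, stamp) in entries.items() if newest - stamp > limit
    )


# The win2003dc listing from this post.
sample = r"""
SITE2\DC-SITE2                       @ USN    470467 @ Time 2007-04-02 12:49:34
79604ae5-9a50-4605-a1bb-4193dd260af9 @ USN     66440 @ Time 2006-05-14 22:30:52
SITE4\DC-SITE4                       @ USN  16374953 @ Time 2007-04-02 13:40:24
SITE1\PM-SERVER                      @ USN   1638391 @ Time 2007-04-02 13:52:03
SITE1\WIN2003DC                      @ USN    262834 @ Time 2007-04-02 13:55:21
SITE3\SITE3-DC                       @ USN     84715 @ Time 2005-11-14 08:59:33
"""

print(stale_sources(parse_utdvec(sample)))
```

For the win2003dc listing this flags the bare GUID entry and SITE3\SITE3-DC; the other four sources were all seen the same afternoon.  A flagged entry is only a prompt to look closer, not proof that a DC must be rebuilt.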

I have a basic question.  All the DCs have to work through the main DC, win2003dc.  I can just have them syncing to it, right?  The routers aren't set up to let them talk to each other, so this is the only way this can be done.  I'm just trying to figure out where things went so wrong!

Thanks
Also, I have no idea what the 79604ae5-9a50-4605-a1bb-4193dd260af9 machine is.  That looks like the GUID for a machine or something.  How do I get rid of that?
So in my case I just need to demote all the other servers besides my main one (win2003dc) to be a basic stand alone server.  Then I need to re-promote them to be a DC and it should replicate the latest info.

Can I do this while they are each on different subnets?
ASKER CERTIFIED SOLUTION
LauraEHunterMVP
This solution is only available to members.
That's a great help.  

I do have the subnets set up properly, and it sounds like most stuff is working right in light of what you've said.

I tried to remove the metadata for that strange 79604ae5-9a50-4605-a1bb-4193dd260af9 server

It wouldn't let me.  Do I get rid of that server in the same way?

That server is very strange because it doesn't appear to have a site associated with it...
Are netdiag/dcdiag/repadmin returning successfully now?  

Where are you still seeing the GUID that you're referencing?
This server as shown above when running repadmin /showutdvec:
79604ae5-9a50-4605-a1bb-4193dd260af9 @ USN     66440 @ Time 2006-05-14 22:30:52

I have no idea what that is referencing.  I was trying to get rid of this first before I demoted the troubled server.  It appears to me to be some strange server, showing only by its GUID, that is in AD somehow.

Am I not reading that correctly?  It appears to me that I have two different problems from that output:
1)  This -> 79604ae5-9a50-4605-a1bb-4193dd260af9 @ USN     66440 @ Time 2006-05-14 22:30:52
2)  SITE3\SITE3-DC                       @ USN     84715 @ Time 2005-11-14 08:59:33

The second (SITE3-DC) is actually the server that is not replicating correctly and why I started this question.

So for problem one do I just remove the metadata in the same fashion?  Because that is what I tried and had no luck...
Actually, it's probably not fair for me to lump that other server into this question, as it is not a part of this.  If you have ideas for removing it then let me know, but I'll try to concentrate on SITE3-DC (which is actually LINDON-DC; I changed the name).
To remove that other domain controller I will probably have to wait until tomorrow morning when I can get someone at the office.  I'm sure that there are computer/user accounts that will be lost once I demote it and resync with my main DC.  I know there is a manual way to create computer accounts but I'm not sure how, and having someone there will ensure we can get the new people that have been set up on that network working ASAP.
If the GUID is being described as a "retired invocation" then you don't need to worry about it, it's just showing an old GUID that used to be involved in replication that isn't anymore.

If I were you I would concentrate on first resolving the issue of the non-replicating DC, which it sounds like you're well on your way to doing.  Once that's working properly, let your AD run for a week or three and then re-run the various utilities to see if that GUID is actually causing issues or is just cosmetic.  (The first rule of troubleshooting AD replication, after all, is "Fix one problem at a time and then -stop touching it for a while-!"  :-))

I would certainly re-iterate my recommendation that you spend a few dollars on something like MOM Express to do AD monitoring on an ongoing basis, so that you don't have to spend a week of your life fixing something else 3 months from now because you didn't notice that it wasn't quite right.
What a night!

I just learned a valuable lesson about paying attention to server names.  I accidentally removed the wrong DC from my network!  So I had to set that one up again first, then finally get to my problematic server.  I did everything you said and everything seems fine.

I'm wondering about that "repadmin /showutdvec" output, however, since when I use it I see two other GUIDs like before.  I'm sure these are the servers I removed, so it appears to me that AD keeps a backlog of old retired machines as well (which would explain that other one, the 79604ae5-9a50-4605-a1bb-4193dd260af9 from two posts ago, because I did have to remove an AD server before).  So actually I guess I'm not wondering, since that makes sense.  I will close this tomorrow if all goes well.  Thanks a lot LauraEHunter, your help has been invaluable.
I should mention that removing the wrong server seems even dumber since I didn't have to force the removal.  I thought I would try doing it the normal way just to see if it would work and it DID!  Of course that is only because it was the wrong server!  When I got to the real problem server it didn't let me because of replication (as expected).

Dang servers with similar names...
Man, I hate it when that happens.  :-)  Sounds like all is going well though, which is good!
Ok, all working great.  Thanks again!