Avatar of bajicd
bajicd

asked on

W2k3 NTDS Replication error - event id 1864

I have one site with three domains: parent.net (two DCs: SRV01.parent.net and SRV02.parent.net), child1.parent.net (one DC: SRV05.child1.parent.net) and child2.parent.net (one DC: SRV06.child2.parent.net). DNS is AD-integrated and all DCs use the same DNS servers. For licensing reasons I had to reinstall both child DCs. This is how I did it: promoted an additional temporary DC in child1, demoted SRV05, reinstalled SRV05 with the same name, promoted SRV05, demoted the temporary DC. Everything went fine without any errors. I did the same thing in child2 except for the last step, so now I have two DCs in child2: the reinstalled SRV06 and the temporary FLSRV02 (I have not demoted it yet). Since then, once a day I get 5 replication errors in the event logs on SRV01 and SRV02, for example:

Event Type:      Error
Event Source:      NTDS Replication
Event Category:      Replication
Event ID:      1864
Date:            8/22/2006
Time:            8:10:31 PM
User:            NT AUTHORITY\ANONYMOUS LOGON
Computer:      SRV02
Description:
This is the replication status for the following directory partition on the local domain controller.
 Directory partition:
CN=Configuration,DC=parent,DC=net
 The local domain controller has not recently received replication information from a number of domain controllers.   The count of domain controllers is shown, divided into the following intervals.
 More than 24 hours:
2
More than a week:
2
More than one month:
0
...

I did some checking and I found that on SRV01 and SRV02 there are wrong (old?) GUIDs for child DCs:

SRV01 and SRV02:

ldap_search_s(ld, "CN=Configuration,DC=parent,DC=net", 2, "(cn=NTDS Settings)", attrList,  0, &msg)
Result <0>: (null)
Matched DNs:
Getting 4 entries:
>> Dn: CN=NTDS Settings,CN=SRV02,CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=parent,DC=net
      1> objectGUID: bedba399-1d79-4a74-b42c-79c89ab2b874;
>> Dn: CN=NTDS Settings,CN=SRV05,CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=parent,DC=net
      1> objectGUID: d33014f7-53c7-4024-b2a3-5e7907c5d6e3;
>> Dn: CN=NTDS Settings,CN=SRV06,CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=parent,DC=net
      1> objectGUID: c35dbfec-a148-419a-8ba7-2aefce7511e3;
>> Dn: CN=NTDS Settings,CN=SRV01,CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=parent,DC=net
      1> objectGUID: 85d55fef-d81d-42c6-930c-d235e9efddc5;

all child DCs:

>> Dn: CN=NTDS Settings,CN=SRV01,CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=parent,DC=net
      1> objectGUID: 85d55fef-d81d-42c6-930c-d235e9efddc5;
>> Dn: CN=NTDS Settings,CN=SRV02,CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=parent,DC=net
      1> objectGUID: bedba399-1d79-4a74-b42c-79c89ab2b874;
>> Dn: CN=NTDS Settings,CN=SRV05,CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=parent,DC=net
      1> objectGUID: 73cda4e6-75a9-4f4a-9e3b-d6b982c6e711;
>> Dn: CN=NTDS Settings,CN=FLSRV02,CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=parent,DC=net
      1> objectGUID: d5c45fd0-9914-4936-b837-18f806334a15;
>> Dn: CN=NTDS Settings,CN=SRV06,CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=parent,DC=net
      1> objectGUID: 968acae7-6e9b-4db7-b542-c67bf6a54c1b;

The _msdcs zone in DNS has the same GUIDs as the child DCs show.
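For anyone who wants to repeat the comparison without reading GUIDs by eye, here is a minimal Python sketch that diffs the two views. It assumes the ldp output keeps the two-line "Dn:" / "objectGUID:" shape shown above; the helper names (ldp_entry, parse_ntds_guids, compare_views) are made up for illustration and are not part of any Microsoft tool.

```python
import re

SITE = "CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=parent,DC=net"

def ldp_entry(server, guid):
    """Rebuild one entry in the two-line shape that ldp printed in the post."""
    return f">> Dn: CN=NTDS Settings,CN={server},{SITE}\n      1> objectGUID: {guid};\n"

# Values copied from the post: what SRV01/SRV02 hold vs. what the child DCs hold.
PARENT_VIEW = "".join(ldp_entry(s, g) for s, g in [
    ("SRV02", "bedba399-1d79-4a74-b42c-79c89ab2b874"),
    ("SRV05", "d33014f7-53c7-4024-b2a3-5e7907c5d6e3"),
    ("SRV06", "c35dbfec-a148-419a-8ba7-2aefce7511e3"),
    ("SRV01", "85d55fef-d81d-42c6-930c-d235e9efddc5"),
])
CHILD_VIEW = "".join(ldp_entry(s, g) for s, g in [
    ("SRV01", "85d55fef-d81d-42c6-930c-d235e9efddc5"),
    ("SRV02", "bedba399-1d79-4a74-b42c-79c89ab2b874"),
    ("SRV05", "73cda4e6-75a9-4f4a-9e3b-d6b982c6e711"),
    ("FLSRV02", "d5c45fd0-9914-4936-b837-18f806334a15"),
    ("SRV06", "968acae7-6e9b-4db7-b542-c67bf6a54c1b"),
])

def parse_ntds_guids(ldp_output):
    """Return {server name: objectGUID} for every NTDS Settings entry found."""
    guids = {}
    server = None
    for line in ldp_output.splitlines():
        dn = re.search(r"Dn:\s*CN=NTDS Settings,CN=([^,]+),", line)
        if dn:
            server = dn.group(1)
            continue
        guid = re.search(r"objectGUID:\s*([0-9a-fA-F-]{36})", line)
        if guid and server:
            guids[server] = guid.group(1).lower()
            server = None
    return guids

def compare_views(parent, child):
    """Return {server: (guid on parent DCs, guid on child DCs)} where they differ."""
    report = {}
    for server in sorted(set(parent) | set(child)):
        p, c = parent.get(server), child.get(server)
        if p != c:
            report[server] = (p, c)
    return report

diff = compare_views(parse_ntds_guids(PARENT_VIEW), parse_ntds_guids(CHILD_VIEW))
for server, (p, c) in diff.items():
    print(f"{server}: parent DCs see {p}, child DCs see {c}")
```

With the data from this post it flags SRV05 and SRV06 as mismatched and FLSRV02 as missing from the parent DCs' copy of the Configuration partition.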

How can I fix this problem? Should I manually add records in DNS for the old GUIDs so that SRV01 and SRV02 can find their replication partners, or should I edit the data in the AD database on SRV01 and SRV02? Or maybe something else?

Any advice appreciated.
Dusan
Avatar of darkeryu
darkeryu

Hi:

You can delete the _msdcs zone and then restart the Netlogon service,
but you need to make sure the DNS zone allows nonsecure dynamic updates.

thanks

Avatar of bajicd

ASKER

Hi darkeryu,

Tried that already, but it is of no use anyway because records in _msdcs zone are accurate.
Avatar of John Gates, CISSP, CDPSE
Under Active Directory Sites and Services > Server > NTDS Settings, right-click > All Tasks > Check Replication Topology, then check the event logs for KCC events and let us know.  PS: Don't delete any records from anything, or you could cause more problems.  Also, do you see any event log messages about failures on the out-of-date DCs?

-D-
Avatar of bajicd

ASKER

Hello Dimante,

I did as you said, but there are no new errors in the event log. However, I looked back over the last few days and found warnings logged on SRV01 and SRV02 when they were restarted yesterday:

Event Type:      Warning
Event Source:      NTDS KCC
Event Category:      Knowledge Consistency Checker
Event ID:      1308
Date:            8/22/2006
Time:            7:04:55 PM
User:            NT AUTHORITY\ANONYMOUS LOGON
Computer:      SRV01
Description:
The Knowledge Consistency Checker (KCC) has detected that successive attempts to replicate with the following domain controller has consistently failed.
 Attempts:
129
Domain controller:
CN=NTDS Settings,CN=SRV05,CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=konsing,DC=net
Period of time (minutes):
7181
 The Connection object for this domain controller will be ignored, and a new temporary connection will be established to ensure that replication continues. Once replication with this domain controller resumes, the temporary connection will be removed.
 Additional Data
Error value:
8452 The naming context is in the process of being removed or is not replicated from the specified server.
For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.

There is also a similar warning for SRV06:

Attempts:
158
Domain controller:
CN=NTDS Settings,CN=SRV06,CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=konsing,DC=net
Period of time (minutes):
5927
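As a quick sanity check on those numbers, here is a small Python sketch that turns "Period of time (minutes)" into a start time. It assumes the period counts back from the moment the event was logged, which is my reading of the message and not something the event text states; failure_window is a made-up helper name.

```python
from datetime import datetime, timedelta

def failure_window(logged_at, period_minutes):
    """Return (estimated start of the failure streak, length of the streak)."""
    span = timedelta(minutes=period_minutes)
    return logged_at - span, span

# SRV05 warning: logged 8/22/2006 7:04:55 PM, period 7181 minutes.
logged = datetime(2006, 8, 22, 19, 4, 55)
start, span = failure_window(logged, 7181)
print("SRV05 failing since about", start, "for", span)

# SRV06 warning: only the period (5927 minutes) is quoted, so just show its length.
srv06_span = timedelta(minutes=5927)
print("SRV06 failing for", srv06_span)
```

For SRV05 this puts the start of the failures on the evening of 2006-08-17, just under five days before the warning, which is roughly when the child DCs were being reinstalled.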
 


Has anything changed on these servers like service packs etc?
Avatar of bajicd

ASKER

The old child DCs (SRV05 and SRV06) were plain W2k3; the reinstalled ones are W2k3 SP1.
What are the links between the sites?

Avatar of bajicd

ASKER

All three domains are at the same site.
All domain controllers are on the same switch, pings work fine, DNS resolving is working (SRV01 and SRV02 are DNS servers for all computers in forest), users are not experiencing any problems. This is not very dynamic system (couple of new users or password resets in a week, on average), but I believe at some point I might encounter problems if I leave it this way.
Please install the support tools and run dcdiag and netdiag on the problem DCs, then post the output here.  Somehow we are missing something.
Avatar of bajicd

ASKER

What do you think about the different GUIDs I mentioned in my first post? I am quite sure that is what is causing the replication problems.

I ran dcdiag /s:srv01 /c /f:c:\dcdiagreport.txt. Nothing new there: all tests passed except for lots of errors about replication from SRV05 and SRV06, for example:

[Replications Check,SRV01] A recent replication attempt failed:
            From SRV05 to SRV01
            Naming Context: CN=Schema,CN=Configuration,DC=parent,DC=net
            The replication generated an error (8452):
            The naming context is in the process of being removed or is not replicated from the specified server.
            The failure occurred at 2006-08-24 10:45:00.
            The last success occurred at 2006-08-17 18:59:38.
            160 failures have occurred since the last success.

Netdiag shows all tests passed.
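If the dcdiag report is long, the failed-replication entries can be pulled out with a short script. This is only a sketch in Python: the regular expression is written against the one sample entry quoted above and assumes every entry has the same layout (for instance, it will not match an entry whose last success is reported as something other than a timestamp). parse_replication_failures is a made-up name.

```python
import re

# Pattern built from the sample entry in this post; other dcdiag builds may differ.
FAILURE_RE = re.compile(
    r"From (?P<source>\S+) to (?P<dest>\S+)\s+"
    r"Naming Context: (?P<nc>\S+)\s+"
    r"The replication generated an error \((?P<code>\d+)\):\s+"
    r"(?P<message>.+?)\s+"
    r"The failure occurred at (?P<failed>[\d-]+ [\d:]+)\.\s+"
    r"The last success occurred at (?P<last_ok>[\d-]+ [\d:]+)\.\s+"
    r"(?P<count>\d+) failures have occurred",
    re.DOTALL,
)

def parse_replication_failures(report_text):
    """Return one dict per failed replication entry found in the report."""
    return [m.groupdict() for m in FAILURE_RE.finditer(report_text)]

SAMPLE = """[Replications Check,SRV01] A recent replication attempt failed:
            From SRV05 to SRV01
            Naming Context: CN=Schema,CN=Configuration,DC=parent,DC=net
            The replication generated an error (8452):
            The naming context is in the process of being removed or is not replicated from the specified server.
            The failure occurred at 2006-08-24 10:45:00.
            The last success occurred at 2006-08-17 18:59:38.
            160 failures have occurred since the last success.
"""

failures = parse_replication_failures(SAMPLE)
for f in failures:
    print(f"{f['source']} -> {f['dest']}  {f['nc']}  error {f['code']}  "
          f"({f['count']} failures, last success {f['last_ok']})")
```

To run it against the real file, read c:\dcdiagreport.txt into a string and pass that to parse_replication_failures instead of SAMPLE.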
Well, I have one suggestion then.  You should demote both servers (provided that you still have the secondaries in place).  Then use ntdsutil and perform a metadata cleanup to get rid of the old names.  Then promote the servers to DCs again.  That way you are starting from a clean slate.  Make sure all FSMO roles are transferred to the secondaries before demoting the servers.  If you forgot that step to begin with, you will have to seize the FSMO roles on the secondaries:

http://support.microsoft.com/kb/255504/


-D-
Avatar of bajicd

ASKER

Thanks Dimante.

Actually, my plan (even before this occurred) was to move all the users and computers from the child domains to the parent domain (create new users in the parent domain), delete the child domains and get rid of the trouble :). There is no real need for 3 domains in an environment as small as mine (about 150 users altogether, all at the same location). But before that, on Saturday I will back up all my DCs and play a little :). I will let you know what happens...
Ok, yeah, I agree. I have over 3,500 users in my environment in two buildings over WAN links and I only have 1 domain.  Let me know how you fare 8)

-D-
Have these servers had windows reinstalled?
Avatar of bajicd

ASKER

yes
Avatar of bajicd

ASKER

Problem solved:
First I manually deleted the connections in ADS&S that did not work. Then I was left with errors in the event log about lingering objects. This document helped (the event log message also offers a few tips):
http://technet2.microsoft.com/WindowsServer/en/library/4a1f420d-25d6-417c-9d8b-6e22f472ef3c1033.mspx?mfr=true
This text also has useful information:
http://www.phptr.com/content/images/0131467581/downloads/ReplicationFailsAfterReplicaDCPromotion.txt