EnclosAdmin

AD Replication denied by Root AD server.

I have a main AD root server in an HQ office.  I have had 6 Domain Controllers in WAN sites for over a year.  Friday the AD Server just decided to stop replicating with one of those AD servers on a WAN link.

Nothing on the WAN link has changed.  We use Notes; users at the WAN site can receive mail but cannot send it, and that server can no longer replicate or receive AD directory updates.  The KCC fails due to an RPC error.

Now the other 5 DCs work just fine.  All of the setup in Sites and Services is the same; it hasn't changed.  We are very hub-and-spoke, so all DCs connect directly to the ROOT server.  We will change that when we implement our MPLS network, but for now they all talk directly to the ROOT.

I even tried to have the STL DC talk to one of the other DC's but they won't replicate together.

I am sure somebody has seen something similar to this in the past, at least I hope so.
Phadke_hemant

Have you tried forcing replication through AD Sites and Services?
EnclosAdmin (ASKER)

Yes, I have, through multiple sites; however, the other sites will not replicate with STL either.  The ROOT knows that it should replicate, but RPC isn't responding to STL.  It responds to the other five DCs just fine, but for some reason it tells the STL server that RPC isn't available.
younghv
POINTER QUESTION
You might want to put a 20 point 'Pointer' question over in the "Windows Server 2003" TA.
https://www.experts-exchange.com/Operating_Systems/Windows_Server_2003/

The folks over there are well-versed in this.
Just open a new post (minimum 20 points) with a title like "500 easy points" and include the URL of this question.

When your question is answered, you can request a refund of the 20 points.
Thank you younghv.

I have taken your advice and posted a pointer question.
Glad to help.
Also be aware that sometimes it is kind of slow around here on weekends.
Lots more eyes reading problems Mon-Fri.


Vic
CharliePete00

I'd start by checking the event log on the problem DC for encryption and communications related errors (SSL, File Replication Service, frsevent, Microsoft cryptographic service, etc.)

Report back any errors.

Also log into the problem server and check the time.  If the clock is too far off (more than five minutes by default), Kerberos authentication with the other DCs will fail and replication will fail with it (if you are using RPC-based links anyway).

If the time is off get the name of the DC with the PDC Emulator role and from the command-line:

net time \\<PDC Emulator> /set
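
For anyone who wants to check the skew before resetting the clock, here is a minimal Python sketch of the comparison. It assumes the default Kerberos tolerance of five minutes (your domain policy may differ), and the two clock readings are hypothetical values you would collect yourself from the problem DC and the PDC Emulator. It only does the arithmetic; it does not query any server.

```python
from datetime import datetime, timedelta

# Assumption: five minutes is the default Kerberos clock-skew tolerance in an
# Active Directory domain. Adjust this if your domain policy differs.
DEFAULT_TOLERANCE = timedelta(minutes=5)


def clock_skew(local: datetime, reference: datetime) -> timedelta:
    """Return the absolute difference between two clock readings."""
    return abs(local - reference)


def skew_exceeds_tolerance(local: datetime,
                           reference: datetime,
                           tolerance: timedelta = DEFAULT_TOLERANCE) -> bool:
    """True when the two clocks differ by more than the tolerance."""
    return clock_skew(local, reference) > tolerance


if __name__ == "__main__":
    # Hypothetical readings, typed in by hand from each server.
    problem_dc = datetime(2006, 10, 30, 7, 33, 38)
    pdc_emulator = datetime(2006, 10, 30, 7, 41, 0)
    print("skew:", clock_skew(problem_dc, pdc_emulator))
    print("too far off:", skew_exceeds_tolerance(problem_dc, pdc_emulator))
```

If the result is False, time is probably not the cause and it is worth moving on to the communication checks.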
I had checked the time, that was fine.

In Sites and Services the link is actually set up for IP, but when replication fails on the DC in question it reports that RPC is not available.

The STL server (the one with the problem) does report KCC errors; however, the root server does not report any.  The root has a single entry that reports a problem communicating with the STL DC and says that it will continue to try.  That is as much as I get from the root server.
Any frsevent errors in the event log on the problem DC?
Also, are you using DFS (Distributed File System) with File Replication on any of your DCs?
I am not using DFS on any DC's.

The File Replication Service is having trouble enabling replication from STLDC1 to MSPROOT for c:\windows\sysvol\domain using the DNS name stldc1.enclos.glass.com. FRS will keep retrying.

That is the only File Replication Service Warning on the root DC.

On the server that is having the issue there are the same three errors and that's where it is reporting that the server that it is trying to replicate with [DC Root] isn't responding, RPC is not available.  Those occur every 20 minutes.

Keep in mind none of the other DC's are having the problem and they report no errors in their Event Logs.

Thank You
Let's try verifying that the SYSVOL and NETLOGON shares on the problem DC are good:

1.  Restart the File Replication Service and note any errors that pop up in the event log
2.  Execute the following from the command-line and note any failure:
         dcdiag /test:netlogons
3.  Execute the following from the command-line and note any failure:
         dcdiag /test:replications

Where we go from here will depend on what, if any, errors show from the above.
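
When dcdiag output gets long, a small script can pull out just the pass/fail lines. The sketch below is only an illustration: the `summarize_dcdiag` helper is a hypothetical name, and the line format it matches is assumed from Windows Server 2003 dcdiag output of the kind posted in this thread, so other versions may need a different pattern.

```python
import re

# Matches dcdiag result lines such as:
#   "......................... STLDC1 passed test Connectivity"
# Assumption: the dotted-leader format seen in Windows Server 2003 output.
RESULT_LINE = re.compile(r"\.{5,}\s+(\S+)\s+(passed|failed)\s+test\s+(\S+)")


def summarize_dcdiag(output: str) -> dict:
    """Group (target, test name) pairs by whether the test passed or failed."""
    summary = {"passed": [], "failed": []}
    for line in output.splitlines():
        match = RESULT_LINE.search(line)
        if match:
            target, status, test = match.groups()
            summary[status].append((target, test))
    return summary


if __name__ == "__main__":
    sample = """
          Starting test: NetLogons
             ......................... STLDC1 passed test NetLogons
          Starting test: KnowsOfRoleHolders
             ......................... STLDC1 failed test KnowsOfRoleHolders
    """
    for target, test in summarize_dcdiag(sample)["failed"]:
        print(f"{target}: {test} FAILED")
```

Saving the output with something like `dcdiag > dcdiag.txt` and feeding the file contents to the function gives a quick list of what to look at first.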
C:\WINDOWS>dcdiag /test:netlogons

Domain Controller Diagnosis

Performing initial setup:
   Done gathering initial info.

Doing initial required tests

   Testing server: STL\STLDC1
      Starting test: Connectivity
         ......................... STLDC1 passed test Connectivity

Doing primary tests

   Testing server: STL\STLDC1
      Starting test: NetLogons
         ......................... STLDC1 passed test NetLogons

   Running partition tests on : ForestDnsZones

   Running partition tests on : DomainDnsZones

   Running partition tests on : Schema

   Running partition tests on : Configuration

   Running partition tests on : enclos

   Running enterprise tests on : enclos.glass.com

C:\WINDOWS>dcdiag /test:replications

Domain Controller Diagnosis

Performing initial setup:
   Done gathering initial info.

Doing initial required tests

   Testing server: STL\STLDC1
      Starting test: Connectivity
         ......................... STLDC1 passed test Connectivity

Doing primary tests

   Testing server: STL\STLDC1
      Starting test: Replications
         REPLICATION-RECEIVED LATENCY WARNING
         STLDC1:  Current time is 2006-10-30 07:33:38.
            DC=ForestDnsZones,DC=enclos,DC=glass,DC=com
               Last replication recieved from YULDC1 at 2006-10-28 01:07:02.
               Last replication recieved from MSPROOT at 2006-10-28 00:56:45.
               Last replication recieved from SNADC1 at 2006-10-28 00:45:02.
               Last replication recieved from BWIDC1 at 2006-10-28 00:45:02.
               Last replication recieved from PLSDC1 at 2006-10-28 00:45:03.
               Last replication recieved from MNLDC1 at 2006-10-28 00:45:02.
               Last replication recieved from LAXDC1 at 2006-10-28 00:45:02.
            DC=DomainDnsZones,DC=enclos,DC=glass,DC=com
               Last replication recieved from YULDC1 at 2006-10-28 01:07:02.
               Last replication recieved from MSPROOT at 2006-10-28 00:56:45.
               Last replication recieved from SNADC1 at 2006-10-28 00:45:00.
               Last replication recieved from BWIDC1 at 2006-10-28 00:45:00.
               Last replication recieved from PLSDC1 at 2006-10-28 00:45:01.
               Last replication recieved from MNLDC1 at 2006-10-28 00:45:01.
               Last replication recieved from LAXDC1 at 2006-10-28 00:45:00.
            CN=Schema,CN=Configuration,DC=enclos,DC=glass,DC=com
               Last replication recieved from YULDC1 at 2006-10-28 01:06:40.
               Last replication recieved from MSPROOT at 2006-10-28 00:56:45.
               Last replication recieved from SNADC1 at 2006-10-28 00:44:58.
               Last replication recieved from BWIDC1 at 2006-10-28 00:44:58.
               Last replication recieved from PLSDC1 at 2006-10-28 00:44:58.
               Last replication recieved from MNLDC1 at 2006-10-28 00:44:57.
               Last replication recieved from LAXDC1 at 2006-10-28 00:44:58.
            CN=Configuration,DC=enclos,DC=glass,DC=com
               Last replication recieved from YULDC1 at 2006-10-28 01:06:40.
               Last replication recieved from MSPROOT at 2006-10-28 00:56:44.
               Last replication recieved from SNADC1 at 2006-10-28 00:44:53.
               Last replication recieved from BWIDC1 at 2006-10-28 00:44:55.
               Last replication recieved from PLSDC1 at 2006-10-28 00:44:57.
               Last replication recieved from MNLDC1 at 2006-10-28 00:44:56.
               Last replication recieved from LAXDC1 at 2006-10-28 00:44:54.
            DC=enclos,DC=glass,DC=com
               Last replication recieved from YULDC1 at 2006-10-28 01:06:39.
               Last replication recieved from MSPROOT at 2006-10-28 00:56:44.
               Last replication recieved from SNADC1 at 2006-10-28 00:44:59.
               Last replication recieved from BWIDC1 at 2006-10-28 00:44:59.
               Last replication recieved from PLSDC1 at 2006-10-28 00:45:00.
               Last replication recieved from MNLDC1 at 2006-10-28 00:44:51.
               Last replication recieved from LAXDC1 at 2006-10-28 00:44:59.
         REPLICATION-RECEIVED LATENCY WARNING
          Source site: CN=NTDS Site Settings,CN=BWI,CN=Sites,CN=Configuration,DC=enclos,DC=glass,DC=com
          Current time: 2006-10-30 07:33:39
          Last update time: 2006-10-28 00:18:26
          Check if source site has an elected ISTG running.
          Check replication from source site to this server.
         REPLICATION-RECEIVED LATENCY WARNING
          Source site: CN=NTDS Site Settings,CN=Enclos-HQ,CN=Sites,CN=Configuration,DC=enclos,DC=glass,DC=com
          Current time: 2006-10-30 07:33:39
          Last update time: 2006-10-28 00:34:49
          Check if source site has an elected ISTG running.
          Check replication from source site to this server.
         REPLICATION-RECEIVED LATENCY WARNING
          Source site: CN=NTDS Site Settings,CN=LAX,CN=Sites,CN=Configuration,DC=enclos,DC=glass,DC=com
          Current time: 2006-10-30 07:33:39
          Last update time: 2006-10-28 00:14:49
          Check if source site has an elected ISTG running.
          Check replication from source site to this server.
         REPLICATION-RECEIVED LATENCY WARNING
          Source site: CN=NTDS Site Settings,CN=MNL,CN=Sites,CN=Configuration,DC=enclos,DC=glass,DC=com
          Current time: 2006-10-30 07:33:39
          Last update time: 2006-10-28 00:20:08
          Check if source site has an elected ISTG running.
          Check replication from source site to this server.
         REPLICATION-RECEIVED LATENCY WARNING
          Source site: CN=NTDS Site Settings,CN=SNA,CN=Sites,CN=Configuration,DC=enclos,DC=glass,DC=com
          Current time: 2006-10-30 07:33:39
          Last update time: 2006-10-28 00:01:09
          Check if source site has an elected ISTG running.
          Check replication from source site to this server.
         REPLICATION-RECEIVED LATENCY WARNING
          Source site: CN=NTDS Site Settings,CN=YUL,CN=Sites,CN=Configuration,DC=enclos,DC=glass,DC=com
          Current time: 2006-10-30 07:33:39
          Last update time: 2006-10-28 00:47:24
          Check if source site has an elected ISTG running.
          Check replication from source site to this server.
         REPLICATION-RECEIVED LATENCY WARNING
          Source site: CN=NTDS Site Settings,CN=PLS,CN=Sites,CN=Configuration,DC=enclos,DC=glass,DC=com
          Current time: 2006-10-30 07:33:39
          Last update time: 2006-10-28 00:03:55
          Check if source site has an elected ISTG running.
          Check replication from source site to this server.
         ......................... STLDC1 passed test Replications

   Running partition tests on : ForestDnsZones

   Running partition tests on : DomainDnsZones

   Running partition tests on : Schema

   Running partition tests on : Configuration

   Running partition tests on : enclos

   Running enterprise tests on : enclos.glass.com

C:\WINDOWS>
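
The latency warnings above are easier to read as "how long ago did each partner last replicate in". Here is a rough Python sketch that computes that from the pasted text. The `replication_ages` helper is a hypothetical name, and the patterns assume the exact wording shown in this output (including dcdiag's own spelling "recieved").

```python
import re
from datetime import datetime

STAMP = "%Y-%m-%d %H:%M:%S"
STAMP_RE = r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"

CURRENT = re.compile(r"Current time is " + STAMP_RE)
# dcdiag spells the word "recieved"; accept the correct spelling as well.
LAST = re.compile(r"Last replication rec(?:ie|ei)ved from (\S+) at " + STAMP_RE)


def replication_ages(output: str) -> dict:
    """Map each source DC to the longest time since it last replicated in."""
    now_match = CURRENT.search(output)
    if now_match is None:
        raise ValueError("no 'Current time is' line found")
    now = datetime.strptime(now_match.group(1), STAMP)

    ages = {}
    for source, stamp in LAST.findall(output):
        age = now - datetime.strptime(stamp, STAMP)
        # A source appears once per partition; keep the stalest reading.
        ages[source] = max(age, ages.get(source, age))
    return ages


if __name__ == "__main__":
    sample = """
         STLDC1:  Current time is 2006-10-30 07:33:38.
               Last replication recieved from YULDC1 at 2006-10-28 01:07:02.
               Last replication recieved from MSPROOT at 2006-10-28 00:56:45.
    """
    for source, age in sorted(replication_ages(sample).items()):
        print(f"{source}: last inbound replication {age} ago")
```

On the output above, every partner comes out at a little over two days, which matches replication stopping for all sources at about the same time rather than one partner drifting.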
C:\WINDOWS>nltest.exe /dsregdns
Flags: 0
Connection Status = 1311 0x51f ERROR_NO_LOGON_SERVERS
The command completed successfully
When restarting the Netlogon service on the troubled DC it returns lots of these errors:

The dynamic registration of the DNS record 'ForestDnsZones.enclos.glass.com. 600 IN A 172.16.101.10' failed on the following DNS server:  

DNS server IP address: 172.16.77.121
Returned Response Code (RCODE): 5
Returned Status Code: 10060  

For computers and users to locate this domain controller, this record must be registered in DNS.  

USER ACTION  
Determine what might have caused this failure, resolve the problem, and initiate registration of the DNS records by the domain controller. To determine what might have caused this failure, run DCDiag.exe. You can find this program on the Windows Server 2003 installation CD in Support\Tools\support.cab. To learn more about  DCDiag.exe, see Help and Support Center. To initiate registration of the DNS records by  this domain controller, run 'nltest.exe /dsregdns' from the command prompt on the domain  controller or restart Net Logon service. Nltest.exe is available in the Microsoft Windows  Server Resource Kit CD.
  Or, you can manually add this record to DNS, but it is not recommended.  

ADDITIONAL DATA
Error Value: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
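
Status code 10060 is a connection timeout, so a basic reachability test from the troubled DC toward the DNS server and the root DC is a reasonable next step. The Python sketch below only attempts a TCP connect to a few well-known ports; the `tcp_port_open` and `check_host` helpers are hypothetical names, and the address in the usage section is the DNS server IP from the event above. Keep in mind that a successful connect does not prove RPC works end to end: RPC replication also negotiates dynamic ports after the endpoint mapper, and a device in the path could still damage traffic after the connection opens.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Well-known TCP ports a domain controller normally listens on.
DC_PORTS = {
    "DNS": 53,
    "Kerberos": 88,
    "RPC endpoint mapper": 135,
    "LDAP": 389,
    "SMB": 445,
}


def check_host(host: str, ports: dict = DC_PORTS, timeout: float = 3.0) -> dict:
    """Try each named port on host and report which ones accept a connection."""
    return {name: tcp_port_open(host, port, timeout) for name, port in ports.items()}


if __name__ == "__main__":
    # DNS server address taken from the Netlogon event in this thread.
    for name, reachable in check_host("172.16.77.121").items():
        print(f"{name:22} {'open' if reachable else 'NO RESPONSE'}")
```

Running the same check from one of the healthy DCs gives a baseline to compare against, which helps separate a server problem from a path problem.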
Here is a DCDIAG from the troubled DC:
MSPROOT, the root domain server just won't respond to the STL server;

C:\WINDOWS>dcdiag

Domain Controller Diagnosis

Performing initial setup:
   Done gathering initial info.

Doing initial required tests

   Testing server: STL\STLDC1
      Starting test: Connectivity
         ......................... STLDC1 passed test Connectivity

Doing primary tests

   Testing server: STL\STLDC1
      Starting test: Replications
         REPLICATION-RECEIVED LATENCY WARNING
         STLDC1:  Current time is 2006-10-30 09:28:11.
            DC=ForestDnsZones,DC=enclos,DC=glass,DC=com
               Last replication recieved from YULDC1 at 2006-10-28 01:07:02.
               Last replication recieved from MSPROOT at 2006-10-28 00:56:45.
               Last replication recieved from SNADC1 at 2006-10-28 00:45:02.
               Last replication recieved from BWIDC1 at 2006-10-28 00:45:02.
               Last replication recieved from PLSDC1 at 2006-10-28 00:45:03.
               Last replication recieved from MNLDC1 at 2006-10-28 00:45:02.
               Last replication recieved from LAXDC1 at 2006-10-28 00:45:02.
            DC=DomainDnsZones,DC=enclos,DC=glass,DC=com
               Last replication recieved from YULDC1 at 2006-10-28 01:07:02.
               Last replication recieved from MSPROOT at 2006-10-28 00:56:45.
               Last replication recieved from SNADC1 at 2006-10-28 00:45:00.
               Last replication recieved from BWIDC1 at 2006-10-28 00:45:00.
               Last replication recieved from PLSDC1 at 2006-10-28 00:45:01.
               Last replication recieved from MNLDC1 at 2006-10-28 00:45:01.
               Last replication recieved from LAXDC1 at 2006-10-28 00:45:00.
            CN=Schema,CN=Configuration,DC=enclos,DC=glass,DC=com
               Last replication recieved from YULDC1 at 2006-10-28 01:06:40.
               Last replication recieved from MSPROOT at 2006-10-28 00:56:45.
               Last replication recieved from SNADC1 at 2006-10-28 00:44:58.
               Last replication recieved from BWIDC1 at 2006-10-28 00:44:58.
               Last replication recieved from PLSDC1 at 2006-10-28 00:44:58.
               Last replication recieved from MNLDC1 at 2006-10-28 00:44:57.
               Last replication recieved from LAXDC1 at 2006-10-28 00:44:58.
            CN=Configuration,DC=enclos,DC=glass,DC=com
               Last replication recieved from YULDC1 at 2006-10-28 01:06:40.
               Last replication recieved from MSPROOT at 2006-10-28 00:56:44.
               Last replication recieved from SNADC1 at 2006-10-28 00:44:53.
               Last replication recieved from BWIDC1 at 2006-10-28 00:44:55.
               Last replication recieved from PLSDC1 at 2006-10-28 00:44:57.
               Last replication recieved from MNLDC1 at 2006-10-28 00:44:56.
               Last replication recieved from LAXDC1 at 2006-10-28 00:44:54.
            DC=enclos,DC=glass,DC=com
               Last replication recieved from YULDC1 at 2006-10-28 01:06:39.
               Last replication recieved from MSPROOT at 2006-10-28 00:56:44.
               Last replication recieved from SNADC1 at 2006-10-28 00:44:59.
               Last replication recieved from BWIDC1 at 2006-10-28 00:44:59.
               Last replication recieved from PLSDC1 at 2006-10-28 00:45:00.
               Last replication recieved from MNLDC1 at 2006-10-28 00:44:51.
               Last replication recieved from LAXDC1 at 2006-10-28 00:44:59.
         REPLICATION-RECEIVED LATENCY WARNING
          Source site: CN=NTDS Site Settings,CN=BWI,CN=Sites,CN=Configuration,DC=enclos,DC=glass,DC=com
          Current time: 2006-10-30 09:28:11
          Last update time: 2006-10-28 00:18:26
          Check if source site has an elected ISTG running.
          Check replication from source site to this server.
         REPLICATION-RECEIVED LATENCY WARNING
          Source site: CN=NTDS Site Settings,CN=Enclos-HQ,CN=Sites,CN=Configuration,DC=enclos,DC=glass,DC=com
          Current time: 2006-10-30 09:28:11
          Last update time: 2006-10-28 00:34:49
          Check if source site has an elected ISTG running.
          Check replication from source site to this server.
         REPLICATION-RECEIVED LATENCY WARNING
          Source site: CN=NTDS Site Settings,CN=LAX,CN=Sites,CN=Configuration,DC=enclos,DC=glass,DC=com
          Current time: 2006-10-30 09:28:11
          Last update time: 2006-10-28 00:14:49
          Check if source site has an elected ISTG running.
          Check replication from source site to this server.
         REPLICATION-RECEIVED LATENCY WARNING
          Source site: CN=NTDS Site Settings,CN=MNL,CN=Sites,CN=Configuration,DC=enclos,DC=glass,DC=com
          Current time: 2006-10-30 09:28:11
          Last update time: 2006-10-28 00:20:08
          Check if source site has an elected ISTG running.
          Check replication from source site to this server.
         REPLICATION-RECEIVED LATENCY WARNING
          Source site: CN=NTDS Site Settings,CN=SNA,CN=Sites,CN=Configuration,DC=enclos,DC=glass,DC=com
          Current time: 2006-10-30 09:28:11
          Last update time: 2006-10-28 00:01:09
          Check if source site has an elected ISTG running.
          Check replication from source site to this server.
         REPLICATION-RECEIVED LATENCY WARNING
          Source site: CN=NTDS Site Settings,CN=YUL,CN=Sites,CN=Configuration,DC=enclos,DC=glass,DC=com
          Current time: 2006-10-30 09:28:11
          Last update time: 2006-10-28 00:47:24
          Check if source site has an elected ISTG running.
          Check replication from source site to this server.
         REPLICATION-RECEIVED LATENCY WARNING
          Source site: CN=NTDS Site Settings,CN=PLS,CN=Sites,CN=Configuration,DC=enclos,DC=glass,DC=com
          Current time: 2006-10-30 09:28:11
          Last update time: 2006-10-28 00:03:55
          Check if source site has an elected ISTG running.
          Check replication from source site to this server.
         ......................... STLDC1 passed test Replications
      Starting test: NCSecDesc
         ......................... STLDC1 passed test NCSecDesc
      Starting test: NetLogons
         ......................... STLDC1 passed test NetLogons
      Starting test: Advertising
         ......................... STLDC1 passed test Advertising
      Starting test: KnowsOfRoleHolders
         [MSPROOT] DsBindWithSpnEx() failed with error 1722,
         The RPC server is unavailable..
         Warning: MSPROOT is the Schema Owner, but is not responding to DS RPC Bind.
         [MSPROOT] LDAP search failed with error 58,
         The specified server cannot perform the requested operation..
         Warning: MSPROOT is the Schema Owner, but is not responding to LDAP Bind.
         Warning: MSPROOT is the Domain Owner, but is not responding to DS RPC Bind.
         Warning: MSPROOT is the Domain Owner, but is not responding to LDAP Bind.
         Warning: MSPROOT is the PDC Owner, but is not responding to DS RPC Bind.
         Warning: MSPROOT is the PDC Owner, but is not responding to LDAP Bind.
         Warning: MSPROOT is the Rid Owner, but is not responding to DS RPC Bind.
         Warning: MSPROOT is the Rid Owner, but is not responding to LDAP Bind.
         Warning: MSPROOT is the Infrastructure Update Owner, but is not responding to DS RPC Bind.
         Warning: MSPROOT is the Infrastructure Update Owner, but is not responding to LDAP Bind.
         ......................... STLDC1 failed test KnowsOfRoleHolders
      Starting test: RidManager
         ......................... STLDC1 failed test RidManager
      Starting test: MachineAccount
         ......................... STLDC1 passed test MachineAccount
      Starting test: Services
         ......................... STLDC1 passed test Services
      Starting test: ObjectsReplicated
         ......................... STLDC1 passed test ObjectsReplicated
      Starting test: frssysvol
         ......................... STLDC1 passed test frssysvol
      Starting test: frsevent
         There are warning or error events within the last 24 hours after the SYSVOL has been shared.  Failing SYSVOL
         replication problems may cause Group Policy problems.
         ......................... STLDC1 failed test frsevent
      Starting test: kccevent
         ......................... STLDC1 passed test kccevent
      Starting test: systemlog
         ......................... STLDC1 passed test systemlog
      Starting test: VerifyReferences
         ......................... STLDC1 passed test VerifyReferences

   Running partition tests on : ForestDnsZones
      Starting test: CrossRefValidation
         ......................... ForestDnsZones passed test CrossRefValidation
      Starting test: CheckSDRefDom
         ......................... ForestDnsZones passed test CheckSDRefDom

   Running partition tests on : DomainDnsZones
      Starting test: CrossRefValidation
         ......................... DomainDnsZones passed test CrossRefValidation
      Starting test: CheckSDRefDom
         ......................... DomainDnsZones passed test CheckSDRefDom

   Running partition tests on : Schema
      Starting test: CrossRefValidation
         ......................... Schema passed test CrossRefValidation
      Starting test: CheckSDRefDom
         ......................... Schema passed test CheckSDRefDom

   Running partition tests on : Configuration
      Starting test: CrossRefValidation
         ......................... Configuration passed test CrossRefValidation
      Starting test: CheckSDRefDom
         ......................... Configuration passed test CheckSDRefDom

   Running partition tests on : enclos
      Starting test: CrossRefValidation
         ......................... enclos passed test CrossRefValidation
      Starting test: CheckSDRefDom
         ......................... enclos passed test CheckSDRefDom

   Running enterprise tests on : enclos.glass.com
      Starting test: Intersite
         ......................... enclos.glass.com passed test Intersite
      Starting test: FsmoCheck
         ......................... enclos.glass.com passed test FsmoCheck

C:\WINDOWS>
Good...SYSVOL/NETLOGON are in good shape...This leaves us with communication-based errors.

Have you recently applied SP1 or Security update MS05-019 to the troubled DC?  Or made any changes to the NIC or NIC drivers?

If so see:

http://support.microsoft.com/?id=898060
There have been no changes to the NIC drivers.
SP1 was applied a long time ago.  All DC's have SP1.

This DC gets the Critical Updates applied to it, as do the other DCs in the domain.

How can I tell if MS05-019 has been applied to the server?
MS05-019 was part of SP1

Any changes to your routers, vpns, etc can also affect this...take a look at http://support.microsoft.com/?id=898060 and apply update http://www.microsoft.com/technet/security/bulletin/MS06-007.mspx
Troubled server has been demoted and kicked out of the domain into a workgroup.

DNS has been removed from troubled server.

Server can no longer be added as a DC because it fails with an RPC error.  All Hotfixes above have been applied to the server.  Still will not talk to DC's, fails with RPC errors.

I am starting to think it's just dead and will need to be blown away completely and rebuilt.

I am really hating Microsoft right about now, and I generally defend them, but this has cost hours of lost time and days of very limited use for a product that had been working fine for over 8 months and at 11:55 PM Thursday night decided to corrupt itself in some way.  As stated before, this site is very much like all the others: same routers, same frame relay connectivity, same OS, same hardware, but it can't speak with the root DC while the other 5 are just fine.
This problem has been solved.

It was a corrupted Riverbed Steelhead.  It caused just enough of a transmission problem that RPC couldn't resolve to the troubled server.

I bypassed the Steelhead and was able to add the server back into the domain and DCPromo it back to a DC.

So if you use Riverbed Steelheads be watchful.  If the RED light comes on, that means the Steelhead has a problem.  Do NOT trust the passthru switch!  Either get it fixed or bypass it until it can be fixed.

I have these on 3 of my DC connections; when they are working they work great!  Now I know what to look for if my DC starts falling apart.

Thanks to all of those that tried helping!
Thanks for the update and the info.

Good Luck!