paddyboz
asked on
DNS issue - ping successful but replication failing
Hi Experts. I would be profoundly grateful for any help on this. This is both an Exchange and a Windows AD question, but mostly AD/DNS, I suspect.
Our company encompasses four sites -
The head office where all applications of any importance are, and three others connected via VPN.
We have a single Windows domain covering all four sites. Three of the sites are small satellite offices with a single DC each, and each of those DCs is also a GC, DNS and DHCP server.
There were three DCs at the main site.
The three DCs at the main site were/are a SQL server, an Exchange Server and a file and print server.
All has been working very well (particularly Exchange 2003) until.....
We had problems with our Exchange Server: the System Attendant would not start, and as the problem seemed to be related in some way to communication between AD and Exchange, we rashly took the decision to demote it from a DC to a member server. I know (now!) that MS does not support this; however, at the time it worked and the services all started OK.
Not long afterwards, however, things started to behave strangely. Outlook 2000 clients at remote sites hang when connecting to Exchange. Some Outlook 2003 clients do too, but not all, it seems. Removing the Outlook profile completely and recreating it seems to address the problem temporarily, but it recurs. Outlook users at the head office have no issues connecting to the Exchange Server.
We are having issues with replication between the DCs at the satellite sites and the main site: KCC errors 1311 and 1865, and NTDS replication errors 1232 and 1188. These would suggest that names cannot be resolved or there is no IP connectivity.
Internal DNS name resolution is not working for clients at the satellites trying to connect to intranet sites at the head office. Pinging the same host name works fine.
I have changed the DNS client of my DCs at the remote sites to point to a DNS server at the head office but this has not had any effect.
I have made any number of changes to sites and services to try and persuade the DCs to see each other. My understanding is that most of this is done automatically but I have nonetheless manually set bridgehead servers between the spokes and the hub.
DNS would seem to be the culprit but there is something darned strange going on that is bewildering me. DNS appears to be working fine! Everything resolves OK when pinging and all the relevant records seem to be present in the DNS servers at each site.
This is not an IP issue as far as I can see. We have never had issues with our IP connectivity, and at the IP level everything seems to work as before. I have checked with portqry to see if the ports are all available and they certainly seem to be. It just seems to be name resolution that is failing (except when you ping!).
Any help much appreciated.
Did you try running netdiag and dcdiag on all DCs?
So nslookup does not work correctly, but ping does? For example:
nslookup dc1.domain.com fails
ping dc1.domain.com succeeds
Is that what you are experiencing? If so, the only thing I can think of is that you have a correct entry for dc1.domain.com in the hosts file on that box: nslookup does not use the hosts file, but ping does.
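A quick way to rule the hosts file in or out is to look for the name in it directly. Below is a minimal Python sketch of that check. The sample contents and the address 10.0.0.5 are made up for illustration; on a Windows server the real file normally lives at %SystemRoot%\System32\drivers\etc\hosts.

```python
def parse_hosts(text):
    """Map each lower-cased host name in hosts-file text to its IP address."""
    entries = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding blanks
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            entries.setdefault(name.lower(), ip)  # keep the first entry seen for a name
    return entries


# Hypothetical hosts file contents, for illustration only.
sample = """
# sample hosts file
127.0.0.1    localhost
10.0.0.5     dc1.domain.com dc1   # manual entry someone added
"""

entries = parse_hosts(sample)
print(entries.get("dc1.domain.com"))  # an address here means ping may be bypassing DNS
```

To check a real server, read the file with open() and pass its contents to parse_hosts(). If the DC name shows up there, ping succeeding tells you nothing about DNS.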
ASKER
Hi to both. DCDIAG on the bridgehead server at the main site produces:
Domain Controller Diagnosis
Performing initial setup:
Done gathering initial info.
Doing initial required tests
Testing server: Marbella\CENTRAL-TECH01
Starting test: Connectivity
......................... CENTRAL-TECH01 passed test Connectivity
Doing primary tests
Testing server: Marbella\CENTRAL-TECH01
Starting test: Replications
[Replications Check,CENTRAL-TECH01] A recent replication attempt failed:
From OAS01 to CENTRAL-TECH01
Naming Context: CN=Schema,CN=Configuration,DC=rmicentral,DC=com
The replication generated an error (1727):
Win32 Error 1727
The failure occurred at 2005-06-07 17:06:11.
The last success occurred at 2005-06-03 20:38:26.
129 failures have occurred since the last success.
[Replications Check,CENTRAL-TECH01] A recent replication attempt failed:
From VERA01 to CENTRAL-TECH01
Naming Context: CN=Schema,CN=Configuration,DC=rmicentral,DC=com
The replication generated an error (1727):
Win32 Error 1727
The failure occurred at 2005-06-07 17:18:49.
The last success occurred at 2005-05-21 10:38:15.
663 failures have occurred since the last success.
[Replications Check,CENTRAL-TECH01] A recent replication attempt failed:
From HEVER-MAIN to CENTRAL-TECH01
Naming Context: CN=Schema,CN=Configuration,DC=rmicentral,DC=com
The replication generated an error (1727):
Win32 Error 1727
The failure occurred at 2005-06-07 17:25:08.
The last success occurred at 2005-05-21 10:38:15.
672 failures have occurred since the last success.
[Replications Check,CENTRAL-TECH01] A recent replication attempt failed:
From OAS01 to CENTRAL-TECH01
Naming Context: CN=Configuration,DC=rmicentral,DC=com
The replication generated an error (1727):
Win32 Error 1727
The failure occurred at 2005-06-07 16:53:30.
The last success occurred at 2005-06-03 20:41:26.
163 failures have occurred since the last success.
[Replications Check,CENTRAL-TECH01] A recent replication attempt failed:
From VERA01 to CENTRAL-TECH01
Naming Context: CN=Configuration,DC=rmicentral,DC=com
The replication generated an error (1727):
Win32 Error 1727
The failure occurred at 2005-06-07 16:59:49.
The last success occurred at 2005-05-21 10:38:13.
880 failures have occurred since the last success.
[Replications Check,CENTRAL-TECH01] A recent replication attempt failed:
From HEVER-MAIN to CENTRAL-TECH01
Naming Context: CN=Configuration,DC=rmicentral,DC=com
The replication generated an error (1727):
Win32 Error 1727
The failure occurred at 2005-06-07 17:12:29.
The last success occurred at 2005-05-21 10:38:15.
878 failures have occurred since the last success.
[Replications Check,CENTRAL-TECH01] A recent replication attempt failed:
From OAS01 to CENTRAL-TECH01
Naming Context: DC=rmicentral,DC=com
The replication generated an error (1722):
Win32 Error 1722
The failure occurred at 2005-06-03 18:35:25.
The last success occurred at 2005-06-03 10:10:51.
1 failures have occurred since the last success.
[OAS01] DsBindWithSpnEx() failed with error 1727,
Win32 Error 1727.
The source remains down. Please check the machine.
[Replications Check,CENTRAL-TECH01] A recent replication attempt failed:
From VERA01 to CENTRAL-TECH01
Naming Context: DC=rmicentral,DC=com
The replication generated an error (1818):
Win32 Error 1818
The failure occurred at 2005-06-03 20:18:05.
The last success occurred at 2005-05-21 10:38:16.
25 failures have occurred since the last success.
[Replications Check,CENTRAL-TECH01] A recent replication attempt failed:
From HEVER-MAIN to CENTRAL-TECH01
Naming Context: DC=rmicentral,DC=com
The replication generated an error (1727):
Win32 Error 1727
The failure occurred at 2005-06-03 20:55:58.
The last success occurred at 2005-05-21 10:38:16.
25 failures have occurred since the last success.
REPLICATION LATENCY WARNING
CENTRAL-TECH01: A long-running replication operation is in progress
The job has been executing for 3 minutes and 0 seconds.
Replication of new changes along this path will be delayed.
Error: Higher priority replications are being blocked
Enqueued 2005-06-07 16:58:12 at priority 170
Op: SYNC FROM SOURCE
NC CN=Configuration,DC=rmicentral,DC=com
DSADN CN=NTDS Settings,CN=OAS01,CN=Servers,CN=oasis,CN=Sites,CN=Configuration,DC=rmicentral,DC=com
DSA transport addr ada8d20b-2fe0-487f-a4b0-959e272aa43a._msdcs.rmicentral.com
REPLICATION-RECEIVED LATENCY WARNING
CENTRAL-TECH01: Current time is 2005-06-07 17:28:08.
CN=Schema,CN=Configuration,DC=rmicentral,DC=com
Last replication recieved from OAS01 at 2005-06-03 20:38:23.
Last replication recieved from HEVER-MAIN at 2005-05-21 10:38:15.
CN=Configuration,DC=rmicentral,DC=com
Last replication recieved from OAS01 at 2005-06-03 20:41:26.
Last replication recieved from HEVER-MAIN at 2005-05-21 10:38:14.
DC=rmicentral,DC=com
Last replication recieved from OAS01 at 2005-06-03 10:10:51.
Last replication recieved from HEVER-MAIN at 2005-05-21 10:38:16.
REPLICATION-RECEIVED LATENCY WARNING
Source site:
CN=NTDS Site Settings,CN=oasis,CN=Sites,CN=Configuration,DC=rmicentral,DC=com
Current time: 2005-06-07 17:34:27
Last update time: 2005-06-03 20:36:19
Check if source site has an elected ISTG running.
Check replication from source site to this server.
REPLICATION-RECEIVED LATENCY WARNING
Source site:
CN=NTDS Site Settings,CN=Vera,CN=Sites,CN=Configuration,DC=rmicentral,DC=com
Current time: 2005-06-07 17:34:27
Last update time: 2005-05-30 16:26:25
Check if source site has an elected ISTG running.
Check replication from source site to this server.
REPLICATION-RECEIVED LATENCY WARNING
Source site:
CN=NTDS Site Settings,CN=Hever,CN=Sites,CN=Configuration,DC=rmicentral,DC=com
Current time: 2005-06-07 17:34:27
Last update time: 2005-05-21 10:27:49
Check if source site has an elected ISTG running.
Check replication from source site to this server.
......................... CENTRAL-TECH01 passed test Replications
Starting test: NCSecDesc
......................... CENTRAL-TECH01 passed test NCSecDesc
Starting test: NetLogons
......................... CENTRAL-TECH01 passed test NetLogons
Starting test: Advertising
......................... CENTRAL-TECH01 passed test Advertising
Starting test: KnowsOfRoleHolders
......................... CENTRAL-TECH01 passed test KnowsOfRoleHolders
Starting test: RidManager
......................... CENTRAL-TECH01 passed test RidManager
Starting test: MachineAccount
......................... CENTRAL-TECH01 passed test MachineAccount
Starting test: Services
......................... CENTRAL-TECH01 passed test Services
Starting test: ObjectsReplicated
......................... CENTRAL-TECH01 passed test ObjectsReplicated
Starting test: frssysvol
......................... CENTRAL-TECH01 passed test frssysvol
Starting test: frsevent
......................... CENTRAL-TECH01 passed test frsevent
Starting test: kccevent
An Warning Event occured. EventID: 0x8000061E
Time Generated: 06/07/2005 17:22:25
Event String: All domain controllers in the following site that
An Warning Event occured. EventID: 0x8000061E
Time Generated: 06/07/2005 17:22:25
Event String: All domain controllers in the following site that
An Warning Event occured. EventID: 0x8000061E
Time Generated: 06/07/2005 17:22:25
Event String: All domain controllers in the following site that
An Error Event occured. EventID: 0xC000051F
Time Generated: 06/07/2005 17:22:25
Event String: The Knowledge Consistency Checker (KCC) has
An Warning Event occured. EventID: 0x80000749
Time Generated: 06/07/2005 17:22:25
Event String: The Knowledge Consistency Checker (KCC) was
An Warning Event occured. EventID: 0x8000061E
Time Generated: 06/07/2005 17:22:25
Event String: All domain controllers in the following site that
An Warning Event occured. EventID: 0x8000061E
Time Generated: 06/07/2005 17:22:25
Event String: All domain controllers in the following site that
An Warning Event occured. EventID: 0x8000061E
Time Generated: 06/07/2005 17:22:25
Event String: All domain controllers in the following site that
An Error Event occured. EventID: 0xC000051F
Time Generated: 06/07/2005 17:22:25
Event String: The Knowledge Consistency Checker (KCC) has
An Warning Event occured. EventID: 0x80000749
Time Generated: 06/07/2005 17:22:25
Event String: The Knowledge Consistency Checker (KCC) was
......................... CENTRAL-TECH01 failed test kccevent
Starting test: systemlog
......................... CENTRAL-TECH01 passed test systemlog
Starting test: VerifyReferences
......................... CENTRAL-TECH01 passed test VerifyReferences
Running partition tests on : Schema
Starting test: CrossRefValidation
......................... Schema passed test CrossRefValidation
Starting test: CheckSDRefDom
......................... Schema passed test CheckSDRefDom
Running partition tests on : Configuration
Starting test: CrossRefValidation
......................... Configuration passed test CrossRefValidation
Starting test: CheckSDRefDom
......................... Configuration passed test CheckSDRefDom
Running partition tests on : rmicentral
Starting test: CrossRefValidation
......................... rmicentral passed test CrossRefValidation
Starting test: CheckSDRefDom
......................... rmicentral passed test CheckSDRefDom
Running enterprise tests on : rmicentral.com
Starting test: Intersite
......................... rmicentral.com passed test Intersite
Starting test: FsmoCheck
......................... rmicentral.com passed test FsmoCheck
This is dcdiag from one of the remote sites:
Domain Controller Diagnosis
Performing initial setup:
Done gathering initial info.
Doing initial required tests
Testing server: oasis\OAS01
Starting test: Connectivity
......................... OAS01 passed test Connectivity
Doing primary tests
Testing server: oasis\OAS01
Starting test: Replications
[Replications Check,OAS01] A recent replication attempt failed:
From CENTRAL-TECH01 to OAS01
Naming Context: DC=rmicentral,DC=com
The replication generated an error (1726):
The remote procedure call failed.
The failure occurred at 2005-06-06 15:12.31.
The last success occurred at 2005-06-03 20:35.50.
4 failures have occurred since the last success.
The replication RPC call executed for too long at the server and
was cancelled.
Check load and resouce usage on CENTRAL-TECH01.
[Replications Check,OAS01] A recent replication attempt failed:
From CENTRAL-MAIN to OAS01
Naming Context: DC=rmicentral,DC=com
The replication generated an error (1722):
The RPC server is unavailable.
The failure occurred at 2005-06-07 13:09.58.
The last success occurred at 2005-06-01 09:09.54.
234 failures have occurred since the last success.
netdiag to follow:
DCDiag is a good idea as Joe suggests. Rather than running it on all DCs individually you might want to try:
dcdiag /e /c /v /f:output.txt
This runs comprehensive tests against all DCs in the enterprise and writes everything to the file output.txt (expect a pretty huge amount of output). That, along with NetDiag (again as Joe suggests), should be able to point you towards some kind of failure.
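Since that file gets large, it can help to boil it down before reading it line by line. Here is a rough Python sketch that counts replication error codes, lists failed tests, and converts the hex EventIDs to the decimal IDs shown in Event Viewer. The regular expressions are based only on the dcdiag excerpts posted in this thread, so treat them as assumptions and adjust them if your verbose output is worded differently.

```python
import re
from collections import Counter

# Standard Win32 messages for the RPC status codes that appear in this thread.
RPC_ERRORS = {
    1722: "The RPC server is unavailable.",
    1726: "The remote procedure call failed.",
    1727: "The remote procedure call failed and did not execute.",
    1818: "The remote procedure call was cancelled.",
}


def summarize(text):
    """Return (failed tests, replication error counts, event ID counts)."""
    failed = re.findall(r"(\S+) failed test (\S+)", text)
    errors = Counter(
        int(code) for code in re.findall(r"generated an error \((\d+)\)", text)
    )
    # The low 16 bits of the hex EventID are the event ID shown in Event Viewer.
    events = Counter(
        int(h, 16) & 0xFFFF for h in re.findall(r"EventID: (0x[0-9A-Fa-f]+)", text)
    )
    return failed, errors, events


# A few lines copied from the output posted above, used as sample input.
sample = """\
The replication generated an error (1727):
Win32 Error 1727
The replication generated an error (1722):
Win32 Error 1722
An Error Event occured. EventID: 0xC000051F
An Warning Event occured. EventID: 0x80000749
......................... CENTRAL-TECH01 failed test kccevent
"""

failed, errors, events = summarize(sample)
for code, count in sorted(errors.items()):
    print(code, count, RPC_ERRORS.get(code, "unknown code"))
print("failed tests:", failed)
print("event IDs:", sorted(events))
```

For the real file, replace the sample with the contents of output.txt. In the output posted above, the decoded event IDs 1311 and 1865 are the same KCC errors mentioned in the original question.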
ASKER
nslookup works and resolves names fine, but when I run it from a command prompt on the server at the remote site it starts like this:
default server: unknown
address: 192.168.20.20
At the head office site it is correct. Not sure if this is relevant. It seems to suggest that the DNS server doesn't know its own name?
"Default server: unknown" means that it's missing a Reverse Lookup zone for 192.168.20.x.
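When nslookup starts it does a reverse (PTR) lookup on the DNS server's own IP address to get a name to display, which is why a missing reverse zone shows up as "unknown". As a small illustration, this Python sketch works out which PTR record and which reverse zone that lookup needs for 192.168.20.20. The /24 zone boundary is an assumption based on the "192.168.20.x" wording above.

```python
import ipaddress


def ptr_name(ip):
    """Return the PTR record name that a reverse lookup of this IP asks for."""
    return ipaddress.ip_address(ip).reverse_pointer


def reverse_zone_24(ip):
    """Return the reverse zone name, assuming the subnet is a /24."""
    return ptr_name(ip).split(".", 1)[1]  # drop the host octet


print(ptr_name("192.168.20.20"))         # 20.20.168.192.in-addr.arpa
print(reverse_zone_24("192.168.20.20"))  # 20.168.192.in-addr.arpa
```

If that zone does not exist on the DNS server, or has no PTR record for the server's address, nslookup has no name to show. This is cosmetic for nslookup itself and does not by itself explain the replication failures.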
ASKER CERTIFIED SOLUTION
ASKER
DCs are not SP1.
Time sync is OK - both are near enough the right time (about 2 seconds difference).
ASKER
The RPC MTU issue sounds like it could be the problem but, as I say, the two servers we are looking at here do not have the service pack installed.
They are both Windows Server 2003 Standard Edition.
Netdiag from our head office bridgehead server produces the following:
....................................
Computer Name: CENTRAL-TECH01
DNS Host Name: central-tech01.rmicentral.com
System info : Windows 2000 Server (Build 3790)
Processor : x86 Family 15 Model 2 Stepping 9, GenuineIntel
List of installed hotfixes :
KB282010
KB817789
KB819696
KB823182
KB823353
KB823559
KB824105
KB824141
KB824146
KB824151
KB825119
KB828035
KB828741
KB828750
KB832894
KB833987
KB834707
KB835732
KB837001
KB837009
KB839643
KB839645
KB840315
KB840374
KB840987
KB841356
KB841533
KB842773
KB867282
KB867460
KB871250
KB873333
KB873376
KB885250
KB885834
KB885835
KB885836
KB886903
KB888113
KB890047
KB890175
KB890859
KB890923
KB891711
KB891781
KB893066
KB893086
KB893803
KB893803v2
Q147222
Q828026
Netcard queries test . . . . . . . : Passed
Per interface results:
Adapter : Lan 1
Netcard queries test . . . : Passed
Host Name. . . . . . . . . : central-tech01
IP Address . . . . . . . . : 192.168.0.26
Subnet Mask. . . . . . . . : 255.255.255.0
Default Gateway. . . . . . : 192.168.0.250
Dns Servers. . . . . . . . : 192.168.0.26
AutoConfiguration results. . . . . . : Passed
Default gateway test . . . : Passed
NetBT name test. . . . . . : Passed
[WARNING] At least one of the <00> 'WorkStation Service', <03> 'Messenger Service', <20> 'WINS' names is missing.
WINS service test. . . . . : Skipped
There are no WINS servers configured for this interface.
Global results:
Domain membership test . . . . . . : Passed
NetBT transports test. . . . . . . : Passed
List of NetBt transports currently configured:
NetBT_Tcpip_{CF20C948-BEA2-4F8C-A790-BE6D02FDC2F3}
1 NetBt transport currently configured.
Autonet address test . . . . . . . : Passed
IP loopback ping test. . . . . . . : Passed
Default gateway test . . . . . . . : Passed
NetBT name test. . . . . . . . . . : Passed
[WARNING] You don't have a single interface with the <00> 'WorkStation Service', <03> 'Messenger Service', <20> 'WINS' names defined.
Winsock test . . . . . . . . . . . : Passed
DNS test . . . . . . . . . . . . . : Passed
PASS - All the DNS entries for DC are registered on DNS server '192.168.0.26' and other DCs also have some of the names registered.
Redir and Browser test . . . . . . : Passed
List of NetBt transports currently bound to the Redir
NetBT_Tcpip_{CF20C948-BEA2-4F8C-A790-BE6D02FDC2F3}
The redir is bound to 1 NetBt transport.
List of NetBt transports currently bound to the browser
NetBT_Tcpip_{CF20C948-BEA2-4F8C-A790-BE6D02FDC2F3}
The browser is bound to 1 NetBt transport.
DC discovery test. . . . . . . . . : Passed
DC list test . . . . . . . . . . . : Passed
Trust relationship test. . . . . . : Passed
Secure channel for domain 'RMICENTRAL' is to '\\CENTRAL-MAIN.rmicentral.com'.
Kerberos test. . . . . . . . . . . : Passed
LDAP test. . . . . . . . . . . . . : Passed
[WARNING] Failed to query SPN registration on DC 'OAS01.rmicentral.com'.
[WARNING] Failed to query SPN registration on DC 'vera01.rmicentral.com'.
[WARNING] Failed to query SPN registration on DC 'HEVER-MAIN.rmicentral.com'.
Bindings test. . . . . . . . . . . : Passed
WAN configuration test . . . . . . : Skipped
No active remote access connections.
Modem diagnostics test . . . . . . : Passed
IP Security test . . . . . . . . . : Skipped
Note: run "netsh ipsec dynamic show /?" for more detailed information
The command completed successfully
Both servers are running Windows Server 2003 Standard Edition.
Netdiag from our head office bridgehead server produces the following:
..........................
Computer Name: CENTRAL-TECH01
DNS Host Name: central-tech01.rmicentral.
System info : Windows 2000 Server (Build 3790)
Processor : x86 Family 15 Model 2 Stepping 9, GenuineIntel
List of installed hotfixes :
KB282010
KB817789
KB819696
KB823182
KB823353
KB823559
KB824105
KB824141
KB824146
KB824151
KB825119
KB828035
KB828741
KB828750
KB832894
KB833987
KB834707
KB835732
KB837001
KB837009
KB839643
KB839645
KB840315
KB840374
KB840987
KB841356
KB841533
KB842773
KB867282
KB867460
KB871250
KB873333
KB873376
KB885250
KB885834
KB885835
KB885836
KB886903
KB888113
KB890047
KB890175
KB890859
KB890923
KB891711
KB891781
KB893066
KB893086
KB893803
KB893803v2
Q147222
Q828026
Netcard queries test . . . . . . . : Passed
Per interface results:
Adapter : Lan 1
Netcard queries test . . . : Passed
Host Name. . . . . . . . . : central-tech01
IP Address . . . . . . . . : 192.168.0.26
Subnet Mask. . . . . . . . : 255.255.255.0
Default Gateway. . . . . . : 192.168.0.250
Dns Servers. . . . . . . . : 192.168.0.26
AutoConfiguration results. . . . . . : Passed
Default gateway test . . . : Passed
NetBT name test. . . . . . : Passed
[WARNING] At least one of the <00> 'WorkStation Service', <03> 'Messenger Service', <20> 'WINS' names is missing.
WINS service test. . . . . : Skipped
There are no WINS servers configured for this interface.
Global results:
Domain membership test . . . . . . : Passed
NetBT transports test. . . . . . . : Passed
List of NetBt transports currently configured:
NetBT_Tcpip_{CF20C948-BEA2
1 NetBt transport currently configured.
Autonet address test . . . . . . . : Passed
IP loopback ping test. . . . . . . : Passed
Default gateway test . . . . . . . : Passed
NetBT name test. . . . . . . . . . : Passed
[WARNING] You don't have a single interface with the <00> 'WorkStation Service', <03> 'Messenger Service', <20> 'WINS' names defined.
Winsock test . . . . . . . . . . . : Passed
DNS test . . . . . . . . . . . . . : Passed
PASS - All the DNS entries for DC are registered on DNS server '192.168.0.26' and other DCs also have some of the names registered.
Redir and Browser test . . . . . . : Passed
List of NetBt transports currently bound to the Redir
NetBT_Tcpip_{CF20C948-BEA2
The redir is bound to 1 NetBt transport.
List of NetBt transports currently bound to the browser
NetBT_Tcpip_{CF20C948-BEA2
The browser is bound to 1 NetBt transport.
DC discovery test. . . . . . . . . : Passed
DC list test . . . . . . . . . . . : Passed
Trust relationship test. . . . . . : Passed
Secure channel for domain 'RMICENTRAL' is to '\\CENTRAL-MAIN.rmicentral
Kerberos test. . . . . . . . . . . : Passed
LDAP test. . . . . . . . . . . . . : Passed
[WARNING] Failed to query SPN registration on DC 'OAS01.rmicentral.com'.
[WARNING] Failed to query SPN registration on DC 'vera01.rmicentral.com'.
[WARNING] Failed to query SPN registration on DC 'HEVER-MAIN.rmicentral.com
Bindings test. . . . . . . . . . . : Passed
WAN configuration test . . . . . . : Skipped
No active remote access connections.
Modem diagnostics test . . . . . . : Passed
IP Security test . . . . . . . . . : Skipped
Note: run "netsh ipsec dynamic show /?" for more detailed information
The command completed successfully
ASKER
In the end I had to go to Microsoft to fix this. I was given the MS05-019 patch, hotfixes 899148 and 898060, and the following registry fix:
1. Click "Start", click "Run", type "regedit" (without the quotation marks), and then click "OK"
2. Locate and then click the following registry subkey:
HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\Windows NT\Rpc
3. Click the "Edit" menu, point to "New", and then click "DWORD Value".
4. Type "Server2003NegotiateDisable" (without the quotation marks) as the name of the new DWORD Value.
5. Right-click "Server2003NegotiateDisable", and then click "Modify".
6. In the "Value Data" box, type "1", and then click "OK".
7. Quit Registry Editor. Restart the Windows Server 2003-based computer.
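For anyone who has to apply this to several DCs, the steps above could probably be captured as a .reg file rather than clicked through each time. The Python below is only a rough sketch of that idea, not something Microsoft supplied: `build_reg_file`, `RPC_POLICY_KEY` and the output file name are names I made up for illustration. It just generates the text of a .reg file that sets the same DWORD value; you would still need to import it on each server and restart.

```python
# Sketch only: generates .reg text equivalent to the manual steps above.
# build_reg_file, RPC_POLICY_KEY and the output file name are hypothetical
# names chosen for this example.

RPC_POLICY_KEY = r"HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\Windows NT\Rpc"


def build_reg_file(key_path: str, value_name: str, dword: int) -> str:
    """Return .reg file text that sets a single DWORD value under key_path."""
    if not 0 <= dword <= 0xFFFFFFFF:
        raise ValueError("DWORD must fit in 32 bits")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key_path}]",
        # .reg files store DWORDs as eight hexadecimal digits
        f'"{value_name}"=dword:{dword:08x}',
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    text = build_reg_file(RPC_POLICY_KEY, "Server2003NegotiateDisable", 1)
    # newline="" keeps the CRLF line endings exactly as built above
    with open("rpc_negotiate_disable.reg", "w", encoding="utf-16", newline="") as f:
        f.write(text)
    print(text)
```

Check the generated file by eye before importing it, and test on one server first.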
http://support.microsoft.com/?id=898060 should also give an idea of what happened.
I pointed out to MS that this was an issue caused by MS patches and suggested that support fees should be waived. They agreed, but I wonder whether, had I not said anything, they might just have taken my money anyway...