Solved

Active Directory Replication & DFS Not Working

Posted on 2016-10-24
Last Modified: 2016-10-26
I have a domain with three domain controllers at three different sites. Two of the domain controllers have DFS set up between them.

The servers are:

WIN-ADC-SRVR1, which is in the main office, holds the FSMO roles and has DFS set up to WIN-ADC-SRVR2
WIN-ADC-SRVR2, which is in the factory and has DFS set up to WIN-ADC-SRVR1
WIN-ADC-SRVR3, which is located in another office and is used just for local logon authentication

Since Sunday, AD replication has stopped working between WIN-ADC-SRVR1 and WIN-ADC-SRVR2, and so has DFS replication on the data folder that is set up between them. Active Directory replication does seem to be working between WIN-ADC-SRVR1 and WIN-ADC-SRVR3.

I have checked DNS and it all seems to be working: the servers can resolve each other by IP address, by FQDN, and even by the SID listed in the DNS server.

Nothing has changed with the VPN links between the sites. Although the links are just ADSL, the amount of data sent between the office and the factory server is very small, and there has never been a problem of this scale.

If anyone can provide any information that could help me with this issue, it would be greatly appreciated.  If logs are needed please let me know.

I have already attached the dcdiag log, as well as the dcdiagDNS and repadmin tests from each server.

Thanks Andy
LOGS.zip
Question by:AndyBooker1
 
LVL 39

Expert Comment

by:footech
ID: 41857866
I would investigate the errors you're seeing in regards to KccEvent.  Dig down in the details and result codes from the replication errors (those should also be in the event logs).

If I were you I would configure the NIC settings on each DC to use itself as preferred DNS, and another DC as alternate.  Normally I'd recommend putting another as preferred and itself as alternate, but you've only got one DC per site.
 

Author Comment

by:AndyBooker1
ID: 41857933
The NIC setting on the DC at every site is set to 127.0.0.1, which is what it has always been.

I have had a look at the errors in the logs, and most of them say that the remote procedure call has failed, but the service is running on all of the servers and is set to automatic. When I googled the issue, the results mostly talked about DNS.

DNS is working fine, as I can ping each of the servers by name and it resolves. I can also ping each server by its SID in DNS, which also resolves to the correct IP address.
 
LVL 39

Expert Comment

by:footech
ID: 41858577
That doesn't change my recommendation.

Just a minor detail, but it's a GUID for the DC which is in DNS, not a SID.

Everything I've seen so far points to RPC not working between WIN-ADC-SRVR1 and WIN-ADC-SRVR2.  You can try using PortQry from Microsoft to test.  Also try this command from WIN-ADC-SRVR1
repadmin /bind WIN-ADC-SRVR1 (and the reverse).  Since WIN-ADC-SRVR3 doesn't seem to have any trouble communicating, that would rule out a service issue.  It is more likely that a firewall or other connection issue is involved.

It might be helpful if you post the actual event IDs (and their sources) which you are seeing.

Also, it appears that you don't have all sites defined in Sites and Services.
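
As a rough companion to the port test suggested above, here is a minimal Python sketch that checks whether one DC accepts TCP connections from another. It is not a replacement for PortQry or repadmin. The port list is an assumption for illustration (135 for the RPC endpoint mapper, 389 for LDAP); AD replication also uses dynamically assigned RPC ports, which this sketch does not probe. The function names are hypothetical.

```python
import socket

# Assumed ports for illustration: 135 = RPC endpoint mapper, 389 = LDAP.
# Dynamically assigned RPC ports are not covered by this sketch.
DEFAULT_PORTS = (135, 389)


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_host(host, ports=DEFAULT_PORTS, timeout=3.0):
    """Map each port to True/False depending on whether it accepted a connection."""
    return {port: port_open(host, port, timeout) for port in ports}
```

Calling check_host("WIN-ADC-SRVR2") on WIN-ADC-SRVR1, and then the reverse, would show which of the listed ports are reachable in each direction. A False result only says the TCP connection failed; it does not say why.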
 

Author Comment

by:AndyBooker1
ID: 41858713
Hi,

Thanks for the correction of GUID not SID.  

I have run repadmin /bind WIN-ADC-SRVR1 on win-adc-srvr1 and have run the same command on win-adc-srvr2.
I will post an update if anything changes.  I did also think it might be a firewall issue, but the firewall is disabled on all servers via GP and the service isn't even running.

You also mention that I don't have all the sites defined. There are three: 1. Office, 2. Factory, 3. Hornchurch. As far as I can see these are all defined as they should be, but please correct me if I am wrong.
 

Author Comment

by:AndyBooker1
ID: 41858728
Hi,

OK. After running repadmin /bind WIN-ADC-SRVR1 on all the servers and then running repadmin /showrepl and
repadmin /replsummary, all the servers are now replicating as they should.

As I said in my last comment, if you could explain where I have not set up Sites and Services correctly, it would be appreciated. I would also like to know what the command repadmin /bind WIN-ADC-SRVR1 does, if that's not too much trouble.

Thanks for your help.

Andy
25-10-2016-15-44-03.png
 
LVL 39

Accepted Solution

by:
footech earned 500 total points
ID: 41859011
Sorry, I meant to say "subnets", not sites.
The reason is this event in the system log:
"During the past 4.00 hours there have been 173 connections to this Domain Controller from client machines whose IP addresses don't map to any of the existing sites in the enterprise. Those clients, therefore, have undefined sites and may connect to any Domain Controller including those that are in far distant locations from the clients."

Don't you have any network/edge firewalls between the sites?

I didn't expect the repadmin /bind command to actually fix anything, but more as a test for communication.  All it does is try to establish an LDAP connection and display some info on features.
I also made a slight typo.  I meant run
repadmin /bind WIN-ADC-SRVR2 on win-adc-srvr1
and
repadmin /bind WIN-ADC-SRVR1 on win-adc-srvr2

But if all's working now, then great (and I won't worry further)!
I could only speculate that somehow communication got interrupted, and the repadmin command re-established it.  But why it wouldn't re-establish by itself I don't know.
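
To make the corrected instruction above concrete (each DC binds to the other, never to itself), here is a small Python sketch that lists which repadmin /bind command to run on which server. It only builds the command strings and does not execute anything; the helper name bind_test_plan is hypothetical.

```python
from itertools import permutations


def bind_test_plan(domain_controllers):
    """Return (run_on, command) pairs covering every ordered pair of distinct DCs."""
    return [
        (source, f"repadmin /bind {target}")
        for source, target in permutations(domain_controllers, 2)
    ]


plan = bind_test_plan(["WIN-ADC-SRVR1", "WIN-ADC-SRVR2", "WIN-ADC-SRVR3"])
for run_on, command in plan:
    print(f"on {run_on}: {command}")
```

With three DCs this yields six checks, including the two named in the correction above.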
 

Author Comment

by:AndyBooker1
ID: 41859136
Hi,

The subnets are defined in Sites and Services (see the attached screenshot), so I'm not too sure what you mean. If there is anything missing, can you point it out? I would like the configuration to be correct. This has always been as it is and has not changed.

There is a firewall at each site, on the Cisco routers that also provide the VPN tunnels.  But on the site-to-site links themselves there is no firewall protection.

If you can elaborate on the Sites and Services point, it would be greatly appreciated. I will then close off the question.

Thanks very much for your help; I was getting to the stage where I was at a bit of a loss.  Let's hope this is a permanent fix and it doesn't break in a couple of weeks' time, which I have seen before.
25-10-2016-19-12-17.png
 
LVL 39

Assisted Solution

by:footech
footech earned 500 total points
ID: 41859706
I can't tell you any more than the event text.  You would have to examine the log file to see what the IPs are that are connecting.  If those IPs should always try to authenticate to a DC in a particular site, then define the subnet and attach it to the site.  Just by selecting "Subnets" in the left pane in Sites and Services you can see in the right pane which site they are attached to.
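
Once the connecting IPs have been pulled from the log, the comparison against the defined subnets can be sketched in Python as below. The subnet ranges are hypothetical placeholders (only the site names Office, Factory and Hornchurch come from this thread), and the sketch assumes that the most specific matching subnet decides the site.

```python
import ipaddress

# Hypothetical subnet-to-site mapping; replace with the subnets actually
# listed under "Subnets" in Sites and Services.
SITE_SUBNETS = {
    "192.168.1.0/24": "Office",
    "192.168.2.0/24": "Factory",
    "192.168.3.0/24": "Hornchurch",
}


def find_site(ip, site_subnets):
    """Return the site of the most specific subnet containing ip, or None."""
    address = ipaddress.ip_address(ip)
    best = None
    for cidr, site in site_subnets.items():
        network = ipaddress.ip_network(cidr)
        if address in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, site)
    return best[1] if best else None


def unmapped_clients(client_ips, site_subnets):
    """Return the distinct client IPs that fall outside every defined subnet."""
    missing = {ip for ip in client_ips if find_site(ip, site_subnets) is None}
    return sorted(missing, key=ipaddress.ip_address)
```

Any address returned by unmapped_clients is a candidate for a new subnet definition attached to the appropriate site.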
 

Author Closing Comment

by:AndyBooker1
ID: 41860388
Thanks for your input on this issue. Your help has been gratefully received and is appreciated.
 
LVL 39

Expert Comment

by:footech
ID: 41860561
Glad things are working for you.
