Solved

Intersite AD Replication Issue

Posted on 2011-04-27
Last Modified: 2012-06-27
I have a site that is connected to our main network via VPN.
I have configured the subnets for the site and everything seemed to be working fine. Foolishly, I never checked the replication status, and now when I run DCDIAG on the site domain controller I get latency errors and also warnings that the Domain Owner, PDC Owner, RID Owner and Infrastructure Owner are not responding.

When running Netdiag on the site DC it passes but warns "Failed to query SPN registration on DC <DC1>" and "Failed to query SPN registration on DC <DC2>"

So I've obviously got a replication/topology problem. Any ideas how to resolve it?

Thanks in advance.
Question by:GregBooth
 
LVL 15

Expert Comment

by:JBond2010
ID: 35473306
Have you got Reverse DNS Zones setup in DNS? I would suggest setting up Reverse DNS Zones with PTR records pointing to all DCs.
 
LVL 5

Expert Comment

by:rajkr2020
ID: 35473308
 
LVL 7

Expert Comment

by:d3ath5tar
ID: 35473330
Assuming you can ping/traceroute to each other (i.e. your networking is sound)? If not, where does it get stuck? If you can't ping DC to DC, what about site to site generally? You might want to try pinging with a smaller packet size to see if that gets through.

If networking is fine.....

Most AD issues are caused by DNS. Are you getting any errors relating to service records? Are the DC's registered correctly on both DNS servers?

It may also be an RPC endpoint issue. Try logging on to the DC consoles locally, open ntdsutil and try to connect to the DC's NTDS instance:

run NTDSUtil
type Metadata cleanup, press enter
type Connections, press enter
type Connect to server localhost

Do you get a BindDSw error?

 

Author Comment

by:GregBooth
ID: 35473401
DNS and reverse DNS are set up, pinging the site works with no problem, and the VPN is working with no problem.
When I run the NTDSUTIL on the remote site DC and connect to localhost it connects ok. If I run NTDSUTIL on the site DC and connect to my DC at head office it says "The RPC server is unavailable"

:(
 
LVL 15

Expert Comment

by:JBond2010
ID: 35473654
Have you checked the firewalls on the DCs to make sure they're not blocking communication?
 
LVL 15

Expert Comment

by:JBond2010
ID: 35473664
From a command prompt you could run netstat -a to make sure the DCs are listening on the RPC port (the RPC endpoint mapper uses TCP 135).
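As a rough illustration of that check, the sketch below scans saved netstat output for TCP ports in the LISTENING state. It assumes the usual English-language Windows column layout (protocol, local address, foreign address, state); the function name `listening_tcp_ports` is made up for this example, and localized Windows builds may print a different state word.

```python
def listening_tcp_ports(netstat_text):
    """Return the set of local TCP ports shown as LISTENING in netstat output.

    Assumes each relevant line looks like:
    <proto> <local address:port> <foreign address:port> <state> [pid]
    """
    ports = set()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Skip headers, UDP rows (no state column) and non-listening TCP rows.
        if len(fields) < 4 or fields[0].upper() != "TCP":
            continue
        if fields[3].upper() != "LISTENING":
            continue
        # The port is whatever follows the last colon (works for [::]:135 too).
        port = fields[1].rsplit(":", 1)[-1]
        if port.isdigit():
            ports.add(int(port))
    return ports
```

Saving the output with something like `netstat -an > netstat.txt` and then testing `135 in listening_tcp_ports(open("netstat.txt").read())` would show whether the endpoint mapper is listening locally. It says nothing about whether the port is reachable across the VPN.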
 

Author Comment

by:GregBooth
ID: 35473810
Re firewalls: I have an IPsec VPN configured between a DrayTek Vigor2710 ADSL router and a Billion BiGuard S10 firewall. I am assuming that RPC will be tunnelled through the VPN and not be filtered. Is that a wrong assumption?
 
LVL 15

Expert Comment

by:JBond2010
ID: 35473829
That's correct. I was referring to the firewalls on the DCs "Windows Firewall".
 
LVL 15

Expert Comment

by:JBond2010
ID: 35473851
Have you any Stale Domain Controllers in your Domain that you might have demoted recently?
 
LVL 7

Expert Comment

by:d3ath5tar
ID: 35473857
Has the ISTG (Inter-Site Topology Generator) generated/removed the replication links in Sites and Services?
 

Author Comment

by:GregBooth
ID: 35473861
When I go to AD Sites and Services on the site DC and try to manually start replication to the DC in Default-First-Site, I get "The naming context is in the process of being removed or is not replicated from the specified server".

For some reason it seems that the replication issues are caused by the VPN connection.
 
LVL 7

Expert Comment

by:d3ath5tar
ID: 35473867
If you do a "net share" on each DC are the replication shares (sysvol etc) listed?
 
LVL 15

Expert Comment

by:JBond2010
ID: 35473896
In Active Directory Sites and Services you should move each DC to its correct site and subnet. They should not be left in Default-First-Site.
 

Author Comment

by:GregBooth
ID: 35473929
The site DC is in its own site with the correct subnet. The HQ domain controllers are in Default-First-Site with the correct subnet. According to the Microsoft info I found, this should not be a problem. Is it?
 
LVL 7

Expert Comment

by:d3ath5tar
ID: 35473938
Each DC should be in its own site with its own subnet assigned.... if the DC is promo'd onsite this should happen automatically... if not it will require moving...
 
LVL 15

Expert Comment

by:JBond2010
ID: 35474107
@GregBooth, would it be possible for you to upload the output from the DCDIAG report? Also, can you try using a combination of portqry and telnet to check Active Directory connectivity between the 2 sites on ports such as 389 and 135?


Thank you,

JBond2010
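Where portqry or telnet is not to hand, the same kind of TCP reachability test can be sketched in Python. This is only an illustration: the port list (135 for the RPC endpoint mapper, 389 for LDAP, 3268 for the Global Catalog) is taken from ports mentioned in this thread, the function names are made up for the example, and a successful TCP connect only shows that something accepts connections on that port, not that replication will work.

```python
import socket

# Ports mentioned in this thread; adjust as needed (assumption, not a full AD port list).
AD_PORTS = (135, 389, 3268)


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_ports(host, ports=AD_PORTS, timeout=3.0):
    """Map each port to True (connect succeeded) or False (it did not)."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}
```

Calling `check_ports("hqdc1.example.local")` from the site DC (the host name here is hypothetical) and then running the same check in the other direction would show whether the VPN passes these ports both ways.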
 

Author Comment

by:GregBooth
ID: 35474110
When doing the following

portqry -n <problem_server> -o 1094,1025,1029,6004
Each Port Query returns NOT LISTENING

To me this is more evidence that it's an RPC issue and that RPC traffic is not getting through my VPN?
 
LVL 15

Expert Comment

by:JBond2010
ID: 35474125
This is definitely the problem.
 
LVL 15

Expert Comment

by:JBond2010
ID: 35474133
If you have any filters applied try disabling them.
 
LVL 15

Expert Comment

by:JBond2010
ID: 35474192
Also, check the Windows Firewall on the DCs and make sure AD ports are not blocked.
 

Author Comment

by:GregBooth
ID: 35474291
Windows Firewall is not an issue. Where are filters applied?
The site DC was set up at HQ prior to going on site, and there were no issues when it was at HQ.

Thanks everyone for their help so far.
 
LVL 15

Expert Comment

by:JBond2010
ID: 35474334
Is the Windows Firewall on the DCs enabled? If so, turn it off to test the replication. Also, were there any changes made on the router/firewall VPN? This is where you need to check if there are any filters applied. Check any firewall rules or filters on your router/firewall and see if they are blocking the AD ports.
 

Author Comment

by:GregBooth
ID: 35475412
Windows Firewall disabled, no filters on firewall/router.

When checking Operations Masters in ADUC, the RID, PDC and Infrastructure roles are marked as ERROR.

Starting to lose the will to live. LOL
 
LVL 15

Expert Comment

by:JBond2010
ID: 35475442
Can you run DCdiag on both DCs and upload the output information from both.


Thank you,

JBond2010
 

Author Comment

by:GregBooth
ID: 35475655
I've attached the DCDIAG outputs from my 3 DCs.

I have 2 DCs at HQ: one holds the PDC and RID roles, is a GC, etc. and is the main DC; the second DC is there in case DC1 is ever down.

I then have the DC that is connected via VPN to our remote Site.

 HQDC1.log HQDC2.log site-dc.log
 
LVL 15

Expert Comment

by:JBond2010
ID: 35475732
OK GregBooth, I have found the source of the problem by looking through site-dc.log. The issue is this error:

This latency is over the Tombstone Lifetime of 60 days!
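The warning means the time since the last successful replication is longer than the tombstone lifetime. A minimal sketch of that comparison follows, assuming the 60-day lifetime quoted in the DCDIAG output; the function name and the dates in the usage note are purely illustrative, not values from the attached logs.

```python
from datetime import datetime, timedelta

# Tombstone lifetime quoted in the DCDIAG message above.
TOMBSTONE_LIFETIME_DAYS = 60


def exceeds_tombstone_lifetime(last_success, now,
                               lifetime_days=TOMBSTONE_LIFETIME_DAYS):
    """Return True if the gap since the last successful replication
    is longer than the tombstone lifetime."""
    return (now - last_success) > timedelta(days=lifetime_days)
```

For instance, `exceeds_tombstone_lifetime(datetime(2011, 1, 1), datetime(2011, 4, 27))` is true because the gap is 116 days. A DC in that state is generally refused as a replication partner, which is why simply restoring connectivity may not be enough on its own.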
 
LVL 15

Assisted Solution

by:JBond2010
JBond2010 earned 600 total points
ID: 35475871
You can try to force replication by editing the registry and creating a DWORD value. I still think this may be a problem with the router/firewall; I think it may be dropping packets. What is the MTU size on the routers/firewalls on both sides? The reason I suspect the router/firewall is that you cannot portqry or telnet to the AD ports.

Refer to the 2 links below and see whether they provide any help.

http://technet.microsoft.com/en-us/library/cc757610(WS.10).aspx

http://www.petri.co.il/forums/showthread.php?t=8827

Please exercise caution if you make any changes to the registry. Make sure to do a backup of the registry ok?
 

Author Comment

by:GregBooth
ID: 35481216
MTU size on both sides is 1500. I assume I'd have to make the MTU registry change on all DCs?

Thanks.
 

Author Comment

by:GregBooth
ID: 35481230
Pinging all my DCs works with packet sizes up to 1472 bytes.
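A ping payload of 1472 bytes lines up with a 1500-byte MTU, because each ICMP echo also carries an 8-byte ICMP header and a 20-byte IPv4 header (without options). The arithmetic is sketched below; the function names are made up for the example. The result only says something about the path if the pings were sent with the Don't Fragment flag set (on Windows, `ping -f -l 1472 <host>`); otherwise larger packets may simply have been fragmented along the way.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def max_ping_payload(mtu):
    """Largest ICMP echo payload that fits in one unfragmented packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


def mtu_from_ping_payload(payload):
    """Packet size on the wire (IP level) for a given ping payload."""
    return payload + IPV4_HEADER_BYTES + ICMP_HEADER_BYTES


print(max_ping_payload(1500))       # 1472
print(mtu_from_ping_payload(1472))  # 1500
```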
 

Author Comment

by:GregBooth
ID: 35481249
When I portqry 3268 and 3269 locally on my site DC, they are NOT LISTENING. I'm assuming it needs to listen on 3268 and 3269 for AD replication?
 
LVL 15

Expert Comment

by:JBond2010
ID: 35482286
Ports 3268 and 3269 are for the Global Catalog. Are the DCs Global Catalogs?
 

Author Comment

by:GregBooth
ID: 35483704
portqry 135 to my site DC is now LISTENING as are the GC ports.

I decided to try to demote Site DC and then Promote again but when running DCPROMO it fails with "Logon Failure: The target account name is incorrect"
 
LVL 15

Expert Comment

by:JBond2010
ID: 35483747
The fact that you can portqry 135 and GC ports is good news.
 
LVL 15

Expert Comment

by:JBond2010
ID: 35484336
To get around the issue of "Logon Failure: The target account name is incorrect" wait for replication to complete. Replication should now be working because the ports are responding.
 

Author Comment

by:GregBooth
ID: 35698032
Still no joy replicating.

DCDIAG is still displaying Tombstone warnings, and the Domain Owner, PDC Owner, RID Owner and Infrastructure Owner are still not responding!

:-(

I can't do a DC promo because it's not replicated... should I demote it and specify it as last Server in Forest? Then manually delete it from the HQ DC's?
 
LVL 7

Accepted Solution

by:d3ath5tar
d3ath5tar earned 1400 total points
ID: 35702326
If you specify it as the last server in the forest, that will destroy your domain/forest.

Can you confirm that you can ping all your DCs from each other's sites (regardless of other errors)?
Can you confirm that there are definitely no firewalls blocking the required ports for Active Directory replication?
Does the FSMO holder with the PDC role etc. think it is in good shape itself?

If you can confirm all of the above, then I would consider killing your non-PDC server. Just kill it dead. If you demote it, by the sounds of things it won't remove itself from the other servers anyway. Follow this link for removing dead DCs. Do this on your FSMO server to remove your non-FSMO server.
http://www.petri.co.il/delete_failed_dcs_from_ad.htm

Then rebuild fresh (new name etc if possible) to a flat domain member.
Make sure in AD Sites and Services both sites are setup with their own site supporting their own local subnet.
Then DCPROMO the rebuilt box back in as a DC in the existing domain.