Solved

2008 R2 DFS Replication Error

Posted on 2012-05-11
Last Modified: 2012-05-22
I am getting this error on my 2008 R2 boxes.  They are all connected over VPNs.  I have applied all of the hotfixes that are currently available, and none of them has fixed the issue.

The DFS Replication service is stopping communication with partner Server1 for replication group mydomain\shares\documents due to an error. The service will retry the connection periodically.

Additional Information:
Error: 1726 (The remote procedure call failed.)
Connection ID: 2875A428-38DA-4E70-AC7E-F9F234E5F163
Replication Group ID: 9FBCE96A-0E91-438E-94C1-C0EB3F83BF8B

I am aware that the problem may be in my network.  We have Cisco 1811s doing the WAN VPN.  The only timeout policy in the routers is

ip http timeout-policy idle 60 life 86400 requests 10000

The service is crashing every minute, so does the 60 here refer to seconds or minutes?  If this refers to seconds then this could be the culprit.

The weird thing is, there were never any issues when we were on 2003 R2.  We recently began upgrading all our servers to Server 2008 R2.

Any ideas what may be the root cause here?
Question by:considerscs
15 Comments
 
LVL 31

Expert Comment

by:Rich Weissler
ID: 37960717
I don't have the answer, but looking at what you have -- let's see if we can't find an answer.

I don't think the http timeout policy should be affecting this.  We're looking at an RPC timeout... I think if the PIX configuration was working in 2003, it should be working now.

There are some folks over on the technet forums who indicate that the problem could be a permission problem on the DFS Computer object in Active Directory, which might be worth a look.

You've been upgrading the DFS servers to 2008R2.  Have you already upgraded the domain controllers as well?
 
LVL 1

Author Comment

by:considerscs
ID: 37964762
The domain is 2008 still.  We plan to move that but much later on.  

But we do have 2008 R2 domain controllers in another company and their 2008 R2 DFS servers are having the same problem.

And the weird thing is, that it is only on the 2008 R2 that this is happening.  The ones that are still 2003 R2 do not have this issue.

I had already tried the suggestion from that post.  All the permissions are correct on the computer object in Active Directory.

The boxes replicate sometimes, but very slowly.  Then when I get the error, DFS crashes out and sits idle and does not replicate even though it says it is.
 
LVL 31

Expert Comment

by:Rich Weissler
ID: 37964954
Okay, additional information collection... not suggesting any changes...

I assume because you still have some 2003 R2 DFS servers that the namespaces haven't been migrated to 2008 yet.

Was the upgrade an in-place upgrade of the operating system to the new version, or a fresh install of Windows 2008 R2?

I think the 1726 error is a symptom.  It's letting you know it can't communicate, and it would make sense for that instance to stop replicating at that point, because it has said it can't... but the service will continue to try periodically.  Is it possible there is a corresponding message on the other server, or another message in the system or application log?  Or even an audit failure in the security log on either server participating in replication?
 
LVL 1

Author Comment

by:considerscs
ID: 37965002
This was a fresh install of 2008 R2.  The old folder target and replication membership was removed prior to disjoining the old box from the domain.

Below is the error I receive on the Hub member.  The Hub member is also a 2008 R2 server.

The DFS Replication service is stopping communication with partner server2 for replication group mycompany\shares\documents due to an error. The service will retry the connection periodically.
 
Additional Information:
Error: 1727 (The remote procedure call failed and did not execute.)
Connection ID: 1AC65D8D-5CF5-4670-A0D9-F5F9532C8F32
Replication Group ID: 9FBCE96A-0E91-438E-94C1-C0EB3F83BF8B
 
LVL 31

Assisted Solution

by:Rich Weissler
Rich Weissler earned 2000 total points
ID: 37965107
http://support.microsoft.com/kb/832017
See the 'Distributed File System Replication' section.

Is it possible that tcp/5722 is being blocked or filtered on the router/firewall/vpn boxes?  

One other possible change between 2003 and 2008 -- apparently in 2003, it would start its random port allocation at tcp/1024, and count up from there.  2008 starts at tcp/49152.  (And there are pointers to instructions on changing/customizing that port range at the bottom of the Microsoft support document.)  Could the router/firewall/vpn be intercepting the traffic?
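
One quick way to rule the ports in or out is a TCP connect test from one replication partner to the other. Below is a minimal Python sketch of such a check. The function names and the default port list are illustrative assumptions, not part of any Microsoft tool: 135 is the RPC endpoint mapper, and 5722 is the port mentioned above. A successful connect only shows that something is listening and that the path is open; a failure does not by itself tell you whether the port is filtered or simply has no listener.

```python
import socket

# Assumed default list for illustration: the RPC endpoint mapper (135)
# and the DFSR port mentioned in the comment above (5722).
DEFAULT_PORTS = (135, 5722)


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_ports(host, ports=DEFAULT_PORTS, timeout=3.0):
    """Return a dict mapping each port to True/False reachability."""
    return {port: port_is_reachable(host, port, timeout) for port in ports}
```

For example, `check_ports("server2.mycompany")` run on the spoke would return a dictionary of port numbers to True/False. If a custom dynamic RPC range has been configured, ports from that range could be passed in as well.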
 
LVL 1

Author Comment

by:considerscs
ID: 37965314
The firewalls are all disabled due to software requirements by our EMR.  But I went in and put rules into Windows Firewall on the servers for 5722 just in case.

I am waiting to see if it continues to throw the error.  For now I am getting the following error.

The DFS Replication service failed to communicate with partner server2 for replication group mycompany\shares\documents. The partner did not recognize the connection or the replication group configuration.
 
Partner DNS Address: server2.mycompany
 
Optional data if available:
Partner WINS Address: server2
Partner IP Address: x.x.x.x
 
 
The service will retry the connection periodically.
 
Additional Information:
Error: 9026 (The connection is invalid)
Connection ID: 547F88E8-BBF4-45C3-9B6F-F7CAE91D37C4
Replication Group ID: 9FBCE96A-0E91-438E-94C1-C0EB3F83BF8B
 
LVL 1

Author Comment

by:considerscs
ID: 37965322
I ran dfsrdiag pollad on the hub member and now I am back to getting the original error.
 
LVL 31

Assisted Solution

by:Rich Weissler
Rich Weissler earned 2000 total points
ID: 37965609
Alrighty then.  I think we can probably eliminate ports and firewalls.
You mentioned you've installed all the hotfixes.  That makes me nervous.  Do you still have a list of the hotfixes which were installed?  (Not critical... probably unrelated to the problem, but could certainly be exacerbating the issue.  And I assume the hotfixes weren't applied on ALL the servers... but some... and there are others at your location and another that don't have all the hotfixes installed?)

I assume you aren't encountering any other Active Directory replication issues?

Do you have a single, or multiple AD sites?  (Are the servers across the VPN in another site from the Hub?)

>> But we do have 2008 R2 domain controllers in another company and their 2008 R2 DFS servers are having the same problem.

In the same Site/Domain/Tree/Forest?  Or different?

Just to double check -- for the site in which the DFS Hub Member server resides -- open AD Sites and Services, select that site, open the NTDS Site Settings, and check the identity of the Inter-Site Topology Generator.  Check that server for any possible communication problems, or errors in its logs.
 
LVL 1

Author Comment

by:considerscs
ID: 37965685
I have gotten it to replicate now for 30 minutes.

RPC locator service was set to manual by the Roles installation.  I set that to automatic and Started the service and it has started to replicate as it should for 30 minutes.

Then, just a couple of minutes ago, it went back to crashing every minute.

The Hub member is not showing any of the issues on its side now.  Just the spoke member is crashing every minute.
 
LVL 1

Author Comment

by:considerscs
ID: 37965729
>> Do you still have a list of the hotfixes which were installed?

Yes.  At this link http://support.microsoft.com/kb/968429

I did not apply them all to every server.  I was hesitant to apply them everywhere in case one of the hotfixes caused a problem.

No Active Directory replication issues showing in the logs.

I have a single AD site.  All other sites are connected through VPN.  Hub and spoke topology.

The other client's buildings belong to a different organization that we support.  They have their own VPN to every building and a hub and spoke topology.  Their domain sits at the Hub location.

>> Just to double check -- for the site in which the DFS Hub Member server resides -- open AD Sites and Services, select that site, open the NTDS Site Settings, and check the identity of the Inter-Site Topology Generator.  Check that server for any possible communication problems, or errors in its logs.

There is no server listed here for this site.  But this site is the domain site.  The domain server sits on the same VM box as the Hub DFS member.
 
LVL 1

Author Comment

by:considerscs
ID: 37966834
I am out of ideas on this.  I am getting this error nearly every 30 seconds now.

This successfully replicated for a little while earlier, but has since gone back to crashing.
 
LVL 31

Expert Comment

by:Rich Weissler
ID: 37969867
It's a single spoke member that is crashing every 30 seconds?  Or is every 2008 DFS server in the environment other than the Hub crashing?

Is it worth attempting to remove the namespace(s) relevant to the crashing server, and recreating them in Windows 2008?
 
LVL 1

Author Comment

by:considerscs
ID: 37970113
So far it is just the single spoke member, but it is the only 2008 R2 server in the environment other than the Hub member.

I have removed the namespace a couple different times and even uninstalled DFS on the trouble box and reinstalled, but still nothing.

I have never seen something like this continue to happen even after all the troubleshooting.
 
LVL 1

Accepted Solution

by:
considerscs earned 0 total points
ID: 37981430
I have solved this issue.

Turns out, the ISP had to come install some hardware to resolve timeout issues.  They upgraded our speed a few weeks ago, but we had not been getting anywhere near that speed, and there was a lot of network latency.

The timeouts lined up almost exactly with the errors.  Once they installed the new hardware, DFS finished replicating as it should, and has not thrown any more errors.
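
For anyone wanting to confirm the same kind of correlation, here is a rough Python sketch, assuming you have already pulled two lists of timestamps: the DFSR error events from the event log and the link timeouts from the ISP or router logs. The function name and the 30-second window are hypothetical choices for illustration, not something taken from the logs in this thread.

```python
from datetime import timedelta


def correlated_errors(error_times, timeout_times, window=timedelta(seconds=30)):
    """Return the error timestamps that have a network timeout within `window`.

    Both arguments are iterables of datetime objects. An error counts as
    correlated if any timeout falls within `window` before or after it.
    """
    timeouts = list(timeout_times)
    return [
        err for err in error_times
        if any(abs(err - t) <= window for t in timeouts)
    ]
```

If most of the DFSR errors come back as correlated, the WAN link is a much more likely culprit than the DFS configuration.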
 
LVL 1

Author Closing Comment

by:considerscs
ID: 37996136
All of the responses here were vital to help pinpoint the issue.