ITStaffLMT

DFSR Communication Error on server that's been removed from DFS

I had three servers in the mix for DFS Replication (Windows Server 2003 R2). One of the servers (SERVER1) needed to be rebuilt, so once all the files were safely in DFS and copied onto the other servers, I removed all the pointers in DFS to that server and turned off its shares. All was good from the user perspective; however, I'm getting errors in DFS on the other servers, which are still trying to access it. The server is still up and running and no changes have been made, but I want to rebuild it soon.

The error is showing up in the Health Report as :
" Communication errors are preventing replication with partner SERVER1.  
  Affected replicated folders: All replicated folders on this server.
  Description: DFS Replication cannot replicate with partner SERVER1 due to a communication error. This error can occur if the host is unreachable, or if the DFS Replication service is not running on the server. The DFS Replication service used partner DNS name SERVER1.lmtmercer.lmtproducts.com, IP address 192.168.3.22, and WINS address SERVER1 but failed with error ID: 1722 (The RPC server is unavailable.). Event ID: 5008
  Last occurred: Friday, May 22, 2009 at 8:46:27 AM (GMT-5:00)
  Suggested action: Check for network connectivity and service related problems. For troubleshooting RPC issues see RPC KB 839880 and for additional troubleshooting information, see The Microsoft Web Site. "


In Event Viewer I get the following error for each DFS share in replication:
Event Type:      Error
Event Source:      DFSR
Event Category:      None
Event ID:      5008
Date:            5/22/2009
Time:            9:19:43 AM
User:            N/A
Computer:      SERVER2
Description:
The DFS Replication service failed to communicate with partner SERVER1 for replication group dfs\share. This error can occur if the host is unreachable, or if the DFS Replication service is not running on the server.
 
Partner DNS Address: SERVER1.lmtmercer.lmtproducts.com
 
Optional data if available:
Partner WINS Address: SERVER1
Partner IP Address: 192.168.3.22
 
The service will retry the connection periodically.
 
Additional Information:
Error: 1722 (The RPC server is unavailable.)
Connection ID: 2A1D5FAE-E35E-41AD-BBA6-6DA504F1EE43
Replication Group ID: A8BABD57-0435-4F70-93FC-0979DD06B4C4

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
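For anyone comparing these IDs against what is stored in Active Directory, here is a minimal sketch that pulls the partner name, error code and the two GUIDs out of a 5008 description. The field labels come straight from the event text above; the function name parse_dfsr_5008 and the sample constant are made up for illustration, and the event text is abbreviated.

```python
import re

# Abbreviated sample copied from the Event ID 5008 description above.
EVENT_TEXT = """\
The DFS Replication service failed to communicate with partner SERVER1 for replication group dfs\\share.
Partner DNS Address: SERVER1.lmtmercer.lmtproducts.com
Partner IP Address: 192.168.3.22
Error: 1722 (The RPC server is unavailable.)
Connection ID: 2A1D5FAE-E35E-41AD-BBA6-6DA504F1EE43
Replication Group ID: A8BABD57-0435-4F70-93FC-0979DD06B4C4
"""

# Standard 8-4-4-4-12 hexadecimal GUID layout.
GUID = r"[0-9A-Fa-f]{8}(?:-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}"


def parse_dfsr_5008(text):
    """Return partner, error code and GUIDs from a 5008 description.

    Hypothetical helper: any field that is not found is returned as None.
    """
    patterns = {
        "partner": r"communicate with partner (\S+)",
        "error_code": r"Error:\s*(\d+)",
        "connection_id": rf"Connection ID:\s*({GUID})",
        "replication_group_id": rf"Replication Group ID:\s*({GUID})",
    }
    result = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, text)
        result[key] = match.group(1) if match else None
    return result


if __name__ == "__main__":
    for key, value in parse_dfsr_5008(EVENT_TEXT).items():
        print(f"{key}: {value}")
```

The Connection ID and Replication Group ID are the values to look for when checking whether a stale connection to the removed server still exists anywhere.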


I've looked in adsiedit.msc and do not find anything under DFS for any server with the SIDs listed in the errors. And the server that's been removed has no information in adsiedit.msc under DFS.
a_ro_no

You have to remove the DFSR information from Active Directory, not the DFS information.
The best way is to remove the server from the replication group using the DFSR Management Console.
Just to be sure...
Please use the new DFS Console which comes with R2 in order to be able to manage DFSR
ITStaffLMT

ASKER

Yes I am using the DFS management console that's part of R2. And I removed the server in all the places it was.

The point is that I've done that and yet I keep getting errors that the other servers are trying to connect to it.
DFSR replication polls its information from Active Directory... maybe your AD replication between the domain controllers does not work correctly.
ASKER CERTIFIED SOLUTION
ChiefIT

This solution is only available to members.
a_ro_no:
OK, Active Directory replication appears to be working correctly, and a check using dcdiag /c on the server with the error confirms that all tests pass.

ChiefIT:
The server that was removed from replication was not a domain controller; therefore there was no metadata to clean. This was just a member server on which we enabled DFSR to move its shares to other servers so that I could rebuild it.

**I did have an issue when I first created DFS: I tried to remove a server from the namespace in DFS Management and asked it to remove the replication as well. I traced that error message back to a Microsoft patch for that problem. Since then I have deleted the replicated folders first and then deleted the folder target.

****The same basic issue is still recurring. I removed all replicated folders and folder targets from the first server, yet ONE of the remaining servers is still trying to contact that server, for DFS replication only...
SOLUTION
This solution is only available to members.
I think AD sites and services>>default first site>>replication partners carries the AD information to replicate from one server to another. If you remove it from ADS&S, it shouldn't try to replicate its shares back and forth.
Sorry, I had a server crash yesterday and just got it rebuilt... I'll try these things either this weekend or Monday and let you know where things stand.
Chief, I don't see "AD sites and services>>default first site>>replication partners" listed. Are you talking about the NTDS settings? The server listed in the errors is only a member server, not a DC.

a_ro_no, I'm leaning toward the local DFSR DB issue, because only one server keeps trying to communicate with the server we removed. That's the server on which I had to run the following hotfix (http://support.microsoft.com/kb/953527), which deals with deleting a namespace and replicas causing orphaned objects...

As for the LDIFDE utility, I'm not quite sure how to use it or which commands/switches to run. In the meantime I've been looking at the DFSRDIAG utility, and again I'm having trouble finding details about using it.
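For reference, a rough sketch of what an LDIFDE export of the DFSR member objects could look like, written as a small Python helper that only builds and prints the command line. It assumes the DFSR configuration lives under CN=DFSR-GlobalSettings,CN=System in the domain partition and that member objects are of class msDFSR-Member with an msDFSR-ComputerReference attribute; the domain name is taken from the event text in the question, and the function names and output file name are hypothetical.

```python
def domain_to_dn(dns_domain):
    """Turn 'a.b.com' into 'DC=a,DC=b,DC=com'."""
    return ",".join(f"DC={part}" for part in dns_domain.split("."))


def ldifde_dfsr_export(dns_domain, out_file="dfsr-members.ldf"):
    """Build an ldifde export command for DFSR member objects.

    Hypothetical helper. Assumes the DFSR global settings container is
    CN=DFSR-GlobalSettings,CN=System,<domain DN>.
    """
    base_dn = f"CN=DFSR-GlobalSettings,CN=System,{domain_to_dn(dns_domain)}"
    return [
        "ldifde",
        "-f", out_file,                        # file to write the export to
        "-d", base_dn,                         # root of the LDAP search
        "-p", "subtree",                       # search scope
        "-r", "(objectClass=msDFSR-Member)",   # LDAP filter
        "-l", "msDFSR-ComputerReference",      # attributes to export
    ]


if __name__ == "__main__":
    cmd = ldifde_dfsr_export("lmtmercer.lmtproducts.com")
    # Print only; on a domain-joined Windows machine the list could be
    # passed to subprocess.run(cmd, check=True) instead.
    print(" ".join(f'"{arg}"' if " " in arg or "(" in arg else arg for arg in cmd))
```

Any entry in the resulting file whose msDFSR-ComputerReference still points at SERVER1 would be a leftover member object in AD. If nothing there references SERVER1, the stale information is more likely held locally on the server logging the errors.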

Also, in the C:\System Volume Information\DFSR\Config folder I've looked through each of the XML config files and there is no mention of the deleted file server in any of them. I'm assuming these XML files are what DFSR uses to know what it's supposed to do; if they're not telling it to do anything with the removed file server, then what is?

make sense?
OK, I finally got my VMware test domain up and deleted the DFSR DB to see if/how it rebuilt. All went well with the test.

I then did the same on the server that's been throwing the errors about the deleted server, and it appears that everything is working correctly now. It's in the process of checking replication on all the shared folders, so it will be a while before I know for sure -- keep your fingers crossed.
It turned out to be the DFSR database. Once I deleted the database on the server giving the errors, the errors went away and all replication has been good for 24 hours.

Thanks for all your help and suggestions leading me to this.