DFSR - The DFS Replication service failed to communicate with partner

We recently removed our last W2K3 R2 DC and now have four W2K8 DCs across two sites. The domain is still running at the 2003 forest and domain functional levels.

We use DFSR between two sites with two root servers, file1 and file2, both running W2K8. The namespace type is 'Domain (Windows 2000 Server mode)'. DFS was originally running on Windows 2003 R2. I moved it to the new hosting servers running W2K8 R2 and updated the root servers. Everything had been working fine for weeks until yesterday, when I removed the last W2K3 DC, which also hosted WINS. I had already set up WINS on two of the four new DCs (one at each site), but since removing WINS from the old W2K3 DC I am getting event errors on file2 at site 2 (only) and replication is failing.

=============================================================
Log Name:      DFS Replication
Source:        DFSR
Date:          03/05/2012 13:12:24
Event ID:      5008
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      FILE2.domain.local
Description:
The DFS Replication service failed to communicate with partner FILE2 for replication group kgfruits.local\dfs\wares data. This error can occur if the host is unreachable, or if the DFS Replication service is not running on the server.
 
Partner DNS Address: FILE2.<domain>
 
Optional data if available:
Partner WINS Address: FILE2
Partner IP Address: IP
 
The service will retry the connection periodically.
 
Additional Information:
Error: 1722 (The RPC server is unavailable.)
Connection ID: 130A1D31-33DD-4115-85E4-43684AB5C093
Replication Group ID: 8F7A93B0-55AA-4C26-858B-A7994584FA47
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="DFSR" />
    <EventID Qualifiers="49152">5008</EventID>
    <Level>2</Level>
    <Task>0</Task>
    <Keywords>0x80000000000000</Keywords>
    <TimeCreated SystemTime="2012-05-03T12:12:24.000000000Z" />
    <EventRecordID>3108</EventRecordID>
    <Channel>DFS Replication</Channel>
    <Computer>FILE2.domain.local</Computer>
    <Security />
  </System>
  <EventData>
    <Data>130A1D31-33DD-4115-85E4-43684AB5C093</Data>
    <Data>FILE2</Data>
    <Data>DFS DATA PATH</Data>
    <Data>FILE2.domain.local</Data>
    <Data>FILE2</Data>
    <Data>IP</Data>
    <Data>1722</Data>
    <Data>The RPC server is unavailable.</Data>
    <Data>8F7A93B0-55AA-4C26-858B-A7994584FA47</Data>
  </EventData>
</Event>

===========================================================

* DFSDIAG tests all look fine
(http://blogs.technet.com/b/josebda/archive/2009/07/15/five-ways-to-check-your-dfs-namespaces-dfs-n-configuration-with-the-dfsdiag-exe-tool.aspx)

* Topology verifies successfully in DFS Management.

* Diagnostic report from FILE1 shows FILE2 unavailable for reporting with 'Cannot connect to reporting DCOM server - The RPC server is unavailable.' and sends you here: http://support.microsoft.com/Default.aspx?kbid=839880

* Diagnostic report from FILE2 (the problem server) just hangs.

* DCDIAG /V - everything is working fine on all four DCs
* DCDIAG /TEST:DNS - everything is working fine on all four DCs

* Have performed NBTStat -AA on all servers to refresh the WINS db

Still not working :(

I suppose I could migrate the namespace (http://technet.microsoft.com/en-us/library/cc753875.aspx), but that would be a pain, as it causes a big problem and seems a bit drastic!

 thanks ;)
BerryGardens Asked:

Prashant Girennavar Commented:
An "RPC server is unavailable" error is usually caused by blocked network ports or a DNS misconfiguration.

Make sure the local Windows Firewall is off, try temporarily disabling the antivirus, and check again.

Additionally, please check the links below, which explain the ports required for replication through a firewall.

http://social.technet.microsoft.com/wiki/contents/articles/active-directory-replication-over-firewalls.aspx

http://blogs.technet.com/b/janelewis/archive/2006/11/13/ports-used-in-active-directory-replication.aspx

Additionally, you can use the PortQry tool to check the firewall ports. You can download it from the link below.

http://www.microsoft.com/download/en/details.aspx?id=17148

Using PortQry for Troubleshooting.

http://blogs.technet.com/b/askds/archive/2009/01/22/using-portqry-for-troubleshooting.aspx
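Where PortQry is not to hand, a rough equivalent of its basic TCP check can be sketched in Python. This is only an illustration: the function name `tcp_port_open` is made up for this example, and it tests nothing more than whether a TCP connection to the given port succeeds. Port 135 is the RPC endpoint mapper; DFS Replication also uses a separate RPC port of its own, which this sketch does not discover or test.

```python
import socket
import sys


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


if __name__ == "__main__" and len(sys.argv) > 1:
    # The target host is supplied on the command line; nothing is hard-coded.
    target = sys.argv[1]
    state = "reachable" if tcp_port_open(target, 135) else "NOT reachable"
    print(f"{target}: TCP 135 (RPC endpoint mapper) is {state}")
```

Run it from each replication partner against the other, passing the partner's name as the only argument. A False result does not say whether the cause is a firewall, a stopped service, or a name resolving to the wrong address, so it is best read alongside the DNS checks below.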

Also, is your DNS in place? Check for DNS misconfiguration.

Refer to the article below to understand this.

http://support.microsoft.com/kb/321046

Please refer to the link below, which discusses the same problem:
http://www.experts-exchange.com/OS/Microsoft_Operating_Systems/Server/2003_Server/Q_22492765.html#a18853724

Hope this information helps

Regards,

_Prashant_
BerryGardens (Author) Commented:
Before posting, I had already followed this article, which opens port 135: http://technet.microsoft.com/en-us/library/cc774368(WS.10).aspx. It made no change, so I switched off the firewall for the past few hours on both file servers and the DCs, but again no change.

I haven't tried disabling AV - will give that a go!

DNS is in place and working fine, with no errors reported. If I flush the DNS cache and run displaydns to check the new records, everything looks correct. Even the WINS records are correct.

Will report back shortly. Thanks.
BerryGardens (Author) Commented:
Disabled AV, but no change after stopping and starting DFS Replication; the same events are being logged.

I ran a propagation test/report, as it is only affecting 8 of 11 DFS replication groups. Here is the outcome:

Error: Member FILE2.domain : Cannot read values from WMI provider on the member. WMI query Select * from DfsrIdRecordInfo where ReplicatedFolderGuid = "F20AD150-1DD3-4AE2-B021-6DD9F540A13E" AND Fid = 2814749767405790 failed. Error code 0x800706ba.

Does this shed any light?
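One thing the WMI error does show: 0x800706ba is the same failure as the "Error: 1722" in event 5008, wrapped as an HRESULT. The split below follows the standard HRESULT layout (severity bit, 11-bit facility, 16-bit code); the helper name `decode_hresult` is made up for this illustration.

```python
def decode_hresult(hresult):
    """Split a 32-bit HRESULT into (severity, facility, code)."""
    severity = (hresult >> 31) & 0x1    # 1 means failure
    facility = (hresult >> 16) & 0x7FF  # 7 is FACILITY_WIN32
    code = hresult & 0xFFFF             # the underlying Win32 error code
    return severity, facility, code


severity, facility, code = decode_hresult(0x800706BA)
print(severity, facility, code)  # 1 7 1722
```

Facility 7 with code 1722 is the Win32 error "The RPC server is unavailable", so the propagation report is hitting the same RPC connectivity problem as the event log rather than a separate WMI fault.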
BerryGardens (Author) Commented:
Problem resolved: there was a random WINS address in DNS on one of the file servers. Removed it and everything is now working fine. Hate WINS!

Time to implement global names and migrate this DFS to 2008!!!
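For anyone hitting the same symptom, a stray address like this can be spotted by comparing what a server name actually resolves to against the addresses it is supposed to have. The sketch below assumes the problem shows up as an extra address returned by the local resolver, which the post does not spell out; the function names and any addresses passed in are hypothetical.

```python
import socket
import sys


def resolve_ipv4(hostname):
    """Return the sorted IPv4 addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def unexpected_addresses(resolved, expected):
    """Return the resolved addresses that are not in the expected list."""
    return sorted(set(resolved) - set(expected))


if __name__ == "__main__" and len(sys.argv) > 2:
    # Usage: script.py <hostname> <expected-ip> [<expected-ip> ...]
    name, expected = sys.argv[1], sys.argv[2:]
    extras = unexpected_addresses(resolve_ipv4(name), expected)
    print(f"{name}: unexpected addresses: {extras or 'none'}")
```

Running it on each file server and DC against the partner's name can show whether only one machine sees the odd address, which would fit a fault that appears on FILE2 alone.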

BerryGardens (Author) Commented:
Solved issue myself.