Solved

File Replication is failing between domain controllers

Posted on 2011-09-22
Last Modified: 2012-05-12
I just found out that my FRS between my DC's is failing every night.  See attached.  Where do I begin to fix this?
file-replication.txt
Question by:jrsitman
16 Comments
 
LVL 10

Expert Comment

by:abhijitwaikar
ID: 36583477
There could be many reasons for the File Replication Service to experience problems replicating.

check this: http://www.eventid.net/display.asp?eventid=13508&eventno=349&source=ntfrs&phase=1
 
LVL 9

Expert Comment

by:Lester_Clayton
ID: 36583519
Seems like you have DNS issues.  Can you just verify that both of your domain controllers can talk to a valid DNS server, and that this DNS server is a Domain Controller in the same domain?

Run some NSLOOKUP tests on both servers to ensure each one can resolve the other.

FRS needs to do SRV record lookups, so if both servers point to a third-party DNS server that does not hold the domain's SRV records, it's not going to work.
 
LVL 41

Expert Comment

by:Amit
ID: 36583525
Do you see FRS event ID 13509? If you do, replication has recovered and you don't need to worry. Otherwise, stop the File Replication Service and start it again. Also run repadmin /replsum and check the result.
 

Author Comment

by:jrsitman
ID: 36583547
@Lester Clayton.  I'm not a DNS expert or even close.  How do I test whether they can talk to each other?  Details please.  Both my DCs are DNS servers.
If there is anything else you want me to do, please send detailed steps.  I'd really appreciate it.
 
LVL 10

Expert Comment

by:abhijitwaikar
ID: 36583567
Just run dcdiag /q, netdiag /q, repadmin /replsum and ipconfig /all on both servers and post the results.

Also check whether there is a 13568 error event on either server.
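
If it helps to gather everything in one pass, here is a minimal Python sketch that runs the four commands listed above and collects their output. It assumes Python is installed on the server, which the thread does not say; the function name collect_diagnostics and the "not available" message are made up for illustration. Nothing runs until you call the function.

```python
import subprocess

# The commands named in the comment above. netdiag is not available on
# Windows Server 2008, so a missing command is reported instead of failing.
DIAGNOSTIC_COMMANDS = [
    ["dcdiag", "/q"],
    ["netdiag", "/q"],
    ["repadmin", "/replsum"],
    ["ipconfig", "/all"],
]


def collect_diagnostics(run=subprocess.run):
    """Run each diagnostic command and return {command line: combined output}."""
    report = {}
    for command in DIAGNOSTIC_COMMANDS:
        label = " ".join(command)
        try:
            finished = run(command, capture_output=True, text=True)
            report[label] = finished.stdout + finished.stderr
        except FileNotFoundError:
            # The executable does not exist on this machine.
            report[label] = "command not available on this server"
    return report
```

Calling collect_diagnostics() on each DC and printing the returned dictionary gives one block of text per command that can be pasted into a reply.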

 
LVL 9

Assisted Solution

by:Lester_Clayton
Lester_Clayton earned 200 total points
ID: 36583588

First Test:


Open a command prompt and type the following:

NSLOOKUP laspca.corp

It should reply with a list of IP addresses. These should be your domain controllers.

Second Test:


Now try the following:

NSLOOKUP SPCALA16.laspca.corp

and

NSLOOKUP SPCALA20.laspca.corp

Do all the above tests from both Domain Controllers.

Expected Responses:


This is what a good response for the first test looks like:

C:\Users\lclayton>nslookup mgmt.local
Server:  mgmt01.mgmt.local
Address:  10.110.176.11

Name:    mgmt.local
Addresses:  10.110.176.12
          10.110.176.11

This is what a good response for the second test looks like:


C:\Users\lclayton>nslookup mgmt01.mgmt.local
Server:  mgmt01.mgmt.local
Address:  10.110.176.11

Name:    mgmt01.mgmt.local
Address:  10.110.176.11

This is what a bad response looks like:

C:\Users\lclayton>nslookup mgmt31.mgmt.local
Server:  mgmt01.mgmt.local
Address:  10.110.176.11

*** mgmt01.mgmt.local can't find mgmt31.mgmt.local: Non-existent domain
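
For anyone who wants to check many names at once, the good/bad distinction above can be sketched in Python. This is only an illustration based on the sample outputs in this comment: it treats any line starting with "***" as a failure and requires a "Name:" line followed by an address line for success. Real nslookup output has other shapes (timeouts, for example) that this does not cover, and the function name lookup_succeeded is made up.

```python
def lookup_succeeded(nslookup_output: str) -> bool:
    """Return True when nslookup output shows a resolved name with an address."""
    lines = [line.strip() for line in nslookup_output.splitlines()]

    # Error lines in the sample output start with "***".
    if any(line.startswith("***") for line in lines):
        return False

    # The first Server/Address pair only identifies the DNS server that answered.
    # A real answer adds a "Name:" line followed by "Address:" or "Addresses:".
    for index, line in enumerate(lines):
        if line.startswith("Name:"):
            remaining = lines[index + 1:]
            return any(
                entry.startswith(("Address:", "Addresses:")) for entry in remaining
            )
    return False
```

Feed it the text captured from each NSLOOKUP command; a False result for either domain controller's name points back at the DNS configuration.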
 

Author Comment

by:jrsitman
ID: 36583682
Here are the results.  The first four are from the 2003 DC, and the fifth through seventh are from the 2008 DC.  On the 2008 server, netdiag was reported as an invalid command.

There is no 13568 event on the 2003 server, but there is one on the 2008 server.

frs1.png
frs2.png
frs3.png
frs4.png
frs16a.png
frs16b.png
frs16c.png
 

Author Comment

by:jrsitman
ID: 36583745
Here are the NSLOOKUP results.  Not good.  The first three are from the 2008 server; the rest are from the 2003 server.
nslookup.png
 
LVL 10

Accepted Solution

by:abhijitwaikar
abhijitwaikar earned 300 total points
ID: 36583759
I forgot to mention that netdiag is no longer available on 2008, so that is OK.

Also, the configuration and the DCDIAG and IPCONFIG reports look fine.

Event 13568 on the 2008 server means its replica set is in a journal wrap state. To resolve this, perform a D4 and a D2 restore.

D4 should be done on the healthy DC (the 2003 server) and D2 on the 2008 server, since that one has the 13568 error. Perform D4 first and then D2.

Steps:
D4, also known as an authoritative restore:
To complete an authoritative restore, stop the FRS service, configure the BurFlags registry key, and then restart the FRS service.
To do so:
1. Click Start, and then click Run.
2. In the Open box, type cmd and then press ENTER.
3. In the Command box, type net stop ntfrs.
4. Click Start, and then click Run.
5. In the Open box, type regedit and then press ENTER.
6. Locate the following subkey in the registry:
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\NtFrs\Parameters\Backup/Restore\Process at Startup
7. In the right pane, double-click BurFlags.
8. In the Edit DWORD Value dialog box, type D4 and then click OK.
9. Quit Registry Editor, and then switch to the Command box.
10. In the Command box, type net start ntfrs.
11. Quit the Command box.
When the FRS service is restarted, the following actions occur:
• The value for the BurFlags registry key is set back to 0.
• An event 13566 is logged to signal that an authoritative restore is started.
• Files in the reinitialized FRS replicated directories remain unchanged and become authoritative on direct replication partners. Additionally, the files become authoritative on indirect replication partners through transitive replication.
• The FRS database is rebuilt based on current file inventory.
• When the process is complete, an event 13516 is logged to signal that FRS is operational. If the event is not logged, there is a problem with the FRS configuration.


D2, also known as a nonauthoritative restore:
To perform a nonauthoritative restore, stop the FRS service, configure the BurFlags registry key, and then restart the FRS service. To do so:
1. Click Start, and then click Run.
2. In the Open box, type cmd and then press ENTER.
3. In the Command box, type net stop ntfrs.
4. Click Start, and then click Run.
5. In the Open box, type regedit and then press ENTER.
6. Locate the following subkey in the registry:
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\NtFrs\Parameters\Backup/Restore\Process at Startup
7. In the right pane, double-click BurFlags.
8. In the Edit DWORD Value dialog box, type D2 and then click OK.
9. Quit Registry Editor, and then switch to the Command box.
10. In the Command box, type net start ntfrs.
11. Quit the Command box.
When the FRS service restarts, the following actions occur:
• The value for the BurFlags registry key returns to 0.
• Files in the reinitialized FRS folders are moved to a Pre-existing folder.
• An event 13565 is logged to signal that a nonauthoritative restore is started.
• The FRS database is rebuilt.
• The member performs an initial join of the replica set from an upstream partner or from the computer that is specified in the Replica Set Parent registry key if a parent has been specified for SYSVOL replica sets.
• The reinitialized computer runs a full replication of the affected replica sets when the relevant replication schedule begins.
• When the process is complete, an event 13516 is logged to signal that FRS is operational. If the event is not logged, there is a problem with the FRS configuration.
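
The D4 and D2 steps above can also be written down as three commands per server. The Python sketch below only builds those command lines; it does not run them. It swaps the regedit steps for the reg add command, which is a substitution and not part of the KB procedure, and it passes the BurFlags value in decimal (hex D4 is 212, hex D2 is 210). The names BURFLAGS_KEY and burflags_commands are made up for illustration.

```python
from typing import List

# Registry subkey from step 6 above, using the HKLM abbreviation that reg.exe accepts.
BURFLAGS_KEY = (
    r"HKLM\System\CurrentControlSet\Services\NtFrs\Parameters"
    r"\Backup/Restore\Process at Startup"
)

# BurFlags values from step 8: D4 = authoritative, D2 = nonauthoritative.
RESTORE_MODES = {"D4": 0xD4, "D2": 0xD2}


def burflags_commands(mode: str) -> List[List[str]]:
    """Return the stop / set BurFlags / start command lines for a D4 or D2 restore."""
    if mode not in RESTORE_MODES:
        raise ValueError("mode must be 'D4' or 'D2', got %r" % (mode,))
    value = str(RESTORE_MODES[mode])  # decimal form of the hex value
    return [
        ["net", "stop", "ntfrs"],
        ["reg", "add", BURFLAGS_KEY, "/v", "BurFlags",
         "/t", "REG_DWORD", "/d", value, "/f"],
        ["net", "start", "ntfrs"],
    ]
```

Printing burflags_commands("D4") for the healthy 2003 DC and burflags_commands("D2") for the 2008 DC shows exactly what would be typed, in the same order as the manual steps. Take the backup first either way.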

If the steps are unclear, use the KB article below for the D2/D4 process: http://support.microsoft.com/kb/290762
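
To keep track of which event means what after the restart, here is a small Python lookup built only from the event IDs mentioned in this thread. It is a rough sketch: it assumes you pass in the FRS event IDs logged since the service was restarted, and the status strings and the name restore_status are made up for illustration.

```python
# FRS event IDs mentioned in this thread and what they signal.
FRS_EVENTS = {
    13508: "FRS is having trouble replicating with a partner",
    13509: "replication was enabled again after repeated retries",
    13516: "FRS is operational",
    13565: "nonauthoritative (D2) restore started",
    13566: "authoritative (D4) restore started",
    13568: "replica set is in journal wrap state",
}


def restore_status(event_ids):
    """Summarize a restore from the FRS event IDs logged since the restart."""
    seen = set(event_ids)
    if 13516 in seen:
        return "complete"
    if seen & {13565, 13566}:
        return "in progress"
    if 13568 in seen:
        return "journal wrap, restore needed"
    return "unknown"
```

For example, seeing 13565 without 13516 on the D2 server means the restore has started but has not finished yet.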
 
LVL 10

Expert Comment

by:abhijitwaikar
ID: 36583771
To be on the safe side, take a system state backup or a backup of the %systemroot%\SYSVOL folder before performing any of the steps provided.
 

Author Comment

by:jrsitman
ID: 36583814
Can I just copy the folder to another location?
 

Author Comment

by:jrsitman
ID: 36583981
I'm backing them up with Arcserve
 
LVL 10

Expert Comment

by:abhijitwaikar
ID: 36584006
Backing up and copying the folder are both valid options.
 

Author Comment

by:jrsitman
ID: 36584034
thanks, I'll do both
 

Author Comment

by:jrsitman
ID: 36584486
It failed on the 2008 server because of disk space.  I cleared up space and now I'm getting the 13516 event.  How do I test that this is actually fixed?

And thanks for the "simple" instructions
 

Author Closing Comment

by:jrsitman
ID: 36584828
Thanks to all.  The program that was getting the FRS is now working so all is good.  I love Experts-Exchange
