Distributed File System Is Stuck

I am running DFS in my environment and I have run up against a problem.  I have four servers: WSCALFS, FARM1, FARM2, and FARM3.  

WSCALFS is not working properly anymore. It reports the following error for volumes G:, N:, and U::

The DFS Replication service stopped replication on volume G:. This occurs when a DFSR JET database is not shut down cleanly and Auto Recovery is disabled. To resolve this issue, back up the files in the affected replicated folders, and then use the ResumeReplication WMI method to resume replication.

The DFS Replication service stopped replication on volume N:. This occurs when a DFSR JET database is not shut down cleanly and Auto Recovery is disabled. To resolve this issue, back up the files in the affected replicated folders, and then use the ResumeReplication WMI method to resume replication.

The DFS Replication service stopped replication on volume U:. This occurs when a DFSR JET database is not shut down cleanly and Auto Recovery is disabled. To resolve this issue, back up the files in the affected replicated folders, and then use the ResumeReplication WMI method to resume replication.

The biggest problem with this is that some people have been working off WSCALFS and some people have been working off the shared replication server.  

I found an article about recovering from this problem, making the crashed server the primary server, but I am not sure that that is what I want to do.  https://support.microsoft.com/en-us/help/961879/how-to-recover-from-a-dfsr-database-crash-on-designated-primary-member

Any advice?
Asked by: aclaus225

Philip Elder (Technical Architect - HA/Compute/Storage) commented:
Make sure to back up the servers and their data.

Once that is done, get DFS going on the primary server, then connect the rest of the servers.
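Getting replication going again on the stopped server means calling the ResumeReplication WMI method named in the event text, once per affected volume. A minimal sketch of building that command line follows. It only assembles the string and does not run anything. The namespace and the dfsrVolumeConfig class name are assumptions based on the command that usually accompanies this event in the DFS Replication log, and the GUIDs are placeholders, so check both against the events on WSCALFS before running anything.

```python
def resume_replication_command(volume_guid):
    """Build a wmic command line that calls ResumeReplication for one volume.

    Assumption: namespace and class name match the command shown with the
    event in the DFS Replication log on the affected server.
    """
    return (
        r"wmic /namespace:\\root\microsoftdfs path dfsrVolumeConfig "
        f'where volumeGuid="{volume_guid}" call ResumeReplication'
    )


# Placeholder GUIDs: the real value for each volume is listed in its event.
stopped_volumes = {
    "G:": "<volume GUID for G:>",
    "N:": "<volume GUID for N:>",
    "U:": "<volume GUID for U:>",
}

for volume, guid in stopped_volumes.items():
    print(volume, resume_replication_command(guid))
```

Take the backups first, as the event text itself says, and run the resulting commands from an elevated prompt on the affected server.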

Now, we use ShadowProtect for our primary backup method. So, once DFSR is up and running and replication shows as complete, I would mount the FARM site backup and compare the folder on the server with the backup made prior to getting DFSR going. The utility we use for that is Beyond Compare from Scooter Software (www.scootersoftware.com).

Set view to Orphan & Newer for the backup side.

If DFSR did its job correctly and left the newer files as authoritative, then all is okay. If the newer files were overwritten, copy them back from the backup.
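For anyone without Beyond Compare, the same "orphan and newer on the backup side" check can be roughed out in a few lines of Python. This is only a sketch, not a substitute for the tool: the function names are made up for illustration, it compares modification times only, and the two-second tolerance is an assumption to absorb timestamp rounding between file systems.

```python
from pathlib import Path


def scan(root):
    """Map each file's path (relative to root) to its modification time."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): p.stat().st_mtime
        for p in root.rglob("*")
        if p.is_file()
    }


def orphans_and_newer(backup_root, live_root, tolerance=2.0):
    """Return (orphans, newer) as seen from the backup side.

    orphans: files present in the backup but missing from the live folder.
    newer:   files whose backup copy is newer than the live copy.
    """
    backup, live = scan(backup_root), scan(live_root)
    orphans = sorted(f for f in backup if f not in live)
    newer = sorted(
        f for f in backup if f in live and backup[f] - live[f] > tolerance
    )
    return orphans, newer
```

Called with the mounted backup as the first argument and the replicated folder as the second, both lists should come back empty if DFSR kept the newer files. Anything listed is a candidate to copy back from the backup.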

kevinhsieh commented:
You shouldn't let the same file be writeable on different servers. That can lead to version conflicts even under normal circumstances.

What I do is only let one DFS Namespace target be online at a time if multiple users might be writing to an area. That prevents the problem where more than 1 server might have updated files.

You are going to need to do a file audit and see if any files on WSCALFS have been modified since the crash that caused DFS Replication to stop. Those files then need to be checked against the other servers. If the other servers have copies that were also modified since then but are different, then changes were made on both sides of the replication and you will have to manually resolve those conflicts. Any files from WSCALFS that are newer than the copies on the other servers and are the correct versions should be copied over to the other servers before you restart replication.
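The audit described above could be sketched like this in Python. It is an illustration under assumptions, not a tested recovery tool: the function names are hypothetical, crash_time is a Unix timestamp you would have to work out from the event log, and "different" is decided by comparing SHA-256 hashes. The DfsrPrivate folder that DFSR keeps inside each replicated folder is skipped.

```python
import hashlib
from pathlib import Path


def digest(path):
    """SHA-256 of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def audit(stopped_root, partner_root, crash_time):
    """Classify files changed on the stopped server since crash_time.

    to_copy:   changed only on the stopped server (or missing on the partner).
    conflicts: changed on both sides with different content; resolve by hand.
    """
    stopped_root, partner_root = Path(stopped_root), Path(partner_root)
    to_copy, conflicts = [], []
    for src in stopped_root.rglob("*"):
        rel = src.relative_to(stopped_root)
        if "DfsrPrivate" in rel.parts or not src.is_file():
            continue
        if src.stat().st_mtime <= crash_time:
            continue  # untouched since replication stopped
        other = partner_root / rel
        if other.is_file() and other.stat().st_mtime > crash_time:
            if digest(src) != digest(other):
                conflicts.append(rel.as_posix())
            # identical content on both sides needs no action
        else:
            to_copy.append(rel.as_posix())
    return sorted(to_copy), sorted(conflicts)
```

Run it once per partner server, with the WSCALFS copy of the replicated folder as the first argument. The to_copy list still needs a human check that those really are the correct versions before anything is copied.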
arnold commented:
I'd echo the above. Make sure you have backups from all DFS targets/replication members.
When a conflict is detected, the losing copy of the file is stored in the DfsrPrivate\ConflictAndDeleted folder.
There will be an event in the DFS Replication event log indicating that there was a conflict and the file was saved ....

When replication resumes or is re-established, DFS-R can use remote differential compression (RDC) to send only the changed portions of files.

If using DFS Management, after you install the hotfix/update that is supposed to fix this issue and create the backups on all targets, trigger the sync on the connection of ..
If you are concerned, you could also use DFS Management to generate a report using this system as the reference against the others. This way you can identify the files, and the number of files, that this one thinks need to be replicated out.
Windows Server 2012

Question has a verified solution.
