Solved

DFSR is not replicating SOME replication folders

Posted on 2013-01-19
Last Modified: 2013-02-18
I have two file servers, which we'll call fs-01 and fs-02. They have one replication group, which we'll call RG-01. Within that replication group are 25 sets of replicated folders. It's clear (from manually looking) that replication is not occurring in 3 of those 25 folder sets. Here's a dump of my troubleshooting so far:

The servers are virtual machines running Windows 2008 R2 in an ESX 5.1 environment. Underlying storage is an iSCSI SAN. The servers are at 2 different locations (HQ and Hotsite) connected by a 1GB P2P network. The network is trunked across the P2P, though fs-02 is on a different subnet and is defined as such in Sites and Services.
I inherited this system about 5 days ago. Two weeks prior to that, a network architecture change (the trunking) caused some instabilities in the underlying SAN storage system. By all accounts those have been resolved now.
When I run "Create Diagnostic Report" and create a propagation test and the subsequent report, the test files in 22 of the 25 folders are replicated nearly instantaneously (< 1 sec).
I ran the same propagation test in the three affected folders, and 3 days later they still show as "Incomplete tests", which I take to mean they haven't replicated.
If I run dfsrdiag backlog /Rmem:FS-02 /Smem:FS-01 /RGName:RG-01 /RFName:"Folder name" from the command prompt, it returns "No Backlog - member FS-02 is in sync with partner FS-01. Operation Succeeded." It says this even though the diagnostic test files still show as not yet synced in the report.
These replicated folders are GIGANTIC - 3.4 TB, 8 TB, and 1 TB, with individual file sizes sometimes as large as 80 GB, but with (relatively speaking) few files in each individual subfolder. The staging quota for these 3 volumes is set at 750 GB. All the other replicated folders have a staging quota of 10 GB.
I am ASSUMING that the files on FS-02 were in fact replicated, and not seeded, and that replication worked once upon a time for these folders. That said, I DO notice that these 3 folders all have special characters - ( and ) to be precise - in their names.
There are no errors in the application or system logs on either server for DFS-Svc, DFS Replication, DFSR, or DFSR Audit. There are a handful of info notices for routine things like "DFS Server has finished initializing"
I've looked at the logs in C:\windows\debug, and there are a LOT of them there, but nothing really sticks out as an error.
Anyone have any thoughts, or additional diagnostic tests I can run?
Question by:Eric_Price
6 Comments
 
LVL 37

Accepted Solution

by:
ArneLovius earned 1600 total points
ID: 38797510
Special characters are not an issue. The same goes for files and folders with a space at the end of the name (Mac users...), and then you have the joy of fixing them in two locations....

If you had "instabilities in the underlying SAN storage system", I'd check for disk corruption. However, with the volume sizes that you have posted, I'd test by copying the files to "something else" or "somewhere else" locally at each site, using something that logs the copy output, possibly robocopy?
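As a rough illustration of the "read everything and log what fails" idea, here is a minimal Python sketch. It is not a substitute for robocopy: it only reads each file rather than copying it, and the function name `find_unreadable_files` is made up for this example. On multi-terabyte folders a full read pass will take a long time.

```python
import os


def find_unreadable_files(root, chunk_size=1024 * 1024):
    """Read every file under root in chunks.

    Returns (files_read, failures), where failures is a list of
    (path, error message) pairs for anything that could not be
    listed or read.
    """
    files_read = 0
    failures = []

    def record_walk_error(exc):
        # os.walk skips unreadable directories silently unless told otherwise.
        failures.append((exc.filename, str(exc)))

    for dirpath, _dirnames, filenames in os.walk(root, onerror=record_walk_error):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    # Read to the end of the file and discard the data.
                    while handle.read(chunk_size):
                        pass
                files_read += 1
            except OSError as exc:
                failures.append((path, str(exc)))
    return files_read, failures
```

Calling `find_unreadable_files` on the local path of a replicated folder at each site and writing the returned failures to a log would give a first indication of whether any files are unreadable after the SAN problems.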

You could try removing one of the folders from the replication group, refreshing both ends, clearing out staging etc., and then adding it back again.

I am presuming that you don't have iSCSI going over the link, just the DFS replication.

With the value of the storage for ~13 TB, I'm going to guess that the value of the data is not low. Have you considered opening a Microsoft PSS case?
 
LVL 1

Author Comment

by:Eric_Price
ID: 38798468
It may quickly come to opening a case. I'm a week on the job here, and to be honest my past experience has been with one-way replication and the old FRS. The system seems pretty straightforward, and I'm not averse to tinkering, so long as I have good backups and I've taken the time to ask for quick assistance. No sense reinventing the wheel.

It is just the DFS replication.

Those are a couple of great suggestions. I think I'll try one of them today (Sunday) and then, based on the results, decide on Monday whether to move forward or call Microsoft. Since this replication is "only" to our hotsite, I don't feel QUITE the pressure I think I would if it were something that more directly affected day-to-day operations, but I still hate the thought of letting it go very long at all. It's almost like an invitation for disaster. lol
 
LVL 1

Author Comment

by:Eric_Price
ID: 38799256
To add some additional information, the most critical folder (labeled "Analysis (Current)") shows no backlog on fs-01, but on fs-02 (the hot site server) dfsrdiag reports a backlog of over 900,000 files in that folder. It has clearly been replicating some of them successfully over the past couple of weeks, because those conflicts on fs-01 end up in the conflicts/deleted folder.
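Because the backlog only showed up when checked from the other side, it may be worth running the check in both directions for each suspect folder. A minimal Python sketch follows, assuming dfsrdiag is on the PATH and using only the switches quoted earlier in this thread; the helper names are made up for illustration, and how the folder name is quoted when passed this way has not been tested.

```python
import subprocess


def backlog_command(sending, receiving, group, folder):
    """Build the dfsrdiag backlog argument list for one direction."""
    return [
        "dfsrdiag", "backlog",
        f"/Rmem:{receiving}",
        f"/Smem:{sending}",
        f"/RGName:{group}",
        f"/RFName:{folder}",
    ]


def both_directions(member_a, member_b, group, folder):
    """Return the two commands: a -> b first, then b -> a."""
    return [
        backlog_command(member_a, member_b, group, folder),
        backlog_command(member_b, member_a, group, folder),
    ]


def run_backlog_checks(member_a, member_b, group, folder):
    """Run both checks and return (command, output text) pairs."""
    results = []
    for command in both_directions(member_a, member_b, group, folder):
        completed = subprocess.run(command, capture_output=True, text=True)
        results.append((command, completed.stdout))
    return results
```

For example, `run_backlog_checks("FS-01", "FS-02", "RG-01", "Analysis (Current)")` would report the backlog with FS-01 as the sender and then with FS-02 as the sender.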

The files on fs-01 are the only ones that will ever be modified. They're the ones I want to keep at all costs.

PS - I have no good backup, since my predecessor has all the backups going over the P2P link (AppAssure). It worked great as long as it was happy making incremental backups, but now it seems to want a new base, and as you can imagine one can't simply make a new base of 14 TB over a 1GB P2P link.

Given the sensitivity, a call to MS is in order methinks. And a stiff drink and a backup.

 
LVL 37

Assisted Solution

by:ArneLovius
ArneLovius earned 1600 total points
ID: 38799674
Nothing wrong with incremental backups that become synthetic full backups (such as Microsoft DPM, or even rsync with hardlinks to existing files, or snapshots); just incremental backups, though, is a different kettle of fish...

Presuming you have at least 10Gb Ethernet at each site, I'd be very tempted to order, first thing on Monday, a box that can take a "quantity" of "inexpensive" disks and set up a backup server at the remote site that you can start ASAP and then bring back to the main site/redeploy when complete...

If you're just backing up files, then using rsync to a *nix box using ZFS to do daily snapshots can be an inexpensive, simple solution. robocopy in threaded mode can be faster in a LAN environment if you're just comparing modification times, but robocopy only copies whole files, so if you have large files that only have small changes, rsync can be significantly faster...

Anyway, call Microsoft PSS, pay the ~£200 and open a case; they are open 24/7.
 
LVL 26

Assisted Solution

by:Pber
Pber earned 400 total points
ID: 38803252
It sucks the logs aren't helping you out.

Have you seen this article: http://blogs.technet.com/b/askds/archive/2011/07/13/how-to-determine-the-minimum-staging-area-dfsr-needs-for-a-replicated-folder.aspx

Is your staging area larger than 32 of the largest files in the folders in question?
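One way to answer that question is to total the sizes of the 32 largest files under each affected folder and compare the result with the staging quota. The Python sketch below does this; the function names are made up for this example, and the path passed in would be the local path of the replicated folder.

```python
import heapq
import os


def largest_file_sizes(root, count=32):
    """Return the sizes in bytes of the `count` largest files under root."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                sizes.append(os.path.getsize(os.path.join(dirpath, name)))
            except OSError:
                # Skip files that vanish or cannot be examined mid-scan.
                continue
    return heapq.nlargest(count, sizes)


def minimum_staging_bytes(root, count=32):
    """Sum the sizes of the `count` largest files under root."""
    return sum(largest_file_sizes(root, count))
```

As a purely hypothetical upper bound: if the 32 largest files in one folder were each close to the 80 GB mentioned in the question, the total would be around 2,560 GB, well above the 750 GB quota. The real figure depends on the actual file sizes, which is what the scan would show.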
 
LVL 1

Author Closing Comment

by:Eric_Price
ID: 38903144
I'm closing it out and spreading the wealth on the points. Thanks for the assist, guys.