Solved

DFSR is not replicating SOME replication folders

Posted on 2013-01-19
Last Modified: 2013-02-18
I have two file servers, which we'll call fs-01 and fs-02. They have one replication group, which we'll call RG-01. Within that replication group are 25 replicated folders. It's clear (from manually looking) that replication is not occurring in 3 of those 25 folders. Here's a dump of my troubleshooting so far:

The servers are virtual machines running Windows 2008 R2 in an ESX 5.1 environment. Underlying storage is an iSCSI SAN. The servers are at 2 different locations (HQ and Hotsite) connected by a 1GB P2P network. The network is trunked across the P2P, though fs-02 is on a different subnet and is defined as such in Sites and Services.
I inherited this system about 5 days ago. Two weeks prior to that, a network architecture change (the trunking) caused some instabilities in the underlying SAN storage system. By all accounts those have been resolved now.
When I run "Create Diagnostic Report" and create a propagation test and subsequent report, the test files in 22 of the 25 folders are replicated nearly instantaneously (< 1 sec).
I ran the same propagation test in the three affected folders, and 3 days later they still show as "Incomplete tests", which I take to mean they haven't replicated.
If I run dfsrdiag backlog /Rmem:FS-02 /Smem:FS-01 /RGName:RG-01 /RFName:"Folder name" from the command prompt, it returns "No Backlog - member FS-02 is in sync with partner FS-01. Operation Succeeded." It says this even though the diagnostic test files still show as not yet synced in the report.
These replicated folders are GIGANTIC: 3.4 TB, 8 TB, and 1 TB, with individual file sizes sometimes as large as 80 GB, but with (relatively speaking) few files in each individual subfolder. The staging quota for these 3 folders is set at 750 GB. All the other replicated folders have a staging quota of 10 GB.
I am ASSUMING that the files on FS-02 were in fact replicated, and not seeded, and that replication worked once upon a time for these folders. That said, I DO notice that these 3 folders all have special characters, ( and ) to be precise, in their names.
There are no errors in the application or system logs on either server for DFS-Svc, DFS Replication, DFSR, or DFSR Audit. There are a handful of info notices for routine things like "DFS Server has finished initializing".
I've looked at the logs in C:\windows\debug, and there are a LOT of them there, but nothing really sticks out as an error.
Anyone have any thoughts, or additional diagnostic tests I can run?
Question by:Eric_Price
6 Comments
 
LVL 37

Accepted Solution

by:
ArneLovius earned 400 total points
ID: 38797510
Special characters are not an issue. The same goes for files and folders with a space at the end of the name (Mac users...), although then you have the joy of fixing them in two locations...

If you had "instabilities in the underlying SAN storage system", I'd check for disk corruption. However, with the volume sizes that you have posted, I'd test by copying the files to "something else" or "somewhere else" locally at each site, using something that logs the copy output, possibly robocopy?

You could try removing one of the folders from the replication group, refreshing both ends, clearing out staging etc., and then adding it back again.

I am presuming that you don't have iSCSI going over the link, just the DFS replication.

With the value of the storage for ~13TB, I'm going to guess that the value of the data is not low. Have you considered opening a Microsoft PSS case?
 
LVL 1

Author Comment

by:Eric_Price
ID: 38798468
It may quickly come to opening a case. I'm a week on the job here, and to be honest my past experience has been with one-way replication and the old FRS. The system seems pretty straightforward, and I'm not averse to tinkering, so long as I have good backups and I've taken the time to ask for quick assistance. No sense reinventing the wheel.

It is just the DFS replication.

Those are a couple of great suggestions. I think I'll try one of them today (Sunday) and then, based on the results, decide on Monday whether to move forward or call Microsoft. Since this replication is "only" to our hotsite, I don't feel QUITE the pressure I think I would if it were something that more directly affected day-to-day operations, but I still hate the thought of letting it go very long at all. It's almost like an invitation for disaster. lol
 
LVL 1

Author Comment

by:Eric_Price
ID: 38799256
To add some additional information, the most critical folder (labeled "Analysis (Current)") shows no backlog on fs-01, but on fs-02 (the hot site server) dfsrdiag reports that there is a backlog of over 900,000 files in that folder. It has clearly been successful replicating some over the past couple of weeks, because those conflicts on fs-01 end up in the ConflictAndDeleted folder.
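Since the backlog differs depending on which server is the sender, it helps to query both directions every time. Here is a minimal sketch in Python that builds the dfsrdiag backlog command from the original post for each direction. The helper names `backlog_command` and `both_directions` are made up for illustration; the flags are the ones already used above.

```python
import subprocess


def backlog_command(rg_name, rf_name, sending, receiving):
    """Build the dfsrdiag backlog argument list for one direction."""
    return [
        "dfsrdiag", "backlog",
        f"/RGName:{rg_name}",
        f"/RFName:{rf_name}",
        f"/SMem:{sending}",
        f"/RMem:{receiving}",
    ]


def both_directions(rg_name, rf_name, member_a, member_b):
    """Return the commands for a -> b and for b -> a."""
    return [
        backlog_command(rg_name, rf_name, member_a, member_b),
        backlog_command(rg_name, rf_name, member_b, member_a),
    ]


if __name__ == "__main__":
    for cmd in both_directions("RG-01", "Analysis (Current)", "FS-01", "FS-02"):
        # Print the command line; list2cmdline quotes the folder name's space.
        print(subprocess.list2cmdline(cmd))
        # On a server with the DFSR tools installed it could be run with:
        # subprocess.run(cmd, check=False)
```

Passing the arguments as a list avoids hand-quoting a folder name that contains spaces and parentheses.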

The files on fs-01 are the only ones that will ever be modified. They're the ones I want to keep at all costs.

PS - I have no good backup, since my predecessor has all the backups going over the P2P link (AppAssure). It worked great as long as it was happy making incremental backups, but now it seems to want a new base, and as you can imagine one can't simply make a new base of 14 TB over a 1GB P2P link.

Given the sensitivity, a call to MS is in order methinks. And a stiff drink and a backup.

 
LVL 37

Assisted Solution

by:ArneLovius
ArneLovius earned 400 total points
ID: 38799674
Nothing wrong with incremental backups that become synthetic full backups (such as Microsoft DPM, or even rsync with hardlinks to existing files, or snapshots). Just incremental backups, though, is a different kettle of fish...

Presuming you have at least 10Gb Ethernet at each site, I'd be very tempted to order, first thing on Monday, a box that can take a "quantity" of "inexpensive" disks, and set up a backup server at the remote site that you can start ASAP and then bring back to the main site/redeploy when complete...

If you're just backing up files, then using rsync to a *nix box that uses ZFS for daily snapshots can be an inexpensive, simple solution. Robocopy in threaded mode can be faster in a LAN environment if you're just comparing modification times, but robocopy only copies whole files, so if you have large files with only small changes, rsync can be significantly faster...

Anyway, call Microsoft PSS, pay the ~£200 and open a case. They are open 24/7.
 
LVL 26

Assisted Solution

by:Pber
Pber earned 100 total points
ID: 38803252
It sucks that the logs aren't helping you out.

Have you seen this article: http://blogs.technet.com/b/askds/archive/2011/07/13/how-to-determine-the-minimum-staging-area-dfsr-needs-for-a-replicated-folder.aspx

Is your staging area larger than the combined size of the 32 largest files in the folders in question?
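To put a number on that, here is a minimal sketch in Python that sums the 32 largest files under a folder, which is how I read the sizing rule in the linked article. The function names and the example path are hypothetical, and the 80 GB figure is only the upper bound mentioned in the question, not a measurement.

```python
import heapq
import os

GB = 1024 ** 3


def staging_from_sizes(sizes, count=32):
    """Sum the `count` largest values in an iterable of file sizes (bytes)."""
    return sum(heapq.nlargest(count, sizes))


def file_sizes(root):
    """Yield the size in bytes of every readable file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                yield os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                # Skip files that vanish or cannot be stat'ed mid-walk.
                continue


def minimum_staging_bytes(root, count=32):
    """Estimate the minimum staging size for the folder tree at root."""
    return staging_from_sizes(file_sizes(root), count)


if __name__ == "__main__":
    # Hypothetical local path to one of the affected replicated folders.
    needed = minimum_staging_bytes(r"D:\Shares\Analysis (Current)")
    print(f"Minimum staging: {needed / GB:.1f} GB")
```

As a worst case, if a folder really held 32 or more files of 80 GB each, the sum would be 32 x 80 GB = 2,560 GB, well above the 750 GB quota currently set, so it is worth running the real numbers for each of the three folders.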
 
LVL 1

Author Closing Comment

by:Eric_Price
ID: 38903144
I'm closing it out and spreading the wealth on the points. Thanks for the assist, guys.