I have seen many DFS implementations that system admins carried out casually, without much planning, and that later ran into issues.
One of the most common issues is that a DFSR folder either does not replicate at all or replicates very slowly with a high backlog. Many known DFSR issues can be avoided by following general best practices.
In this article, I cover general DFS best practices and considerations from a deployment and initial configuration standpoint.
Article applicability: Windows Server 2008 R2 / 2012 / 2012 R2 / 2016
DFS Server Sizing
Servers that host DFSR are usually file servers as well. Enterprise-class file servers should be designed with adequate CPU, memory and storage, along with a Gigabit network.
Servers can be either physical or virtual. Nowadays virtual servers are more popular because they are easier to maintain.
For a small scale to mid-sized DFSR implementation with a moderate user base and data in-outs:
For large-scale file servers that also act as DFSR servers, size the servers based on the user base, data read/write frequency, storage requirements, HA requirements and so on. Microsoft provides a File Server Capacity Tool, but it is complicated and not very practical to use.
Under load, physical DFSR servers give slightly better performance than virtual machines, because physical servers have dedicated resources while virtual servers share hypervisor resources with other VM workloads.
DFS namespace servers can run on lower-capacity hardware than DFSR servers. Microsoft has not published specific hardware sizing for namespace servers.
If DFS namespace servers need to be installed on dedicated servers, you can start with virtual servers with 2 CPU cores and 4 to 8 GB of memory, then increase CPU or memory based on observed usage and perfmon results.
Before rolling out DFS-N and DFS-R servers, install all the latest Windows updates along with DFS-related hotfixes to avoid known issues.
Links are provided here
I have not found any lists for 2016 Servers yet.
DFS Namespace Mode
When installing a DFS namespace (DFS-N), always select Windows Server 2008 mode as the namespace mode. This mode improves scalability and supports access-based enumeration (ABE) for DFS targets when it is enabled.
Active Directory prerequisites for Windows Server 2008 mode: the forest functional level must be Windows Server 2003 or higher, the domain functional level must be Windows Server 2008 or higher, and all namespace servers must run Windows Server 2008 or later.
Refer to the Microsoft document below for additional information.
DFS Configuration & Server Placement
DFSR is a multi-master replication engine. Ideally, when deploying DFS Replication:
In a mesh topology:
In hub and spoke topology:
Ideally, when deploying DFS namespace:
Active Directory Health and AD Sites Configuration
Active Directory replication and name resolution must be working properly. DFS-N and DFS-R configuration data is stored in the AD domain partition and replicates among all domain controllers in that domain. Each DFS server polls Active Directory periodically for updates. If a DFS configuration change made against one domain controller has not replicated to the other DCs, DFS servers that poll those DCs will not receive the update. This particularly affects DFS-R members in remote sites.
If redundant folder targets need to be added to a DFS namespace folder to cover multiple locations, make sure that each location has a domain controller and that AD Sites and Services is configured with the correct local subnet mappings, so that resource access stays local. This matters because DFSN refers clients to folder targets in the client's own AD site, no matter how you configure target referral ordering.
If a client subnet is mapped to the wrong site, or is not mapped to any site, DFS namespace referrals can point the client to the wrong folder target. This can lead to conflicting changes, or to a read-only replica being offered as the folder target.
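To make the subnet-to-site dependency concrete, here is a minimal Python sketch of how a client address might resolve to a site. The site names and subnets are hypothetical, and the "most specific subnet wins" rule is my assumption about the matching behaviour; real mappings live in AD Sites and Services, not in a script.

```python
import ipaddress

# Hypothetical subnet-to-site mapping for illustration only; real values
# come from Active Directory Sites and Services.
SITE_SUBNETS = {
    "10.0.0.0/8": "HQ",
    "10.20.0.0/16": "Branch-A",
    "192.168.50.0/24": "Branch-B",
}


def site_for_client(client_ip, site_subnets=SITE_SUBNETS):
    """Return the site of the most specific subnet containing client_ip,
    or None when no subnet matches (an unmapped client)."""
    address = ipaddress.ip_address(client_ip)
    best_network = None
    best_site = None
    for cidr, site in site_subnets.items():
        network = ipaddress.ip_network(cidr)
        if address in network:
            # Assumption: the longest prefix (most specific subnet) wins.
            if best_network is None or network.prefixlen > best_network.prefixlen:
                best_network, best_site = network, site
    return best_site
```

A result of None corresponds to the unmapped-subnet case described above, where the client cannot be steered to a local folder target.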
Source data permissions (DFSR)
DFSR does not depend on file system permissions. By default, the DFSR service runs under the NT AUTHORITY\SYSTEM account. If we inspect the service in Process Explorer, we can see that it also holds the SeBackupPrivilege and SeRestorePrivilege rights.
These rights correspond to the "Back up files and directories" and "Restore files and directories" user rights. They allow DFSR to read data at one location and write it at the destination regardless of the access rights on files and folders, except for open files.
Refer to the article below for a description of the above rights.
If this is a brand new implementation without existing data, the root folder must be set with the correct share and NTFS permissions to avoid access violations for users and administrators. Grant ownership of the root folder to the built-in Administrators group, grant Full Control NTFS permissions to the SYSTEM account and the built-in Administrators group, and remove the CREATOR OWNER entry from the root folder permissions.
If you already have existing data (file shares) and are already facing access issues, refer to my article below to correct them.
DFSR Staging Quota
DFSR uses the staging folder, whose size is capped by the staging quota, to stage files, calculate their hashes and store them in the DFSR database, and then send the files to the replication partner. The default staging quota is 4 GB, so it is good practice to raise the limit as far as is practical to avoid staging loops and to complete the initial sync in time.
Later on, we can reduce the staging quota if desired. Refer to the link below to calculate approximate staging quotas for a replicated folder.
As a starting point, set the quota to 20% of total drive capacity and increase it further if backlog monitoring shows the need. This figure is based on a live environment I managed previously.
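The 20% starting point is simple arithmetic, sketched below in Python. The function name and the choice to never go below the 4 GB default are my own assumptions for illustration; the percentage is only the rule of thumb from this article, not a Microsoft formula.

```python
DEFAULT_STAGING_QUOTA_MB = 4096  # DFSR default of 4 GB, as noted in this article


def initial_staging_quota_mb(drive_capacity_bytes, percent=20):
    """Suggest a starting staging quota in MB: a percentage of total drive
    capacity, never lower than the DFSR default (an assumption of this sketch)."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    # Integer arithmetic avoids floating-point rounding surprises.
    suggested_mb = drive_capacity_bytes * percent // 100 // (1024 * 1024)
    return max(suggested_mb, DEFAULT_STAGING_QUOTA_MB)
```

On a live server the capacity could be read with something like `shutil.disk_usage("D:\\").total`, where the drive letter is whichever volume holds the replicated folder.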
Number of replicated folders per drive
Consider placing only one replicated folder per drive. This helps if you face issues such as a heavy backlog or DFSR database corruption and need to clear out the DFSR database from System Volume Information. If two replicated folders share a drive and only one of them has a problem, the DFSR database is still wiped for both, and both folders have to go through initial replication again, which can take considerable time if the data set is large.
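A quick way to audit an existing layout against this guideline is to group replicated folder paths by drive. The Python sketch below assumes you already have the list of local paths (the example paths are hypothetical), and it only looks at drive letters, so folders on volume mount points would need separate handling.

```python
from collections import defaultdict
from pathlib import PureWindowsPath


def drives_with_multiple_folders(replicated_folder_paths):
    """Group replicated folder paths by drive letter and return only the
    drives that host more than one replicated folder."""
    by_drive = defaultdict(list)
    for path in replicated_folder_paths:
        # PureWindowsPath parses Windows paths on any operating system.
        drive = PureWindowsPath(path).drive.upper()
        by_drive[drive].append(path)
    return {drive: paths for drive, paths in by_drive.items() if len(paths) > 1}
```

An empty result means every listed replicated folder sits on its own drive.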
DFS-R Data Preseeding (pre-staging) and Cloning
If you are creating replication groups for existing data, or adding a new member to an existing replication group, and the data set is large (a few hundred GB or several TB), the initial sync can take a substantial amount of time because of the complexity of the initial sync process.
The initial sync must complete before data can replicate in both directions, and it can take many days depending on the amount of data. To save time, we can preseed the data and then clone the DFSR database.
Preseeding copies data (using the Robocopy tool) from the source server to the destination server in advance, along with its NTFS security. This saves the time needed to replicate the data over the wire. However, staging still takes time: during initial sync DFSR still processes every preseeded file, generates its hash and stores the hash in the database. This process is local to each DFSR member. To save this staging time as well, we use database cloning.
Note - Preseeding uses the Robocopy tool. Make sure that the most recent version of Robocopy is installed on both the source and destination servers to avoid hash mismatches (check the DFSR hotfix URLs for the Robocopy update).
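As a rough illustration of the preseeding copy, the Python sketch below assembles a Robocopy command line. The flag set follows the pattern in Microsoft's Robocopy preseeding guidance as I understand it; the function name and all paths are hypothetical, so verify the switches against your Robocopy version before running anything.

```python
def build_preseed_command(source, destination, log_file):
    """Return a Robocopy argument list for preseeding a replicated folder.
    All paths are supplied by the caller; nothing is executed here."""
    return [
        "robocopy.exe", source, destination,
        "/E",        # copy subdirectories, including empty ones
        "/B",        # backup mode, so files are read regardless of ACLs
        "/COPYALL",  # copy data, attributes, timestamps, NTFS security, owner, auditing
        "/R:6",      # retry a failed copy up to 6 times
        "/W:5",      # wait 5 seconds between retries
        "/MT:64",    # copy with 64 threads
        "/XD", "DfsrPrivate",  # skip DFSR's private folder
        "/TEE",      # write output to the console as well as the log
        f"/LOG:{log_file}",
    ]
```

On the destination server the list could be passed to `subprocess.run`. Copying NTFS security along with the data matters here because, as noted above, mismatched copies lead to hash mismatches.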
The cloning process exports (clones) the DFSR database on the primary server and imports it on the destination server. When you import a cloned database, DFSR validates the hashes in the cloned database against the local (preseeded) data on the destination server, and once they are validated the import completes successfully. This removes the need to stage and hash each file on the destination server.
When you then add the member server to the replicated folder, the data and its hash values (the database) are already present on both servers. As part of the initial sync, the secondary server only verifies its database hashes against those of the primary server, and once they are verified it can replicate data in both directions. The process is almost instantaneous, or takes very little time.
Refer to the excellent blog post below on how to preseed DFSR data and clone the DFSR database with a simple PowerShell interface.
DFSR and roaming profiles
Replicating roaming profiles through DFSR should be avoided as far as possible, for the reasons below.
DFSR replication is slow for roaming profiles because DFSR does not support transactional replication (that is, replicating all of a set of changes together or none of them). This increases the backlog and can choke bandwidth when many roaming users move between locations. In addition, profile data that users change frequently tends to remain open, and DFSR cannot replicate files whose handles have not been closed.
This causes problems when a user moves between locations and logs on against a different roaming profile server. That server may not have received all updates from the previous server, so the user might overwrite some data or work with outdated data. This behaviour complicates matters further and can even corrupt profiles.
If you still want to deploy DFSR for roaming profiles, Microsoft has provided the guidelines below.
If DFSN and DFSR must be used to replicate roaming profiles across locations, ensure that each user always connects to the roaming profile share on a single server, regardless of the location they roam to, so that users cannot make conflicting edits on different servers. This can be achieved by keeping only a single folder target on the DFS namespace link used as the roaming profile path. However, this forces roaming profiles to load over the WAN, which slows down logon and logoff.
Microsoft Support Documents:
Excluding File types from DFSR (Filters)
Microsoft Reference: Removing DFSR Filters
RDC and DFSR
Remote Differential Compression (RDC) is the algorithm DFSR uses by default on Windows Server 2008 R2 and later. RDC replicates only the changed portions (deltas) of a file, in chunks, which saves bandwidth and makes it useful when replicating over a WAN.
However, it puts extra load on the server's CPU, because RDC has to detect insertions, removals and rearranged data in files in order to create the delta chunks. Note that installing the RDC feature is not required on DFSR servers.
For DFSR groups whose members are connected over a LAN with gigabit network cards and switches, disable RDC to save processing time and speed up replication by sending each file whole.
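The bandwidth-versus-CPU trade-off can be illustrated with a deliberately simplified Python sketch. It compares fixed-size blocks by hash, which is not RDC's actual algorithm (RDC uses content-dependent chunk boundaries so that it can also cope with insertions and removals); the function names and block size are my own choices for illustration.

```python
import hashlib


def block_signatures(data, block_size):
    """Hash each fixed-size block of data (this is the CPU cost)."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def bytes_to_transfer(old, new, block_size=4096):
    """Count the bytes of `new` whose block hash does not appear in `old`,
    i.e. the data a delta transfer would still have to send."""
    known = set(block_signatures(old, block_size))
    total = 0
    for i in range(0, len(new), block_size):
        block = new[i:i + block_size]
        if hashlib.sha256(block).hexdigest() not in known:
            total += len(block)
    return total
```

Every block on both sides has to be hashed before anything is sent. Over a slow WAN that work pays for itself in saved bytes, while on a gigabit LAN it may cost more time than simply copying the whole file.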
More Explanatory Notes:
Disable replicated member vs delete replicated member
If you want to remove a member server from a two-way replicated folder, there are two options:
Disable the member instead of removing it. This stops replication on that server for the specific replication group. If you later need replication again, you can simply re-enable it for that server. After it is enabled, the server is forced through an initial sync (a one-way sync), so even if you deleted data from this server, the data is copied back from the other partner as part of the initial sync. Once the initial sync completes successfully (look for DFSR event ID 4104 on the server being altered), replication runs both ways. This way you do not lose any data.
However, if you delete a member from the DFSR group, the DFSR database on that member is not deleted. The database keeps the deleted member's records tombstoned for 60 days. If you add the deleted server back to the replicated folder within those 60 days, DFSR considers that server authoritative and starts replicating both ways. If you deleted any data from this member in the meantime, the deletions replicate to the other server as well, and data loss occurs.
Hence, just disable the member for the replicated folder. If you later decide to remove the server permanently, you can do that at any time.
DFSR read-only replica
With Windows Server 2008 R2, Microsoft introduced the read-only replica. As the name suggests, it is read-only, meaning replication is one way only, from the R/W partner to the R/O partner. This is useful when you specifically want to keep a copy of the data at a remote location as a form of DR.
Once a folder is set as a read-only replica, no one can write or delete data in it, locally or remotely. R/O replicas are useful if your R/W DFSR server fails along with its data and you need to recover the data.
We can change a read-only replica to a read-write replica and vice versa. When we do, the member being altered undergoes an initial sync, also called a non-authoritative sync. Once the initial sync completes, DFSR event ID 4104 is logged and the server acts as an R/W or R/O replica, depending on the change made.
Microsoft Reference: Read-Only Replication in R2
The best use of R/O replica could be:
DFSR Replication Schedule and Bandwidth
Microsoft recommends antivirus exclusions for DFS root shares and their contents (folder targets). These exclusions are the same as those for FRS / DFSR-replicated SYSVOL. The article below explains the exclusions to configure.
DFSN & DFSR Backup and Restore
To restore DFS namespaces and replication successfully after an accidental deletion, the AD system state and the registry keys below need to be backed up.
There are multiple ways to restore DFS, depending on the backup you have.
I will discuss the actual restoration in another article.
Microsoft reference: Recovery process of a DFS Namespace in Windows 2003 and 2008 Server
I hope this article helps you build your DFS server infrastructure.