• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 2471

DFS File Replication going very slow

This should be very basic, but I'm running into an issue...

I have 2 Server 2003 R2 servers (let's call them Server 1 and Server 2) on the same LAN with gigabit connections between them.  I want to replicate one folder on Server 1, which contains many levels of nested subfolders and files, to Server 2.  In total, the folder in question consists of:
25 GBs
476,700 files
447,000 folders
Server 2 started with no data in its folder share.  I went into DFS, created a domain root, added the folder (25 GB of data) on Server 1 as a root target, and then added the empty folder on Server 2 as a root target.  I went into Configure Replication, chose the share on Server 1 as the initial master, and set it to a hub-and-spoke topology with Server 1 as the hub.  I made the replication schedule available at all times.
I actually set all this up this past Sunday, and not until Tuesday morning did I see contents in the Server 2 folder (not nearly all the items that should be replicated).  Yesterday I realized I needed to take action because something wasn't right.  I went into the Publish tab of the root's Properties and saw it was not being published in AD, so I published it and made myself the owner.  I do have domain admin rights in this domain.
Today the share on Server 2 still has only
6.5 GB of data
86,618 files
79,430 folders
which is only about 25% of the data.

My questions to you are:
Should it really be taking this long to do the initial replication?
Is there anything I did wrong or missed in the setup?
6 Solutions
25 GB of data should have taken just 2-3 hours to do the initial replication.
Have you done any performance monitoring on the source & destination servers?
Disk I/O can be a major bottleneck with this kind of thing.
Where are you putting your DFS staging files? If possible, put them on a separate physical disk for best performance.
What about CPU?
Do you have monitoring software that will allow you to see if your network switch is healthy?
How many second-level folders do you have? If your top level folder has a small number, e.g. 10 subfolders, then consider setting up replication groups at that level, rather than a single RG for everything.
This will also help to isolate if there is a specific dataset that is replicating slowly due to high I/O or corrupt data.
sliknick1028 (Author) commented:
Yeah, good point Chev_PCN.  I checked Server 1's System Monitor and the Avg. Disk Queue Length was pegged at the top of the graph at 100 pretty much at all times.  The average was about 7.9, min 1.75, max 30.8.
Server 1 has 2 physical disks with 2 volumes - C and D.  The C drive is mirrored across both physical disks and the D drive is spanned across the 2 physical disks to give it more available space.
The target share folder is on the D drive, and I put the DFS staging files on the D drive as well.  I did this because the C drive doesn't have as much free space as the D drive.  The C drive still has 34 GB free, so as long as the staging folder doesn't take up that much space, I'll switch it if you think that would help.
What actually gets stored in the staging folder?
sliknick1028 (Author) commented:
CPU usage is steady between 10 - 20%.
I checked and the switch traffic is fine, running at gigabit speed.
To clarify my last post... on Server 1 the Average Disk Queue Length is averaging around 600, with occasional spikes up to 2200.  Server 2's Avg. Disk Queue Length is averaging around 30, and its CPU and memory usage are low (Server 2 is a few years newer and therefore has a much better CPU, more memory, and faster disks).

I have 10 subfolders within the main folder, so I deleted the original root and will create 10 different DFS roots.  I started with 3 smaller subfolders (each no bigger than 500 MB) and the initial replication finished completely for all of those after a couple of hours.  I also changed the location of the staging folders so that they're stored on the C drive while the data is stored on the D drive, and created separate staging folders for each root.
I then moved on to the largest folder (11 GB).  I started this Saturday morning, and right now at 2 PM on Sunday about 900 MB of the 11 GB has replicated over to the destination share on Server 2.  This seems to be the folder that's holding everything up.
Do you think increasing the size of the staging folder from its default of 4 GB would help?  Do you think the cause of the slowness is that the D drive on Server 1 is spanned across 2 physical disks?
Any more suggestions that I could try?

Andrew Oakeley (Consultant) commented:
Increasing the staging quota will improve performance (Properties of each membership in the DFSR console).  If the staging quota is too small, there will be excessive disk activity while initial replication is performed.  If you have enough spare disk space, increase the staging quota to the same size as the replicated data.  Once initial replication is complete, decrease it back to the default 4 GB and it will clean itself up.
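For sizing, Microsoft's DFSR guidance suggests the staging quota should be at least as large as the sum of the 32 largest files in the replicated folder.  A minimal Python sketch of that calculation (the example path is a placeholder, not from this thread):

```python
import os

def staging_quota_floor(root, n=32):
    """Sum of the n largest files under root, in bytes.
    DFSR guidance: the staging quota for a read-write member
    should be at least the size of the 32 largest files."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                sizes.append(os.path.getsize(os.path.join(dirpath, name)))
            except OSError:
                pass  # file vanished or unreadable; skip it
    return sum(sorted(sizes, reverse=True)[:n])

# Hypothetical usage:
# print(staging_quota_floor(r"D:\Data") / 1024**3, "GB")
```

This gives a floor, not an optimum; during initial replication of a large dataset, a quota closer to the size of the replicated data (as suggested above) reduces staging churn.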

What kind of drives?

I have not done DFS replication (I'm a network guy, not a server guy), but with any file copying it's not so much the volume of data as the number of files.  Copying 1,000 files of 1 KB each will take longer than copying one 1 MB file.

Do you have jumbo frames enabled?  This would speed up the network side of things if it were the bottleneck.

The slow point should be the network.  If you are truly going at 1 Gbps, that is about 125 MB/s.  At 125 MB/s, it would take just a few minutes to transfer 25 GB of data.  Even if it were a lot of little files, it should not take that long to move them.
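The arithmetic behind that estimate, assuming an ideal, uncontended gigabit link with no protocol or disk overhead (real-world transfers are slower):

```python
# Back-of-the-envelope transfer time on an ideal gigabit link.
link_bps = 1_000_000_000          # 1 Gbps
bytes_per_sec = link_bps / 8      # ~125 MB/s
data_bytes = 25 * 1024**3         # 25 GB
seconds = data_bytes / bytes_per_sec
print(f"{seconds:.0f} s (~{seconds / 60:.1f} minutes)")  # → 215 s (~3.6 minutes)
```

So anything measured in days points at something other than raw link speed: per-file overhead, staging, or the disk subsystem.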

I believe that you can throttle DFS replication; have you by chance throttled it a bit too much?
I think the fact that you have 2 physical drives with both your C mirrored across them, and your data drive spanned across them is probably the #1 source of your grief.
I would recommend that you put in (as a minimum) another 2 drives mirrored, which you should then dedicate to the data drive.  Preferably, use 3 drives in RAID 5, and make sure they are on a separate controller if possible.

As a test, if you use Robocopy to copy your problem folder across to Server 2, how long does it take, and does it show any particular errors?
You have a HUGE disk subsystem performance problem.
You should have an Average Disk Queue Length of no more than 1.5-2 times the number of disks.  In your case that would be about 4 outstanding I/O requests.  30 is too much, not to mention 600!  Are you sure those numbers are correct?  Are they this high even when replication isn't running?
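That rule of thumb can be sketched as a quick check; the 1.5-2x multiplier is this commenter's heuristic, not a hard limit, and the function name is illustrative:

```python
def queue_status(observed, num_disks, multiplier=2.0):
    """Compare an observed Avg. Disk Queue Length against the
    rule-of-thumb ceiling of roughly 1.5-2x the spindle count."""
    ceiling = num_disks * multiplier
    if observed <= ceiling:
        return "OK"
    return f"overloaded ({observed:.0f} > {ceiling:.0f})"

# The thread's numbers: 2 spindles, averages of 30 and 600
print(queue_status(30, 2))    # → overloaded (30 > 4)
print(queue_status(600, 2))   # → overloaded (600 > 4)
```

Both of the averages reported in this thread are far past the ceiling for a 2-spindle volume.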
I think you are asking too much of your servers, and I guess that they are quite old.
To give you some recommendations about disk configuration, you should tell me more about what you are using these servers for (besides file sharing).
What about disk fragmentation?
If I did the math right, you have 79,430 directories averaging 1.09 files per directory and an average file size of 78 KB.  That's really hard on a file system: it has to scan through all of the directories, find all of the files, build a database of everything, and then start the replication.  Even a standard file copy or Robocopy would take a really long time because of the large number of directories and the large number of small files.  Even if all of the files were pre-seeded, I'd expect the initial replication to take a long time because it needs to catalog every file and compare it to the source/target.
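Those averages can be reproduced from the partially-replicated snapshot reported earlier in the thread (6.5 GB, 86,618 files, 79,430 folders):

```python
files = 86_618
folders = 79_430
data_kb = 6.5 * 1024**2          # 6.5 GB expressed in KB

files_per_folder = files / folders
avg_file_kb = data_kb / files
print(f"{files_per_folder:.2f} files/folder, {avg_file_kb:.1f} KB average file")
# → 1.09 files/folder, 78.7 KB average file
```

Nearly one directory per file means the replication engine spends most of its effort on metadata, not payload.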

Aside from the problems in your disk subsystem that have already been pointed out, the only thing you can change is the staging quotas, if the DFS Replication diagnostic report says that the staging area has been getting purged.  Moving the staging folders to a different physical drive will help with disk contention.

DFS namespaces and DFS replication groups are different things, and they can be configured independently.  If everything is under a single share, you can point to that share using a single leaf in a DFS tree.  You could conceivably have 10 different links for each of the 10 next-level folders, but there is no reason to have multiple DFS roots.  I have a single DFS root for my enterprise consisting of 13 servers, a few hundred links, and about 40 replication groups.
sliknick1028 (Author) commented:
I figured it out... I didn't have the actual DFS Replication service installed.  I was using the DFS that comes out of the box with Server 2003 R2.  After I installed the additional DFS tools and services, the replication was much faster.  There were 2 folders of about 10 GB each, and those still took about 24 hours to complete the initial replication, but that was much better than 6 GB after 3 days.  So I'm assuming it was using the older File Replication Service before, which is totally different from DFS Replication.

Thanks for your help guys, gotta figure out point allocation.
I am glad you figured that out, because it would have taken a long time for us experts to figure out that you didn't have it installed.
sliknick1028 (Author) commented:
I would think it would be one of the first questions to ask, if you were aware of the significant improvement between the 2 technologies and the fact that someone could easily start using the out-of-the-box DFS not knowing there were additional DFS tools and services available.
giltjr commented:
Well, I'm not so sure about that.  There is no "DFS" out of the box.  Windows 2003 has two types of replication services: File Replication Service (FRS) and DFS Replication.

You stated you were setting up DFS replication, not FRS.  

Why do you think we would know you were using function "A", but calling it by "B's" name?
sliknick1028 (Author) commented:
Here you are, giltjr... Pictures speak a thousand words...

There IS DFS out of the box (Picture 1), which uses FRS for its replication.  This could very easily be misinterpreted as DFS Replication by someone just getting into it, because it is DFS and it is replicating, eh?

Not until you install the 3 extra tools and services for DFS (Picture 2) do you get the DFS Management option under Administrative Tools (Picture 3), which, when set up, will use DFS Replication instead of FRS to do the replicating.

You even said yourself that you are a network guy and not a server guy.  I appreciate you trying though.
giltjr commented:
Bad wording on my part, but it just goes to show how bad and confusing the terms are in this wonderful world of computers, and then there is the whole English language with its confusing grammar (believe it or not, English is my first and only language).

Of course there is DFS out of the box.

What I meant to say (and did not) is that there are no DFS replication services out of the box.  There is File Replication Service (FRS), which can be used to replicate files that reside in a DFS, and which is different from DFS Replication.

When I read "DFS file replication," I assumed (yes, I know what that means) DFS Replication, not FRS being used on files within a DFS.

My apologies.

However, even with that, 10 GB in 24 hours still seems a bit slow to me, but it sure beats what you were getting.

sliknick1028 (Author) commented:
The 3 additional DFS tools and services needed to be installed on the server (Server 2003 R2) in order for replication to use DFS Replication and not FRS (which is much slower).
Question has a verified solution.
