Solved

DFS File Replication going very slow

Posted on 2011-02-16
Last Modified: 2012-05-11
This should be very basic, but I'm running into an issue...

I have two Server 2003 R2 servers (let's call them Server 1 and Server 2) on the same LAN with gigabit connections between them.  I want to replicate one folder on Server 1, which contains many levels of nested subfolders and files, to Server 2.  In total, the folder in question consists of:
25 GBs
476,700 files
447,000 folders
Server 2 started with no data in its folder share.  I went into DFS, created a domain root and added the folder (25 GB of data) on Server 1 as a root target and then added the empty folder on Server 2 as a root target.  I went into configure replication and chose the share on Server 1 to be the initial master and set it to be Hub and spoke topology with Server 1 as the Hub.  I made the replication schedule to be available at all times.  
I set all this up this past Sunday, and not until Tuesday morning did I see any contents in the Server 2 folder (and not nearly all the items that should be replicated).  Yesterday I realized I needed to take action b/c something wasn't right.  I went into the Publish tab of the Properties of the root and saw it was not being published in AD, so I published it and made myself the owner.  I do have domain admin rights in this domain.
Today the share on Server 2 still has only
6.5 GB of data
86,618 files
79,430 folders
which is only about 25% of the data.

My questions to you are:
Should it really be taking this long to do the initial replication?
Is there anything I did wrong or missed in the setup?
Thanks!
Question by:sliknick1028
19 Comments
 
LVL 9

Assisted Solution

by:Chev_PCN
Chev_PCN earned 300 total points
ID: 34913512
25GB of data should have taken just 2-3 hours to do the initial replication.
Have you done any performance monitoring on the source & destination servers?
Disk I/O can be a major bottleneck with this kind of thing.
Where are you putting your DFS staging files? If possible, put them on a separate physical disk for best performance.
What about CPU?
Do you have monitoring software that will allow you to see if your NW switch is healthy?
How many second-level folders do you have? If your top level folder has a small number, e.g. 10 subfolders, then consider setting up replication groups at that level, rather than a single RG for everything.
This will also help to isolate if there is a specific dataset that is replicating slowly due to high I/O or corrupt data.
0
 

Author Comment

by:sliknick1028
ID: 34916345
Yeah, good point Chev_PCN.  I checked Server 1's system monitor and the Avg Disk Queue length was pegged at the top at 100 in the graph pretty much at all times.  The avg was about 7.9, Min - 1.75, Max - 30.8.
Server 1 has 2 physical disks with 2 volumes - C and D.  The C drive is mirrored across both physical disks and the D drive is spanned across the 2 physical disks to give it more available space.
The target share folder is on the D drive and I did put the DFS staging files on the D drive as well.  I did this b/c the C drive doesn't have as much free space as the D drive.  The C drive still has 34 GB free so as long as the staging folder doesn't take up that much space i'll switch it if you think that would help.
What actually gets stored in the staging folder?
0
 

Author Comment

by:sliknick1028
ID: 34938434
CPU usage is steady between 10 - 20%.
I checked and the switch traffic is fine, running at gigabit speed.
To clarify more on my last post... on Server 1 the Average Disk Queue Length is averaging around 600 with occasional spikes up to 2200.  Server 2's Avg Disk Queue Length is averaging around 30 and the CPU and memory is low (Server 2 is a few years newer therefore has much better specs on the CPU, more memory, and faster disks).

I have 10 subfolders within the main folder, so I deleted the original root and will create 10 different DFS roots.  I started with 3 smaller subfolders (each no bigger than 500 MB) and the initial replication finished completely for all of those after a couple of hours.  I also changed the location of the staging folders for each so that they're stored on the C drive while the data is stored on the D drive, and created separate staging folders for each root.
I then moved on to the largest folder (11 GB).  I started this Saturday morning and right now, at 2 PM on Sunday, about 900 MB of the 11 GB has replicated over to the destination share on Server 2.  This seems to be the folder that's holding everything up.
Do you think increasing the size of the staging folder from its default of 4 GB would help?  Do you think the cause of the slowness is b/c the D drive on Server 1 is spanned across 2 physical disks?
Any more suggestions that I could try?
0
 
LVL 17

Expert Comment

by:aoakeley
ID: 34940060
Increasing the staging quota will improve performance (it is set in the properties of each membership in the DFSR console). If the staging quota is too small, there will be excessive disk activity while initial replication is performed. If you have enough spare disk space, increase the staging quota to the same size as the replicated data. Once initial replication is complete, decrease it to the default 4 GB and it will clean itself up.

0
 
LVL 17

Assisted Solution

by:aoakeley
aoakeley earned 100 total points
ID: 34940076
0
 
LVL 57

Expert Comment

by:giltjr
ID: 34940097
What kind of drives?

I have not done DFS replication (I'm a network guy, not a server guy), but with any file "copying" it's not so much the volume of data as the number of files.  Copying 1,000 files of 1 KB each will take longer than copying one 1 MB file.

Do you have jumbo frames enabled?  This would speed up the network side of things if it were the bottleneck.

The slow point should be the network.   If you are truly going at 1 Gbps, that is about 125 MBps.  At 125 MBps, it would take just a few minutes to transfer 25 GB of data.  Even if it were a lot of little files, it should not take that long to move them.
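
As a rough check of the arithmetic above, here is a minimal Python sketch. It assumes decimal units (1 GB = 1000 MB) and the ideal 1 Gbps line rate with no protocol, disk, or per-file overhead, so it gives a best-case floor rather than a prediction. The function name is made up for illustration.

```python
def transfer_seconds(size_gb: float, link_gbps: float = 1.0) -> float:
    """Best-case transfer time in seconds, ignoring all overhead."""
    mb_per_second = link_gbps * 1000 / 8  # 1 Gbps -> 125 MB/s
    return size_gb * 1000 / mb_per_second


if __name__ == "__main__":
    seconds = transfer_seconds(25)  # the 25 GB folder from the question
    print(f"{seconds:.0f} s (~{seconds / 60:.1f} min)")
```

Real replication of hundreds of thousands of small files will be far slower than this floor, since per-file overhead dominates.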

I believe that you can throttle DFS replication. By chance, have you throttled it a bit too much?
0
 
LVL 9

Assisted Solution

by:Chev_PCN
Chev_PCN earned 300 total points
ID: 34940658
I think the fact that you have 2 physical drives with both your C mirrored across them, and your data drive spanned across them is probably the #1 source of your grief.
I would recommend that you put in (as a minimum) another 2 mirrored drives, which you should then dedicate to the data drive. Preferable would be 3 drives in RAID 5, and make sure they are on a separate controller if possible.

As a test, if you use robocopy to copy your problem folder across to server B, how long does it take and does it show any particular errors?
0
 
LVL 27

Assisted Solution

by:davorin
davorin earned 50 total points
ID: 34941755
Hi,
you have a HUGE disk subsystem performance problem.
Your Average Disk Queue Length should be no more than 1.5-2 times the number of disks. In your case that should be 4 I/O requests. 30 I/Os is too much, not to mention 600! Are you sure that these are the correct numbers?! Are the numbers also this big without replication running?
I think that you expect too much from your servers, and I guess that they are quite old.
To give you some recommendations about disk configuration, you should tell me more about what you are using these servers for (besides file sharing).
http://www.windowsnetworking.com/articles_tutorials/Windows-Server-2003-Performance-Tuning.html
What about disk fragmentation?
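
The rule of thumb in this comment (average queue length no more than 1.5-2 times the number of disks) can be sketched in a few lines of Python. This is only an illustration: the function names are made up, the 2 disks are stated in the thread for Server 1 but only assumed here for Server 2, and the readings are the averages the author posted.

```python
def queue_length_limit(disk_count: int, per_disk: float = 2.0) -> float:
    """Upper bound on a healthy average disk queue length (rule of thumb)."""
    return disk_count * per_disk


def is_overloaded(avg_queue_length: float, disk_count: int,
                  per_disk: float = 2.0) -> bool:
    """True when the observed average exceeds the rule-of-thumb limit."""
    return avg_queue_length > queue_length_limit(disk_count, per_disk)


if __name__ == "__main__":
    # Averages reported in the thread; disk_count=2 is an assumption for Server 2.
    for name, avg in {"Server 1": 600, "Server 2": 30}.items():
        status = "overloaded" if is_overloaded(avg, disk_count=2) else "ok"
        print(f"{name}: avg queue {avg}, limit {queue_length_limit(2):.0f} -> {status}")
```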
0

 
LVL 42

Assisted Solution

by:kevinhsieh
kevinhsieh earned 50 total points
ID: 34942071
If I did the math right, you have 79,430 directories averaging 1.09 files per directory and an average file size of 78 KB. That's really hard on a file system: it has to scan through all of the directories, find all of the files, make a database of everything, and then start the replication. If you just did a standard file copy or robocopy, that would take a really long time too because of the large number of directories and the large number of small files. Even if all of the files were pre-seeded, I expect that it would take a long time to do the initial replication because it needs to catalog every file and compare it to the source/target.
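
For anyone who wants to redo that math, here is a small Python sketch. It assumes binary units (1 GB = 1024 * 1024 KB), which reproduces the 78 KB figure, and the function name is made up. Note that the figures quoted above match the partial copy on Server 2 (6.5 GB, 86,618 files, 79,430 folders); the full source folder on Server 1 is included for comparison.

```python
def averages(total_gb: float, files: int, folders: int) -> tuple[float, float]:
    """Return (files per folder, average file size in KB)."""
    files_per_folder = files / folders
    avg_file_kb = total_gb * 1024 * 1024 / files  # binary units assumed
    return files_per_folder, avg_file_kb


if __name__ == "__main__":
    datasets = {
        "Server 2 (partial copy)": (6.5, 86_618, 79_430),
        "Server 1 (full source)": (25, 476_700, 447_000),
    }
    for name, (gb, files, folders) in datasets.items():
        per_folder, kb = averages(gb, files, folders)
        print(f"{name}: {per_folder:.2f} files/folder, {kb:.1f} KB average file")
```

Either way, the data set is roughly one small file per directory, which is the point being made.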

Aside from the problems in your disk subsystem that have already been pointed out, the only thing you can change is the disk staging quotas, if the DFS Replication diagnostic report says that the staging area has been getting purged. Moving them to a different physical drive will help with disk contention.

DFS namespaces and DFS replication groups are different, and they can be configured independently. If everything is under a single share, you can point to that share using a single leaf in a DFS tree. You can conceivably have 10 different links for each of the 10 next-level folders, but there is no reason to have multiple DFS roots. I have a single DFS root for my enterprise consisting of 13 servers, a few hundred links and about 40 replication groups.
0
 

Accepted Solution

by:
sliknick1028 earned 0 total points
ID: 34972131
I figured it out... I didn't have the actual DFS replication service installed.  I was using the DFS that comes out of the box for Server 2003 R2.  After I installed all the additional DFS tools and services the replication was much faster.  There were 2 folders of about 10 GB each, and those still took about 24 hrs to complete the initial replication, but that was much better than 6 GB after 3 days.  So I'm assuming it was using the older File Replication Service before, which is totally different from DFS Replication.

Thanks for your help guys, gotta figure out point allocation.
0
 
LVL 42

Expert Comment

by:kevinhsieh
ID: 34977707
I am glad you figured that out, because it would take a long time for us experts to figure out that you didn't have it installed.
0
 

Author Comment

by:sliknick1028
ID: 34985384
I would think it would be 1 of the 1st questions to ask, if u were aware of the significant improvement between the 2 technologies and the fact that someone could easily start using the out of the box DFS not knowing there were additional DFS ools and services available.
0
 

Author Comment

by:sliknick1028
ID: 34985386
tools*
0
 
LVL 57

Expert Comment

by:giltjr
ID: 34985520
Well, I'm not so sure about that.  There is no "DFS" out of the box.   Windows 2003 has two types of replication services: File Replication Service (FRS) and DFS Replication.

You stated you were setting up DFS replication, not FRS.  

Why do you think we would know you were using function "A", but calling it by "B's" name?
0
 

Author Comment

by:sliknick1028
ID: 35029373
Here you are giltjr... Pictures speak a thousand words...

There IS DFS out of the box (Picture 1) which uses FRS for its replicating.  This could very easily be misinterpreted as DFS replication by someone just getting into it because it is DFS and it is replicating, eh?

Not until you install the 3 extra tools and services for DFS (Picture 2) do you get the DFS Management option available under Administrative Tools (Picture 3) which when setup will use DFS replication instead of FRS to do the replicating.

You even said yourself that you are a network guy and not a server guy.  I appreciate you trying though.
DFS-OutOfBox.png
DFS-ExtraToolsServices.png
DFS-ExtraToolsServicesInstalled.png
0
 
LVL 57

Expert Comment

by:giltjr
ID: 35029511
Bad wording on my part, but it just goes to show you how bad and confusing terms are in this wonderful world of computers and then there is the whole English and its confusing grammar (believe it or not English is my first and only language).

Of course there is DFS out of the box.

What I meant to say (and did not) is that there is no DFS replication service out of the box.  There is the File Replication Service (FRS), which can be used to replicate files that reside in a DFS, and which is different from the DFS replication service.

When I read DFS file replication, I assumed (yes I know what that means) DFS replication services, not FRS being used on files within a DFS.

My apologies.

However, even with that, 10GB in 24 hours still seems a bit slow to me, but it sure beats what you were getting.
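
To put numbers on "still seems a bit slow", here is a rough Python sketch of the effective throughput before and after the fix, using the figures the author reported (about 6 GB in 3 days before, about 10 GB in 24 hours after). It assumes decimal units and treats each figure as a single continuous transfer, which may not match exactly how the folders were replicated. The function name is made up.

```python
def throughput_mb_per_s(size_gb: float, hours: float) -> float:
    """Average throughput in MB/s (1 GB = 1000 MB assumed)."""
    return size_gb * 1000 / (hours * 3600)


if __name__ == "__main__":
    before = throughput_mb_per_s(6, 72)   # about 6 GB after 3 days
    after = throughput_mb_per_s(10, 24)   # about 10 GB in 24 hours
    line_rate = 125.0                     # ideal 1 Gbps in MB/s
    print(f"before: {before:.3f} MB/s, after: {after:.3f} MB/s")
    print(f"speed-up: {after / before:.1f}x")
    print(f"after, as share of line rate: {after / line_rate:.2%}")
```

Even after the fix the effective rate is a tiny fraction of what the link can carry, which is consistent with the disk and small-file overhead discussed earlier in the thread.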



 
0
 

Author Closing Comment

by:sliknick1028
ID: 35067625
The 3 additional DFS tools and services needed to be installed on the server (Server 2003 R2) in order for the replicating to use DFS replication and not FRS (much slower).
0
