Solved

DFS Replication event 4104 repeating

Posted on 2012-09-18
Last Modified: 2012-10-14
Hi,

I have a DFS replication problem between two Windows 2008 R2 SP1 servers. One of the servers (server A, a domain controller) hosts replicated files from at least 8 servers across a WAN without any issues, but with one of the servers (server B, a domain member) there seems to be a serious problem.

On server B, I created a namespace to send four folders to server A. Initially, I used Robocopy for two of the folders to avoid replicating 60 GB over the Internet.

At first, the replication started, and then after a few hours to a few days (depending on the size of the folders) I got Event ID 4104: "The DFS Replication service successfully finished initial replication on the replicated folder at local path E:\folder".

Then, for a reason that I don't understand, I started getting event ID 4202: "The DFS Replication service has detected that the staging space in use for the replicated folder at local path E:\folder is above the high watermark. The service will attempt to delete the oldest staging files. Performance may be affected", followed by event ID 4204: "The DFS Replication service has successfully deleted old staging files for the replicated folder at local path E:\chateauroyal\shared. The staging space is now below the high watermark".

And then I received event ID 4104 again for the same folder ("DFS Replication service successfully finished initial replication on the replicated folder at local path E:\folder"), even though that folder had already completed its initial synchronization.

So it seems that initial replication restarts every few days. This is not normal, and this is the only server exhibiting this behaviour. Worse, it seems to be eating up more and more space in the sysvol folder of the affected drive.

In an attempt to fix the issue, I tried deleting the namespace and folders, cleaned up all the hidden "DFS private" folders and even cleaned the System Volume Information folder, which had filled with staging files to the point that the hard drive was close to running out of disk space. Then I recreated the namespace and folders, and the problem is still present. I even removed and reinstalled DFS on server B.

If I test the replication itself, it seems to be OK, meaning that files and folders are replicating, but I am afraid that my hard drive will fill up again with pre-staging files.

As for Server A, I am only getting replication errors when Windows Server Backup is running (all my servers are backed up at 23:30 every night) and after the backups are done, all folders are reconnecting normally. I may also have a few disconnections once in a while when the VPN slows down or disconnects for a few seconds.

I am really out of ideas (short of calling Microsoft). Can anyone help?

Thanks.

Benji
Question by:benjilafouine
18 Comments
 
LVL 35

Expert Comment

by:Bembi
ID: 38415736
The staging folder (by default DfsrPrivate\Staging) is a cache folder, and its size has to be estimated from the largest files in the replicated folders, the connection speed and the amount of data.
It is defined for every replicated folder on each DFS machine.
Each changed file is stored there and stays there so that it can be compared with changes from other servers.
The general rule is that the minimum staging size should be double the size of the largest replicated file. To improve performance you can raise the size; the recommendation is:
on a member server, the size of the 4 largest files...
on a hub server (a central server, if one exists), the size of the 16 largest files.
If all files change very often, you will find all of them in the staging folder sooner or later.
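The sizing rule above can be estimated with a short script. This is only a rough sketch in Python: the function names are made up for illustration, the default of 4 files is simply the member-server figure from this comment, and skipping the DfsrPrivate folder is my own assumption so that staged copies are not counted twice.

```python
import heapq
import os


def largest_file_sizes(root, count):
    """Return the sizes in bytes of the `count` largest files under `root`."""
    sizes = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Assumption: skip DFSR's hidden working folder so staged copies are not counted.
        dirnames[:] = [d for d in dirnames if d.lower() != "dfsrprivate"]
        for name in filenames:
            try:
                sizes.append(os.path.getsize(os.path.join(dirpath, name)))
            except OSError:
                # Locked or vanished files are skipped rather than aborting the scan.
                continue
    return heapq.nlargest(count, sizes)


def suggested_staging_quota_mb(root, count=4):
    """Sum of the `count` largest files, rounded up to whole megabytes."""
    total_bytes = sum(largest_file_sizes(root, count))
    return -(-total_bytes // (1024 * 1024))  # ceiling division
```

Run it against the local path of the replicated folder (for example `suggested_staging_quota_mb(r"E:\folder", 4)`) and compare the result with the quota currently configured for that folder.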

The message just says that the staging quota is nearly used up and that older files have to be deleted to free up space. That means that if a file is changed and replicated again, its staging file has to be created again, and this may cost performance.
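To illustrate what events 4202 and 4204 describe, here is a toy model of the cleanup in Python. It is not how the service is implemented; the 90% and 60% watermarks are the commonly documented defaults and are treated here as assumptions, and the file names and sizes are invented.

```python
def staging_cleanup(staged, quota_mb, high=0.90, low=0.60):
    """Simulate a staging cleanup pass.

    staged: list of (name, size_mb) tuples, oldest first.
    Returns (kept, purged_names).
    """
    used = sum(size for _, size in staged)
    if used < quota_mb * high:
        # Below the high watermark: nothing is deleted.
        return list(staged), []
    kept = list(staged)
    purged = []
    # Delete the oldest entries until usage is at or below the low watermark.
    while kept and used > quota_mb * low:
        name, size = kept.pop(0)
        purged.append(name)
        used -= size
    return kept, purged
```

With a 100 MB quota and 95 MB staged, the two oldest files in `[("a", 30), ("b", 30), ("c", 35)]` are purged, and they would have to be staged again if they change later. That re-staging is the performance cost mentioned above.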

If you have space, raise the staging quota for the affected folders. You should especially do this if you see the events several times per hour.
As long as you don't see real replication errors, you may ignore the messages.
You may also have a look at the folder dates in the staging folder.
The closer the oldest folder is to the current date, the more closely you should watch replication.

But in the end, it is more a performance question. After a while, the staging folder will get full one day if you have more files than it can hold.

Why is your sysvol folder filling up?
Are files stored there too?
 
LVL 1

Author Comment

by:benjilafouine
ID: 38415891
Yes, I was very surprised to see a dfs folder in the sysvol folder (something like 30 GB worth of it filling the hard drive). As for my staging quotas, they are well above any file size in these folders (like three to four times the size of the biggest file).

What puzzles me the most is that event 4104 keeps repeating. This is not normal: initial replication should occur only once, as is the case for my 12 other servers doing replication, where event 4104 appears once and does not repeat every second day. I have replicated folders that have been in place for months (if not years), and 4104 never shows up twice for the same folder.

This is a very abnormal situation. I know that on Server B I have some issues with two applications that lock files while they are open, but otherwise I can't see what's wrong.
 
LVL 35

Expert Comment

by:Bembi
ID: 38416004
First of all, you should not put the staging folder into the sysvol share. The staging folder can be put on any drive. Just change the path in the replication settings for the folders.

4104 should not come several times, you are right.

Have you tried to exclude the files, which are blocked from this server, just to see, if this is the issue?

Have you tried to activate volume shadow copy?

When you recreated DFS on ServerB, were all the DfsrPrivate folders deleted too?
Are there files in the PreExisting folder (if it exists)?

And see this:
http://blogs.technet.com/b/askds/archive/2007/10/05/top-10-common-causes-of-slow-replication-with-dfsr.aspx

It covers some common mistakes with DFS and ways to test for them...
 
LVL 1

Author Comment

by:benjilafouine
ID: 38416028
By sysvol, I mean the "System Volume Information" folder, which is present on each logical drive (but hidden). I found DFS files in there without any intervention on my part. The DFS folder is only present in this folder if there are replicated folders on that particular drive. Staging files are located inside the replicated folder (and hidden), the same as on all my other servers.

I haven't excluded files that are locked but that would be a logical step. However, these files get replicated once the user closes the software at the end of the day (I validated that).

I did not activate Shadow Copies (they are disabled on all logical drives).

When I reinitialized the folders, I made sure that I had deleted all previous staging folders (this is when I also found some stuff in the System Volume Information folder, because the drive's used-space math wouldn't add up).

Anything I am forgetting? As I say, I have more than 12 other servers replicating without issues. Weird.
 
LVL 35

Expert Comment

by:Bembi
ID: 38416058
OK, understand.
The SVI folder stores, for example, indexed files and folders. So if indexing is enabled on the DFS shared folders, they will be added to the SVI folder. System restore points are in there as well. You can give yourself (admin) permissions to look inside.

> I haven't excluded files....
Just an idea as to why DFS behaves this way...

Volume Shadow Copy may help with blocked files if enabled. Nevertheless, it takes additional space.

> When I reinitialized the folders...
OK.
Since you said that the SVI folder was growing and you have set up DFS again, you may stop the indexing service and empty the indexed content. Maybe there are fragments left over.
Also, if you use indexing, I guess it is a good idea to leave the hidden DFS folder out of the index.

Check the article and have a look at the reports; maybe there is a hint about the problem.
Right-click the replication group; there you will find the reports and some basic tests.
 
LVL 1

Author Comment

by:benjilafouine
ID: 38416067
What I don't understand is that all my other replicated folders on other servers are configured just the same way (and I mean exactly the same), aside from the locked files, which are a problem unique to this server.
 
LVL 35

Expert Comment

by:Bembi
ID: 38416101
That points me to exactly this fact.

This event can also occur (as far as I could see) if DFS is not able to replicate all content within the scheduled time window, which can happen over slow WAN connections. For this, remote differential compression is helpful, but it should be disabled on fast lines (as it produces heavy server load).

That means this can be the second difference. The DFS report includes information about what is in the backlog. Maybe check it regularly to see if it fills up, especially between the replication intervals.
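The backlog can also be read from the command line with `dfsrdiag backlog` instead of running a full report each time. Below is a minimal Python sketch that wraps it. The group, folder and server names you pass in are your own; the parsing assumes the tool prints either a "Backlog File Count: N" line or a "No Backlog" message, which you should check against real output on your servers before relying on it.

```python
import re
import subprocess


def backlog_command(group, folder, sending, receiving):
    """Build the dfsrdiag command line for one sending/receiving pair."""
    return [
        "dfsrdiag", "backlog",
        f"/rgname:{group}",
        f"/rfname:{folder}",
        f"/smem:{sending}",
        f"/rmem:{receiving}",
    ]


def parse_backlog_count(output):
    """Extract the backlog size; the output format is an assumption (see above)."""
    if "No Backlog" in output:
        return 0
    match = re.search(r"Backlog File Count:\s*(\d+)", output)
    return int(match.group(1)) if match else None


def query_backlog(group, folder, sending, receiving):
    """Run dfsrdiag (Windows only) and return the backlog count, or None if unparsed."""
    result = subprocess.run(
        backlog_command(group, folder, sending, receiving),
        capture_output=True, text=True, check=False,
    )
    return parse_backlog_count(result.stdout)
```

Scheduling `query_backlog` every few minutes and logging the number would show whether the backlog ever drains between replication intervals.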

Another point to consider is programs that write a lot of temporary files. We had an issue some time ago with Lotus Notes folders (Java): they completely killed replication, as they produced hundreds of small, short-lived files, so replication ran out of sync.

Just some additional ideas.
 
LVL 1

Author Comment

by:benjilafouine
ID: 38416195
I will rerun a report tomorrow to check again.
 
LVL 1

Author Comment

by:benjilafouine
ID: 38418602
This is the only warning I get in the health report (otherwise, replication happens instantly when I create a file):

Pre-existing content is not replicated and is consuming disk space.  
  Affected replicated folders: folder_name
  Description: During the initial replication process for replicated folder chateauroyal_home, the DFS Replication service identified pre-existing local content that was not present on the primary member and moved the content to D:\folder\home\DfsrPrivate\PreExisting. The DfsrPrivate\Preexisting folder is a hidden system folder that is located under the local path of the replicated folder. Content in the DfsrPrivate\PreExisting folder will not be replicated to other members of the replication group, nor will the content be deleted by the DFS Replication service during any automatic clean-up.  
  Last occurred: Thursday, September 20, 2012 at 12:18:10 PM (GMT-5:00)
  Suggested action: If you want this content to be replicated to other members, move the content into the replicated folder outside of the DfsrPrivate folder. If you want to reclaim this disk space, delete the pre-existing content in the PreExisting folder.  

I am not surprised to see that message, because each time the initial replication repeats, these folders fill up fast (including the "System Volume Information" folder, which has a DFSR folder holding 12.3 GB already and growing fast).

I have never seen that problem on any of my other servers. I am clueless. I will deactivate the two replicated folders that have sharing violation errors to see if it makes a difference on the other replicated folders.
 
LVL 1

Author Comment

by:benjilafouine
ID: 38419264
I compared all my servers to find a difference between them regarding replication, and the only notable difference is that on server B I have a replicated folder under the c:\program files folder, which is really not a good place to store data (but I have an application that does that). I have disabled replication on this folder and will monitor the results over the next few days.

I will also try to find a way to move the data out of that folder, since it is silly to put data there (but this application does that).
 
LVL 35

Expert Comment

by:Bembi
ID: 38419807
OK, and I would try to get the PreExisting folder empty....

I inspected the SVI folder on my DFS hosts. Besides the database, there is also a private folder there with some conflict and deleted files and folders. As far as I can see, there are also folders that came from older DFS replication folders. There is one GUID for each folder. The GUIDs are also stored in AD under DFSR-GlobalSettings, under Content.
 
LVL 1

Author Comment

by:benjilafouine
ID: 38419978
I am not going to touch anything for a few days because I still have two active folders replicating. I will know fast enough if a replicated folder on drive C and/or sharing violations are causing the issue.
 
LVL 35

Expert Comment

by:Bembi
ID: 38421097
Yes sure, just some additional information around....
 
LVL 1

Author Comment

by:benjilafouine
ID: 38421712
Look at this: event 2107. "The DFS Replication detected the ASR instance key has changed on volume E:. Replication has been stopped for all replicated folders on this volume".

Then: event 2108. "The DFS Replication service successfully recovered from the ASR instance key change on volume E:. Replication has resumed on replicated folders on this volume".

Then: event 4102. "The DFS Replication service initialized the replicated folder at local path E:\chateauroyal\home and is waiting to perform initial replication. The replicated folder will remain in this state until it has received replicated data, directly or indirectly, from the designated primary member"

Then a few hours later, event 4104 (initial replication complete).

In my view, events 2107 and 2108 must be the cause of the problem. And by the way, the problem seems to be happening when Windows Server Backup (full backup) is kicking in.
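To check whether every repeated 4104 really follows a 2107, the event log can be exported and scanned for that sequence. This is a small Python sketch of the idea. It assumes the events have already been exported into (timestamp, event ID) pairs by some other means, and the 24-hour window is an arbitrary value chosen for illustration.

```python
from datetime import datetime, timedelta

ASR_KEY_CHANGED = 2107     # "ASR instance key has changed"
INITIAL_SYNC_DONE = 4104   # "finished initial replication"


def restarts_after_asr_change(events, window=timedelta(hours=24)):
    """Pair each 2107 with the next 4104 that follows it within `window`.

    events: iterable of (datetime, event_id) tuples in any order.
    Returns a list of (asr_time, sync_done_time) tuples.
    """
    pairs = []
    last_asr = None
    for when, event_id in sorted(events):
        if event_id == ASR_KEY_CHANGED:
            last_asr = when
        elif event_id == INITIAL_SYNC_DONE and last_asr is not None:
            if when - last_asr <= window:
                pairs.append((last_asr, when))
            last_asr = None
    return pairs
```

If the first timestamp of every pair sits shortly after 23:30, that would support the link with the nightly Windows Server Backup job.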
 
LVL 1

Author Comment

by:benjilafouine
ID: 38422963
Check this out:

http://social.technet.microsoft.com/Forums/en-CA/winserverfiles/thread/47582917-9086-46a3-8fd1-5440e0a9c10f

Exactly my problem. I deleted the keys, and I will know after a full backup completes whether the problem is fixed. I think I did perform a restore some months ago because of a malfunctioning database (replicated).
 
LVL 35

Accepted Solution

by:Bembi (earned 2000 total points)
ID: 38430383
OK, let's hope for the best ;-) Come back if you have new information.
 
LVL 1

Author Comment

by:benjilafouine
ID: 38445595
Issue is fixed. Think I am going to give you the points.
 
LVL 1

Author Closing Comment

by:benjilafouine
ID: 38494944
See my fix but Bembi was of great help so I am assigning the points to him.