LeeGolding

File Replication Service Journal Wrap Problem

Hi friends,

I am receiving Event ID 13568 on my PDC Windows server (pdc.ourdomain.com):
--------------------------------------------------------------------------------------------
The File Replication Service has detected that the replica set "DOMAIN
SYSTEM VOLUME (SYSVOL SHARE)" is in JRNL_WRAP_ERROR
 Replica set name is    : "DOMAIN SYSTEM VOLUME (SYSVOL SHARE)"
 Replica root path is   : "c:\windows\sysvol\domain"
 Replica root volume is : "\\.\C:"
 A Replica set hits JRNL_WRAP_ERROR when the record that it is trying to read from the NTFS USN journal is not found.  This can occur because of one of the following reasons.
 [1] Volume "\\.\C:" has been formatted.
 [2] The NTFS USN journal on volume "\\.\C:" has been deleted.
 [3] The NTFS USN journal on volume "\\.\C:" has been truncated. Chkdsk can truncate the journal if it finds corrupt entries at the end of the journal.
 [4] File Replication Service was not running on this computer for a long time.
 [5] File Replication Service could not keep up with the rate of Disk IO activity on "\\.\C:".
 Setting the "Enable Journal Wrap Automatic Restore" registry parameter to 1 will cause the following recovery steps to be taken to automatically recover from this error state.
 [1] At the first poll, which will occur in 5 minutes, this computer will be deleted from the replica set. If you do not want to wait 5 minutes, then run "net stop ntfrs" followed by "net start ntfrs" to restart the File Replication Service.
 [2] At the poll following the deletion this computer will be re-added to the replica set. The re-addition will trigger a full tree sync for the replica set.
WARNING: During the recovery process data in the replica tree may be unavailable. You should reset the registry parameter described above to 0 to prevent automatic recovery from making the data unexpectedly unavailable if this error condition occurs again.
--------------------------------------------------------------------------------------------

I tried the solution of:

Expand HKEY_LOCAL_MACHINE.
Click down the key path:
   "System\CurrentControlSet\Services\NtFrs\Parameters"
Double click on the value name
   "Enable Journal Wrap Automatic Restore"
and update the value to 1.
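
For reference, the same registry change can be expressed as a `reg add` command line. The following is only a sketch that builds the argument lists without running anything. The key path and value name are the ones quoted above, and the `net stop ntfrs` / `net start ntfrs` pair comes from the event text. The function names are made up for illustration.

```python
from typing import List

# Key path and value name as quoted in the event text and the steps above.
KEY = r"HKLM\System\CurrentControlSet\Services\NtFrs\Parameters"
VALUE = "Enable Journal Wrap Automatic Restore"


def journal_wrap_restore_cmd(enable: bool) -> List[str]:
    """Build reg.exe arguments that set the value to 1 (enable) or 0 (disable).

    Hypothetical helper: it only returns the argument list, it does not
    touch the registry.
    """
    data = "1" if enable else "0"
    return ["reg", "add", KEY, "/v", VALUE, "/t", "REG_DWORD", "/d", data, "/f"]


def restart_frs_cmds() -> List[List[str]]:
    """Commands from the event text that restart FRS instead of waiting for the poll."""
    return [["net", "stop", "ntfrs"], ["net", "start", "ntfrs"]]
```

As the event warning says, the value should go back to 0 once recovery has finished, which in this sketch would be `journal_wrap_restore_cmd(False)`.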

I then got a message in the event logs saying the computer was removed from the replica set. Unfortunately, we had an unforeseen replication problem on the BDC. I am now getting a message in the event logs of both my PDC and BDC saying they can't replicate with each other, and thus the SYSVOL share won't mount on either server :-((  Here is the error:

-------------------------------------------------------------------------------------------------------------------------------------------------------------
File Replication Service is having trouble enabling replication from \\bdc.ourdomain.com to PDC for c:\windows\sysvol\domain using the DNS name \\bdc.ourdomain.com. FRS will keep retrying.
 Following are some of the reasons you would see this warning.
 [1] FRS can not correctly resolve the DNS name \\bdc.ourdomain.com from this computer.
 [2] FRS is not running on \\bdc.ourdomain.com
 [3] The topology information in the Active Directory for this replica has not yet replicated to all the Domain Controllers.
-------------------------------------------------------------------------------------------------------------------------------------------------------------

This is a very big problem: the PDC won't act as a domain controller, the global catalogue is down, Exchange won't start, and no one can access shares across the network :-(

The files in the SYSVOL folder seem to have been moved to the 'NtFrs_PreExisting___See_EventLog' folder. I'm hoping that I haven't lost the SYSVOL data!

1) How can I get this all fixed? Move the files/folders in the aforementioned folder to the SYSVOL folder? Or demote and promote the BDC again?

2) Is there a temporary fix to get the SYSVOL shared again on at least the PDC so that AD works properly and the PDC becomes a domain controller? Just to get the users access to its shares, Exchange, etc.

Very urgent so max points!

Thanks in advance,

Lee.
Netman66

Be patient.

Did you reboot both DCs when you changed that registry key?

It now must resync everything so don't change a bunch of stuff or you'll end up waiting longer.

LeeGolding

ASKER

Rebooted both DCs. Left them for a few hours too and they still wouldn't become domain controllers :-(

Lee.
Now getting a new error:

-------------------------------------------------------------------------------------
Event Type:      Warning
Event Source:      NtFrs
Event Category:      None
Event ID:      13525
Date:            08/11/2006
Time:            08:06:12
User:            N/A
Computer:      PDC
Description:
The File Replication Service cannot find the DNS name for the computer BDC because the "dNSHostName" attribute could not be read from the distinguished name "cn=swyx,ou=domain controllers,dc=mercianlabels,dc=com".
The File Replication Service will try using the name "BDC" until the computer's DNS name appears.
-------------------------------------------------------------------------------------

It's like there is a DNS problem. But the host record for the BDC machine is clearly in the DNS on the PDC. I can ping the BDC by IP or host name.

But I can't access the PDC from the BDC machine.

Could this be the problem?

Lee.
We can afford to take the BDC down and just use the PDC for the time being. But we need to get SYSVOL started on the PDC so the staff can work!

Lee.
Getting this on the BDC:

---------------------------------------------------------------
Event Type:      Error
Event Source:      NETLOGON
Event Category:      None
Event ID:      3210
Date:            08/11/2006
Time:            08:17:52
User:            N/A
Computer:      SWYX
Description:
Failed to authenticate with \\PDC, a Windows NT or Windows 2000 domain controller for domain OURDOMAIN.
Data:
0000: 22 00 00 c0               "..À    
---------------------------------------------------------------

Lee.
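
The `Data:` line in the NETLOGON event above carries a status code stored as little-endian bytes. A minimal sketch of decoding it follows. The function name and the one-entry lookup table are assumptions made for illustration, and only the one NTSTATUS value relevant here is listed.

```python
def decode_event_data(hex_bytes: str) -> int:
    """Interpret a space-separated hex dump as one little-endian integer."""
    raw = bytes.fromhex(hex_bytes.replace(" ", ""))
    return int.from_bytes(raw, "little")


# Deliberately tiny table: just the code that appears in this thread.
KNOWN_STATUS = {0xC0000022: "STATUS_ACCESS_DENIED"}

code = decode_event_data("22 00 00 c0")
print(f"0x{code:08X}", KNOWN_STATUS.get(code, "unknown"))
# prints: 0xC0000022 STATUS_ACCESS_DENIED
```

An access-denied status on authentication would be consistent with the BDC being unable to open \\PDC, though on its own it does not say why access is being refused.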
Is this a single label domain name?

If so, then you need to tweak DNS to accept dynamic updates (which should allow correct registration) and this may solve the issue.

http://support.microsoft.com/kb/300684/en-us
The domain is called 'ourdomain.com' if that's what you mean? No more domains on the system.

Lee.
Then it appears that it's not a single label. However, the servers have worked perfectly for over a year until lately.

Lee.
How do I tweak my DNS to accept dynamic updates?

Ta,

Lee.
Follow this article.  Use the main server as the master copy.  Note - there are no such things as PDC and BDC any longer.  All servers are now peers.  The "main" server I refer to is the one holding the FSMO roles.

http://support.microsoft.com/kb/290762/en-us

Attempt a non-authoritative restore by setting the registry key on the second DC only to D2.  Be sure to stop the FRS service on both DCs before you do this, then start the service first on the main server and then 2 minutes later on the secondary server.  Be patient.  Clear all logs before you start so you can see what's going on.

If this fails, then do the Authoritative restore - which is similar to above except you now set the registry to D4 on the main server as well as D2 on the secondary server.

Read the article completely to understand what it's telling you BEFORE you start.

Let us know.
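
The D2/D4 sequence described above can be sketched as an ordered plan. This is an illustration only: it prints nothing and runs nothing, the function names are hypothetical, and the registry path for `BurFlags` is taken from the linked KB article as best recalled rather than from this thread, so check it against the article before use. The values are written in decimal (0xD2 = 210, 0xD4 = 212).

```python
from typing import List, Tuple

# Assumed location of BurFlags per KB 290762; verify against the article.
BURFLAGS_KEY = (
    r"HKLM\SYSTEM\CurrentControlSet\Services\NtFrs\Parameters"
    r"\Backup/Restore\Process at Startup"
)


def burflags_cmd(value: int) -> str:
    """Build a reg.exe command that sets BurFlags to D2 or D4."""
    if value not in (0xD2, 0xD4):
        raise ValueError("BurFlags must be 0xD2 (non-authoritative) or 0xD4 (authoritative)")
    return f'reg add "{BURFLAGS_KEY}" /v BurFlags /t REG_DWORD /d {value} /f'


def restore_plan(main: str, secondary: str, authoritative: bool) -> List[Tuple[str, str]]:
    """Return (server, action) pairs in the order given in the post above."""
    steps = [(main, "net stop ntfrs"), (secondary, "net stop ntfrs")]
    if authoritative:
        # D4 goes on the server holding the good copy of SYSVOL.
        steps.append((main, burflags_cmd(0xD4)))
    # D2 goes on the server that should re-sync from its partner.
    steps.append((secondary, burflags_cmd(0xD2)))
    steps.append((main, "net start ntfrs"))
    steps.append((secondary, "wait about 2 minutes"))
    steps.append((secondary, "net start ntfrs"))
    return steps
```

Calling `restore_plan("PDC", "BDC", authoritative=False)` gives the first attempt; passing `authoritative=True` adds the D4 step on the main server.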
Re: the DNS question.

On the main DNS server expand then right-click each zone.  
Select Properties.
Make sure that the zone is Active Directory integrated, accepts Secure Dynamic Updates and has the right replication scope.

The _msdcs.domain.com zone replicates to all DNS servers in the FOREST.
The rest should be set to replicate to all DNS servers in the DOMAIN.

If I follow the non-authoritative restore, will the replication that is failing between the servers cause the restore not to work?

Or is it worth a shot anyway?

Thanks,

Lee.
This article tells you how to fix that.  SYSVOL replication depends on FRS.  We're attempting to fix FRS.

DO NOT copy files in the staging area manually - let the system do the work.

As I said above, just in case you've overlooked it (though I doubt it), the files in the SYSVOL folder seem to have been moved to the 'NtFrs_PreExisting___See_EventLog' folder on both the PDC and the BDC.

I haven't done an authoritative or non-authoritative restore on either yet. Just the Journal Wrap fix.

Does this matter?

Thanks :-)

Lee.
This is the default behaviour when Windows attempts to fix FRS.  It moves the contents to the pre-existing folder before it rebuilds the staging area, then replays information back from those new folders.

OK, I've done the D2 restore as you instructed and rebooted the servers. It's been 2 hours now and I'm still getting errors on both servers such as:

The File Replication Service is having trouble enabling replication from
SWYX to SERVER2 for c:\winnt\sysvol\domain using the DNS name
swyx.mercianlabels.com. FRS will keep retrying.
Following are some of the reasons you would see this warning.

This hasn't improved things. What else can I try? Demote BDC, clean up any entries for BDC in AD on PDC?

Really, I don't care about the server BDC. I just want SYSVOL to mount so that Active Directory works and my users can access files, shares, etc. on server PDC.

Lee.
It won't mount until the Journal Wrap is corrected.

Make sure all servers point to one DNS server only (until we fix this).
Make sure zones are AD Integrated and Accept Dynamic Updates.
Absolutely NO ISP DNS settings on any NIC inside your firewall.

Restart the Netlogon service on each server and also run IPCONFIG /registerdns then confirm there are entries for the servers.

See if the errors change.

Let me know.
Almost all of the above are definitely true. I've checked them again also.

There are entries for the servers in DNS.

One strange thing is that the IP address of the server BDC is coming up in server PDC's DNS as an A record. So that seems to conflict with the correct A record for PDC.

PDC has IP of 192.168.0.9
BDC has IP of 192.168.0.4

In DNS they both are A Name records.

Even though there is also a record for 192.168.0.4 for the computer BDC.

I was getting a netlogon error on the BDC which I can't remember off hand.

Lee.
Each server should have an A record and a GUID record associated with them in the forward lookup zone.  If they are DCs, then they should also share a common entry (same as parent) with each IP - this represents the domain.

Not sure what you mean by conflict, but you can safely delete just the records from the forward lookup zone and then register with DNS as I described above.  If your DNS is working properly all the records should return correctly.

In the forward lookup zone I have something like right at the top of the zone:

Parent Host 192.168.0.9
Parent Host 192.168.0.4
PDC Host 192.168.0.9
BDC Host 192.168.0.4

I'm not onsite so can't get the details :-( Gutted.

But it's something like that. From my point of view it was saying that PDC has 2 IP addresses, 192.168.0.9 and 192.168.0.4, even though it hasn't!

Lee.
Same as parent for both as well as A records for both is correct.

They're both DCs, so they will have the same as parent entry besides their normal A record.

There should also be a GUID entry for each server: an alias (CNAME) record named after the DC's GUID, normally found under the _msdcs zone.
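
The host-record part of that check can be sketched as follows, using the zone listing Lee posted. The function name is hypothetical, "@" is used here as shorthand for a "same as parent folder" entry, and the GUID alias records are not covered by this sketch.

```python
from typing import Dict, List, Tuple


def missing_dc_records(records: List[Tuple[str, str]], dcs: Dict[str, str]) -> List[str]:
    """Report DCs that lack a host (A) record or a same-as-parent record.

    records: (name, ip) host entries from the forward lookup zone,
             with "@" meaning "same as parent folder".
    dcs:     mapping of DC host name to its expected IP address.
    """
    present = set(records)
    problems = []
    for name, ip in dcs.items():
        if (name, ip) not in present:
            problems.append(f"{name}: missing host (A) record for {ip}")
        if ("@", ip) not in present:
            problems.append(f"{name}: missing same-as-parent record for {ip}")
    return problems


# Zone contents as described in the thread.
zone = [
    ("@", "192.168.0.9"),
    ("@", "192.168.0.4"),
    ("PDC", "192.168.0.9"),
    ("BDC", "192.168.0.4"),
]
print(missing_dc_records(zone, {"PDC": "192.168.0.9", "BDC": "192.168.0.4"}))
# prints: []
```

An empty list means both DCs have their own A record plus the shared same-as-parent entry, which matches the layout described as correct above.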


The DNS looks fine in that case. The NICs DNS address on both servers point to 192.168.0.9. All looks good really.

Shall I stop and restart NETLOGON on both servers and post the error messages here?

I think the last error message on BDC said that there are no domain controllers to process a logon request.

Thanks,

Lee.
You could restart the Netlogon services to be certain.

Normally, the error you mention is a direct result of DNS not answering domain queries properly - let's see.

Will do first thing tomorrow. In my humble opinion, I don't think the server BDC is communicating properly with PDC. I can't access shares on server PDC from server BDC.

Thanks so far,

Lee.
Are they on the same physical wire?  Is the firewall on BDC turned on?

No firewall on BDC. They are on the same router. I can ping each server from the other with no problem. I only get problems when trying to browse to \\PDC in Windows Explorer, for example, from server BDC.

Lee.
I can't remember if DNS was installed on BDC as well and something has happened so that it's been lost. I didn't set up the server BDC.

Lee.
No big deal if only PDC has DNS.  That's trivial right now.

Is File and Print Sharing checked on the NICs?
Ok.

Yes enabled on both and looks fine.

Lee.
The non-authoritative restore has been done, to no avail. Do I try an authoritative one? Or shall I post a netdiag -v here?

Lee.
Yes.

If this fails, I need to see more.

OK. The authoritative restore has allowed SYSVOL to be mounted.

However... the c:\winnt\sysvol\domain folder is now empty. But users can log in fine now, and Exchange will start.

I'm worried that when I reboot, SYSVOL will be empty and things will fall apart.

Just in case, I backed up the content of this folder to the Desktop before I did the authoritative restore.

Lee.
Give it some time to see if things start to come to life now.

All working a treat!

Thank you kindly :-)

Lee.
Anytime.

Just keep in mind that AD is pretty good at correcting itself - you simply have to give it time to work.  Making constant changes while attempting to fix it is counterproductive and can be destructive.

Glad to help!
NM