arrowtech

Problem with NTFRS - Missing Sysvol and Netlogon on a 2003 SBS Server

I am having some apparently serious problems with a 2003 SBS server.

We have just started the process of putting a second Domain Controller into this network for a project.

I performed the same task in a lab environment before I started on the live environment, and had no problems.

Firstly, we upgraded the SBS (tisserver) to 2003 R2. After that, I did the adprep to update the schema to R2.

I built up the 2003 R2 server (tisdr), installed DNS, joined it to the domain and did a DCPROMO.

This all worked fine, but I discovered errors in the File Replication Service event log on the new server:

---------------------------------------------------------------------------
Event ID 13508 - Source NtFRS
---------------------------------------------------------------------------
The File Replication Service is having trouble enabling replication from tisserver.TIS.local to TISDR for c:\windows\sysvol\domain using the DNS name tisserver.TIS.local. FRS will keep retrying.
 Following are some of the reasons you would see this warning.
 
 [1] FRS can not correctly resolve the DNS name tisserver.TIS.local from this computer.
 [2] FRS is not running on tisserver.TIS.local.
 [3] The topology information in the Active Directory for this replica has not yet replicated to all the Domain Controllers.
 
 This event log message will appear once per connection, After the problem is fixed you will see another event log message indicating that the connection has been established.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
---------------------------------------------------------------------------
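Reason [1] in the event above (DNS resolution) is the easiest one to rule out. A minimal sketch, assuming Python is available on the machine being checked; `resolve_host` is a hypothetical helper name, not part of any FRS tooling, and it only shows what the local resolver returns, not whether FRS itself can reach the partner:

```python
import socket


def resolve_host(name):
    """Return the sorted, de-duplicated addresses `name` resolves to.

    Returns an empty list if the local resolver cannot resolve the name.
    """
    try:
        infos = socket.getaddrinfo(name, None)
    except OSError:  # socket.gaierror is a subclass of OSError
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string.
    return sorted({info[4][0] for info in infos})
```

Run on the new DC, `resolve_host("tisserver.TIS.local")` should return the SBS server's address; an empty list would point at reason [1] rather than [2] or [3].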

When I went and checked the SBS, I found the following error had been occurring:


---------------------------------------------------------------------------
Event ID 13568 - Source NtFrs
---------------------------------------------------------------------------
The File Replication Service has detected that the replica set "DOMAIN SYSTEM VOLUME (SYSVOL SHARE)" is in JRNL_WRAP_ERROR.
 
 Replica set name is    : "DOMAIN SYSTEM VOLUME (SYSVOL SHARE)"
 Replica root path is   : "c:\windows\sysvol\domain"
 Replica root volume is : "\\.\C:"
 A Replica set hits JRNL_WRAP_ERROR when the record that it is trying to read from the NTFS USN journal is not found.  This can occur because of one of the following reasons.
 
 [1] Volume "\\.\C:" has been formatted.
 [2] The NTFS USN journal on volume "\\.\C:" has been deleted.
 [3] The NTFS USN journal on volume "\\.\C:" has been truncated. Chkdsk can truncate the journal if it finds corrupt entries at the end of the journal.
 [4] File Replication Service was not running on this computer for a long time.
 [5] File Replication Service could not keep up with the rate of Disk IO activity on "\\.\C:".
 Setting the "Enable Journal Wrap Automatic Restore" registry parameter to 1 will cause the following recovery steps to be taken to automatically recover from this error state.
 [1] At the first poll, which will occur in 5 minutes, this computer will be deleted from the replica set. If you do not want to wait 5 minutes, then run "net stop ntfrs" followed by "net start ntfrs" to restart the File Replication Service.
 [2] At the poll following the deletion this computer will be re-added to the replica set. The re-addition will trigger a full tree sync for the replica set.
 
WARNING: During the recovery process data in the replica tree may be unavailable. You should reset the registry parameter described above to 0 to prevent automatic recovery from making the data unexpectedly unavailable if this error condition occurs again.
 
To change this registry parameter, run regedit.
 
Click on Start, Run and type regedit.
 
Expand HKEY_LOCAL_MACHINE.
Click down the key path:
   "System\CurrentControlSet\Services\NtFrs\Parameters"
Double click on the value name
   "Enable Journal Wrap Automatic Restore"
and update the value.
 
If the value name is not present you may add it with the New->DWORD Value function under the Edit Menu item. Type the value name exactly as shown above.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.

---------------------------------------------------------------------------

After doing a bit of reading, it seemed like the right thing to do was a non-authoritative restore, so I went through and created the registry value described in the event, then stopped and started the NTFRS service.
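For reference, the registry change and service restart described in event 13568 can be written out as commands. This is a sketch under assumptions, not the procedure the poster necessarily followed: it uses `reg add` instead of regedit, and the helper names `journal_wrap_commands` and `run_all` are hypothetical. It defaults to a dry run that only prints the commands. As the rest of this thread shows, this automatic restore leaves SYSVOL empty when no other DC holds a good copy, so it should not be run blindly.

```python
import subprocess

# Key path and value name are taken from the text of event 13568.
NTFRS_PARAMS = r"HKLM\System\CurrentControlSet\Services\NtFrs\Parameters"
VALUE_NAME = "Enable Journal Wrap Automatic Restore"


def journal_wrap_commands(enable):
    """Build the commands that set the parameter and restart NTFRS."""
    data = "1" if enable else "0"
    return [
        ["reg", "add", NTFRS_PARAMS, "/v", VALUE_NAME,
         "/t", "REG_DWORD", "/d", data, "/f"],
        ["net", "stop", "ntfrs"],
        ["net", "start", "ntfrs"],
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(subprocess.list2cmdline(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

Calling `run_all(journal_wrap_commands(True))` prints the three commands without touching the system. The event text also recommends resetting the parameter to 0 afterwards, which `journal_wrap_commands(False)` covers.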

As expected, I got:

---------------------------------------------------------------------------
EventID 13560 - Source NtFRS
---------------------------------------------------------------------------

The File Replication Service is deleting this computer from the replica set "DOMAIN SYSTEM VOLUME (SYSVOL SHARE)" as an attempt to recover from the error state,
 Error status = FrsErrorSuccess
 At the next poll, which will occur in 5 minutes, this computer will be re-added to the replica set. The re-addition will trigger a full tree sync for the replica set.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
---------------------------------------------------------------------------


Exactly five minutes later, I got:


---------------------------------------------------------------------------
EventID 13520 - Source NtFRS
---------------------------------------------------------------------------
The File Replication Service moved the preexisting files in c:\windows\sysvol\domain to c:\windows\sysvol\domain\NtFrs_PreExisting___See_EventLog.
 
The File Replication Service may delete the files in c:\windows\sysvol\domain\NtFrs_PreExisting___See_EventLog at any time. Files can be saved from deletion by copying them out of c:\windows\sysvol\domain\NtFrs_PreExisting___See_EventLog. Copying the files into c:\windows\sysvol\domain may lead to name conflicts if the files already exist on some other replicating partner.
 
In some cases, the File Replication Service may copy a file from c:\windows\sysvol\domain\NtFrs_PreExisting___See_EventLog into c:\windows\sysvol\domain instead of replicating the file from some other replicating partner.
 
Space can be recovered at any time by deleting the files in c:\windows\sysvol\domain\NtFrs_PreExisting___See_EventLog.

For more information, see Help and Support Center at
---------------------------------------------------------------------------

&

---------------------------------------------------------------------------
EventID 13553 - Source NtFRS
---------------------------------------------------------------------------
The File Replication Service successfully added this computer to the following replica set:
    "DOMAIN SYSTEM VOLUME (SYSVOL SHARE)"
 
Information related to this event is shown below:
Computer DNS name is "tisserver.TIS.local"
Replica set member name is "TISSERVER"
Replica set root path is "c:\windows\sysvol\domain"
Replica staging directory path is "c:\windows\sysvol\staging\domain"
Replica working directory path is "c:\windows\ntfrs\jet"

For more information, see Help and Support Center at
---------------------------------------------------------------------------


&


---------------------------------------------------------------------------
EventID 13554 - Source NtFRS
---------------------------------------------------------------------------
The File Replication Service successfully added the connections shown below to the replica set:
    "DOMAIN SYSTEM VOLUME (SYSVOL SHARE)"
 
      "tisdr.TIS.local"
      "tisdr.TIS.local"
---------------------------------------------------------------------------


& Finally

---------------------------------------------------------------------------
EventID 13508 - Source NtFRS
---------------------------------------------------------------------------
The File Replication Service is having trouble enabling replication from TISDR to TISSERVER for c:\windows\sysvol\domain using the DNS name tisdr.TIS.local. FRS will keep retrying.
 Following are some of the reasons you would see this warning.
 
 [1] FRS can not correctly resolve the DNS name tisdr.TIS.local from this computer.
 [2] FRS is not running on tisdr.TIS.local.
 [3] The topology information in the Active Directory for this replica has not yet replicated to all the Domain Controllers.
 
 This event log message will appear once per connection, After the problem is fixed you will see another event log message indicating that the connection has been established.

For more information, see Help and Support Center at
---------------------------------------------------------------------------


It is only a small domain - 30-odd users and about the same number of computers. I have left it for about half an hour now, and am not seeing any sign of a sysvol or netlogon share.
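One way to check for the two shares without eyeballing the list is to scan the output of `net share`. A minimal sketch, assuming the usual column layout where the share name is the first token on each line; `missing_dc_shares` is a hypothetical helper name:

```python
def missing_dc_shares(net_share_output, required=("SYSVOL", "NETLOGON")):
    """Return the required share names that do not appear in `net share` output."""
    present = set()
    for line in net_share_output.splitlines():
        parts = line.split()
        if parts:
            # The first whitespace-separated token is the share name.
            present.add(parts[0].upper())
    return [name for name in required if name not in present]
```

On the server itself the input could come from `subprocess.run(["net", "share"], capture_output=True, text=True).stdout`. An empty result means both shares are published; anything else lists what is missing.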

I haven't heard any reports of users not logging in yet, but they generally are not trying to at this time of day. I have been seeing errors trying to launch AD Users and Groups.

Frankly, I am packing it that I have screwed up this perfectly working domain, and am not sure where to go from here.

The old backup files are still at C:\windows\sysvol\sysvol\TIS.local\NtFrs_PreExisting___See_EventLog - I am hoping that if I have really screwed up I can use them somehow to resolve my issue.

Your help would be greatly appreciated!
arrowtech

ASKER

Having read further, I am wondering if the problem is that when it does the non-authoritative restore, it is trying to pull data across from the other DC, however that DC never replicated in the first place, so it is not getting anything?

I am scared to reboot in case I can't login, given that no machine is currently working correctly as a DC....
Just to add a little fun to the situation, I have just gone and checked the backup history to make sure that I can restore a system state in the worst case scenario - only to discover that the job wasn't set to backup System State!

This is turning into a bad day.
I have been reading a lot of similar threads all over the place in the last hour.

There are two things that I think mean this style of article is not the answer:

1 - I have moved past a journal-wrap problem - I would KILL to have that problem back....... From what I can tell, the non-authoritative restore wants to replicate the Sysvol info from another DC (makes sense really with the name). Since there is another DC, that is what it is trying to do, but since that DC never got its replica it can't happen.

2 - All the articles that describe doing an "Authoritative restore" seem to be referring to Server 2000, and setting the BurFlags setting to D4 - it sounds like what I want to do, but I can't find any reference to doing this in 2k3 - only booting to DS restore and using NTBACKUP to restore a system state, and as per above I am shit out of luck on that one!

I am kind of hoping that the C:\windows\sysvol\sysvol\TIS.local\NtFrs_PreExisting___See_EventLog files will be my ticket out of here - something along the lines of copying them back to the sysvol directory and bouncing the NTFRS service, but I don't want to make it any worse blindly thrashing along, it is bad enough as it is.

I have just had someone on site tell me that they can't login, so it is definitely hosed.

A
ASKER CERTIFIED SOLUTION
Jeffrey Kane - TechSoEasy
So do you think:

A) that I CAN dcpromo it out now that my domain is in this state?



B) that if I dcpromo it out, and then try to do a non-authoritative restore, that the SBS server will rebuild its sysvol info from its own files when I do a non-authoritative restore? The more I read about the non-authoritative restore, the more it seems like it is only designed to pull data from other DCs, not rebuild from its own gear?

A
It is Fixed!!!!

More details to follow, thought I would just post so you could stop thinking about it.

A
OK, so here is the summary of what went wrong, why it went wrong, and how MS fixed it for me:

When I added the new DC, the existing DC was already in a "Journal Wrap" state. This is what stopped it replicating to the new DC.

The fix for Journal Wrap described in the event viewer is to copy all the sysvol files, and then perform a "non-authoritative restore" - basically replicating from another DC.

Unfortunately, because my DC was the only DC with a sysvol, it couldn't do this, so it just crapped out.

The fix from Microsoft Support was to copy the two folders (policies/scripts) back from  C:\windows\sysvol\sysvol\TIS.local\NtFrs_PreExisting___See_EventLog to the C:\windows\sysvol\sysvol\TIS.local\ folder, then stop the NTFRS service, then set the BurFlags key to D4, which does an Authoritative restore.
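The steps above can be sketched as follows. This is an illustration under several assumptions, not the exact commands Microsoft Support ran: it needs Python 3.8 or later for `dirs_exist_ok`, the folder names `Policies` and `scripts` are taken from the description above, the BurFlags location is the global "Process at Startup" key, the final `net start ntfrs` is assumed (the post only mentions stopping the service), and `copy_back` and `d4_commands` are hypothetical helper names. The D4 value is written in decimal (212) for `reg add`.

```python
import shutil
from pathlib import Path

PREEXISTING = "NtFrs_PreExisting___See_EventLog"
# Assumed location of the global BurFlags value.
BURFLAGS_KEY = (r"HKLM\SYSTEM\CurrentControlSet\Services\NtFrs"
                r"\Parameters\Backup/Restore\Process at Startup")


def copy_back(domain_root, folders=("Policies", "scripts")):
    """Copy the named folders out of the pre-existing area into domain_root.

    Returns the names of the folders that were found and copied.
    """
    root = Path(domain_root)
    copied = []
    for name in folders:
        src = root / PREEXISTING / name
        if src.is_dir():
            # Merge into the destination if an (empty) folder already exists.
            shutil.copytree(src, root / name, dirs_exist_ok=True)
            copied.append(name)
    return copied


def d4_commands():
    """Commands for the authoritative (D4) restore, in order."""
    return [
        ["net", "stop", "ntfrs"],
        ["reg", "add", BURFLAGS_KEY, "/v", "BurFlags",
         "/t", "REG_DWORD", "/d", str(0xD4), "/f"],
        ["net", "start", "ntfrs"],
    ]
```

In this thread the domain root would be `C:\windows\sysvol\sysvol\TIS.local`. The copy uses copy rather than move, so the originals stay in the pre-existing folder as a fallback. The commands are returned rather than executed so they can be reviewed first.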

Bingo, my sysvol and netlogon are back, the Journal Wrap error is gone and we have replication over to the new DC - too easy.

He pointed me at this for future reference:

http://support.microsoft.com/kb/290762

In some ways I feel bad for panicking and not working it out myself, but in other ways I don't.

When you do the Journal Wrap Fix/Non-Authoritative restore, there is nothing to tell you clearly that it has all failed.

All the articles on the D4 Authoritative Restore are quite scary, and I was worried I was just going to make things worse.

I am just glad MS had my back!

A
Points to TechSoEasy for having some good suggestions, and just being there.

When you think you have screwed up massively, it is a nice feeling to think someone else cares too......
No problem... glad you got it going again... I've been there... and am always amazed when things somehow just start working again!

Jeff
TechSoEasy
I cannot thank you enough for what this article saved me...

Thank you, arrowtech!
Awesome this fixed mine, cheers for posting it.