Keithburnham

Problem with active directory every morning at 6:00am on Windows server 2003 PDC.

For a few weeks now, our role-holding domain controller has had issues with the directory service. There are countless different symptoms, but they are all resolved by a reboot.

The following day, however, the problem comes back at exactly 6:00am. I can't find anything that is scheduled to run at that point, as all the backups take place 12 hours earlier. Replication runs at the standard intervals and nothing I can see is obviously wrong. Here is one of the event log messages from the very beginning of the issue at 6:00am:

The description for Event ID ( 1 ) in Source ( LGTO_Sync ) cannot be found. The local computer may not have the necessary registry information or message DLL files to display messages from a remote computer. You may be able to use the /AUXSOURCE= flag to retrieve this description; see Help and Support for details. The following information is part of the event: , Flush Completed.

This is of course not very helpful, but is the first indication of an issue in the system log. Then I get the following:

NTDS (412) NTDSA: An attempt to write to the file "C:\WINDOWS\NTDS\edb.log" at offset 3468288 (0x000000000034ec00) for 512 (0x00000200) bytes failed after 5 seconds with system error 1784 (0x000006f8): "The supplied user buffer is not valid for the requested operation. ".  The write operation will fail with error -1011 (0xfffffc0d).  If this error persists then the file may be damaged and may need to be restored from a previous backup.
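As a side note for anyone reading the message above, the decimal and hexadecimal values it quotes can be cross-checked with a few lines of Python. This is only a sketch: the string below is a shortened copy of the numeric fragments of the event text, and the helper names are made up for illustration. The only non-obvious step is that the negative error value is compared as a 32-bit two's complement number.

```python
import re

# Numeric fragments copied from the NTDS event text above.
EVENT_TEXT = (
    "at offset 3468288 (0x000000000034ec00) for 512 (0x00000200) bytes "
    "failed after 5 seconds with system error 1784 (0x000006f8). "
    "The write operation will fail with error -1011 (0xfffffc0d)."
)

def decimal_hex_pairs(text):
    """Return (decimal, hex) integer pairs for every 'N (0x...)' in text."""
    pattern = re.compile(r"(-?\d+) \((0x[0-9a-fA-F]+)\)")
    return [(int(dec), int(hx, 16)) for dec, hx in pattern.findall(text)]

def matches(decimal, hex_value):
    """True if hex_value encodes decimal (negatives as 32-bit two's complement)."""
    if decimal < 0:
        return (decimal & 0xFFFFFFFF) == hex_value
    return decimal == hex_value

pairs = decimal_hex_pairs(EVENT_TEXT)
for decimal, hex_value in pairs:
    print(decimal, hex(hex_value), matches(decimal, hex_value))
```

All four pairs agree, so -1011 and 0xfffffc0d are the same database-engine error, and 1784 (0x6f8) is the underlying system error to search for.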

This would appear to be a storage issue, but the server is a virtual machine sitting on a SAN (ESX Server 3.5). I have moved the entire virtual machine's storage to another volume and this has not helped. I have run Windows disk checks and they have come up fine. I have searched on the 10-15 different event log messages and they all seem to be generic messages resulting from the directory service being down, and nothing to do with specific causes. I have various KDC messages about the security account manager, but again this is no doubt due to AD being down.

Any help would be greatly appreciated as to what could be scheduled at 6:00am every day and could cause this DS error. It is way beyond my knowledge.
martin_babarik

Hello,

I'd try rebooting the server into Directory Services Restore Mode and moving the edb log file to another location using ntdsutil, just to see if this helps.
I also suggest using ntdsutil to run an integrity check on the database.

Martin
Keithburnham

ASKER

Thanks for that, I will try it now. With regard to ntdsutil and the integrity check, is that the 'semantic database analysis' option? I have never delved into these tools, I must admit.
Yes, you're right. To be honest, I'm not sure whether this has anything to do with the edb file, but it's not going to do any harm, so why not try it.
What should be more helpful is the following:
1. Reboot the DC into DSRM.
2. At a command prompt, type "ntdsutil" and then enter the following commands:
files
integrity

When the check completes, ntdsutil will suggest running the semantic checker. Please do so.

When done, check the results: whether things work better, or whether there are errors that need attention.

If the results are not satisfactory, try moving all the AD files to another folder (ideally on another volume). This is also done from ntdsutil, in the files menu, with the "move db to" and "move logs to" commands.
Martin
Thanks Martin. It looks like a few things to try.

The files integrity check came out fine, with the file output being empty and a passed remark on the screen.

The semantic output looks as follows:

Deleted object 4165 does not have a deletion time. Error 1004
Deleted object 8975 does not have a deletion time. Error 1004
Deleted object 10967 does not have a deletion time. Error 1004
Deleted object 10972 does not have a deletion time. Error 1004
Summary:
Active objects         9612
Phantoms               2
Deleted                   344
Security descriptor summary:
SD count:                925
Total SD size before single-instancing:                         7626 kb
Total SD size after single-instancing:                             928  kb


I'm not entirely sure whether using the 'go fix' option will have done anything, but of course this issue only occurs in the morning, so I can't trigger the error on demand to check. It looks like I will have to be patient. I think I will move the files anyway, just in case, and let you know if that has fixed it. Thanks again!
ASKER CERTIFIED SOLUTION
martin_babarik
This solution is only available to members.
Now that is interesting. I imagine that could be it, but I don't know of a snapshot task that occurs at that time. I will definitely look though. Good find.
The problem was the LGTO_Sync driver. I'd assumed that the snapshots were taken at 5am, but I had a GMT time zone mix-up, and a LUN snapshot was actually occurring at 6:00am (which triggered a VMware snapshot). My efforts were hindered by a problem with the group policy objects that sent me down a different path in the first place. Many thanks though, Martin.
Wow, you did a great job :-) Quite far from group policies :-)
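For anyone hitting a similar mix-up, the arithmetic can be sketched in Python. This is an illustration only: it assumes the snapshot schedule was defined as 05:00 GMT and that the server's local clock was one hour ahead of GMT (the thread does not state the offset). The date and the names below are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: server local time is one hour ahead of GMT/UTC.
LOCAL = timezone(timedelta(hours=1), "assumed local (UTC+1)")

def gmt_to_local(scheduled_gmt, local_tz):
    """Convert a GMT-scheduled time to the server's local time zone."""
    return scheduled_gmt.astimezone(local_tz)

# Arbitrary date; only the time of day matters for this illustration.
snapshot_gmt = datetime(2000, 1, 1, 5, 0, tzinfo=timezone.utc)
snapshot_local = gmt_to_local(snapshot_gmt, LOCAL)

print("Scheduled (GMT):  ", snapshot_gmt.strftime("%H:%M"))
print("Seen on server at:", snapshot_local.strftime("%H:%M"))
```

Under that assumption a job scheduled for 05:00 GMT shows up in the server's event log at 06:00, which is why comparing the schedule and the log in the same time zone matters.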
Thank you for the points.

All the best
Martin