Userenv event errors 1058 and 1030 causing primary domain controller to crash

We are running Windows Server 2003.

We are getting userenv error 1058:

Windows cannot access the file gpt.ini for GPO cn={4CB2BC94-186C-4D1B-A557-0E04488514CB},cn=policies,cn=system,DC=cvn75,DC=navy,DC=mil. The file must be present at the location <\\DomainName\sysvol\DomainName\Policies\{4CB2BC94-186C-4D1B-A557-0E04488514CB}\gpt.ini>. (Access is denied. ). Group Policy processing aborted.

And userenv error 1030:

Windows cannot query for the list of Group Policy objects. Check the event log for possible messages previously logged by the policy engine that describes the reason for this.
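
One way to confirm the "Access is denied" part of event 1058 is to try opening the same gpt.ini from the affected machine. The following is only a minimal Python sketch: the helper names `gpt_ini_path` and `can_read` are made up for illustration, and it assumes the domain's DNS name can be derived from the DC= components of the GPO's distinguished name.

```python
import re


def gpt_ini_path(gpo_dn):
    """Build the SYSVOL path to gpt.ini from a GPO distinguished name.

    Assumption: the DNS domain name is the DC= components joined with dots.
    """
    guid = re.search(r"cn=(\{[0-9A-Fa-f-]+\})", gpo_dn, re.IGNORECASE)
    if guid is None:
        raise ValueError("no GPO GUID found in: " + gpo_dn)
    # DC=cvn75,DC=navy,DC=mil -> cvn75.navy.mil
    domain = ".".join(re.findall(r"DC=([^,]+)", gpo_dn, re.IGNORECASE))
    if not domain:
        raise ValueError("no DC= components found in: " + gpo_dn)
    return "\\\\{0}\\sysvol\\{0}\\Policies\\{1}\\gpt.ini".format(
        domain, guid.group(1)
    )


def can_read(path):
    """Try to read the file; return (True, '') or (False, error text)."""
    try:
        with open(path, "r") as handle:
            handle.read()
        return True, ""
    except OSError as exc:
        return False, str(exc)


if __name__ == "__main__":
    dn = ("cn={4CB2BC94-186C-4D1B-A557-0E04488514CB},cn=policies,"
          "cn=system,DC=cvn75,DC=navy,DC=mil")
    path = gpt_ini_path(dn)
    print(path, can_read(path))
```

If the read fails while the errors are being logged but succeeds after a reboot, that points at the server side (SYSVOL share or the Server service) rather than at the policy itself.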

We get these errors many times, and our primary DC is cut off from the network. Roaming profiles are stored on this server, so this creates MANY problems. It doesn't happen very often, but it would help to know how to fix it.

-thank you-
Asked by Josef Al-Chacar, Systems Administrator
I actually wrote an article on how to troubleshoot this and fix it. The article needs some editing. I hope this helps:
If your server is cut off from the network entirely for no apparent reason, it may be a Conficker issue, so check your workstations for this threat, which attacks the network components of several Windows versions.
Josef Al-Chacar, Systems Administrator (Author) commented:
It could be, but I'm also a trouble-call tech and I've never seen this issue on a workstation. I really don't think it's any type of worm.
In my experience, an attack from Conficker-class variants does different things depending on which of 3 states the OS is in:

1) The OS is completely vulnerable: it is infected transparently and sends copies of the worm via SMTP.

2) The OS is partially resistant: the hack attempt against the IP component makes the process crash and the system is cut off from the network. You can tell because the service called "Server" is down.

3) The OS is fully resistant, and only gets infected when the Windows Firewall is intentionally disabled.

This is just a hint. When you get the trouble, check the state of the service processes and analyze the server performance for more clues.
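
The check on the "Server" service can be scripted. This is a rough Python sketch, not a tested tool: it assumes the service's internal name is `lanmanserver`, that Windows' `sc query` command is available, and that its output contains a `STATE : <number> <NAME>` line. The function names and the host name `DC1` are made up for illustration.

```python
import re
import subprocess


def parse_sc_state(sc_output):
    """Return the state name (e.g. 'RUNNING') from `sc query` output, or None."""
    match = re.search(r"STATE\s*:\s*\d+\s+([A-Z_]+)", sc_output)
    return match.group(1) if match else None


def sc_query_command(service="lanmanserver", host=None):
    """Build the `sc query` command line, optionally against a remote host."""
    command = ["sc"]
    if host:
        command.append("\\\\" + host)  # sc expects \\HOST for remote queries
    command += ["query", service]
    return command


def server_service_state(service="lanmanserver", host=None):
    """Run `sc query` (Windows only) and return the parsed state name."""
    result = subprocess.run(
        sc_query_command(service, host), capture_output=True, text=True
    )
    return parse_sc_state(result.stdout)


if __name__ == "__main__":
    # "DC1" is a placeholder host name.
    print(server_service_state(host="DC1"))
```

Anything other than RUNNING at the moment the DC drops off the network would support the second case above. Once the server is unreachable a remote query will fail too, so the check is most useful run locally on the DC or on a schedule before the failure.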
1058 and 1030 errors are usually due to resource exhaustion (whether caused by a virus or some other problem). Use Task Manager or Performance Monitor to look at handle consumption and free system PTEs. If handle consumption is high, find the process that is consuming the handles and consider reinstalling that application if possible. If your free PTEs are low (<5000), then you should look at tuning your memory.

Here are some additional questions that will help us resolve your 1058/1030 issues:
What version (x64/x86) of Windows are you running?  
Are you using the /3GB switch?  
What other services are running on this box?
How frequently do these errors occur?
Josef Al-Chacar, Systems Administrator (Author) commented:
We have x86.

As far as I know we have a 1 GB switch.

AD and Symantec AV (a prime cause of many of our problems). Roaming profiles are stored here.
The error only occurs about once every 2 months, but we end up having to reboot the primary DC to fix the problem.

I'm in the military so any system downtime is critical.
It definitely sounds like a resource leak and not a viral issue. Use the steps I identified earlier to check handle and PTE use. Since it is a slow leak, you'll have to monitor it over a couple of days to find the source. By chance, are you running HP OpenView or Hercules?
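
To spot a slow leak in a couple of days of samples, one option is to fit a straight line to each process's handle count and rank the processes by growth rate. This is just a sketch in Python using an ordinary least-squares slope. The process names and numbers are invented, and a steadily positive slope only makes a process a suspect, not a confirmed leak.

```python
def slope_per_hour(samples):
    """Least-squares slope of (hours_elapsed, handle_count) samples."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_h = sum(h for _, h in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one point in time")
    cov = sum((t - mean_t) * (h - mean_h) for t, h in samples)
    return cov / var_t


def rank_leak_suspects(history):
    """Return [(handles gained per hour, process name)], fastest growth first."""
    ranked = [(slope_per_hour(s), name) for name, s in history.items()]
    ranked.sort(reverse=True)
    return ranked


if __name__ == "__main__":
    # Hypothetical samples taken every 12 hours.
    history = {
        "service_a": [(0, 1000), (12, 1600), (24, 2200)],
        "service_b": [(0, 500), (12, 510), (24, 500)],
    }
    for rate, name in rank_leak_suspects(history):
        print("%-10s %+.1f handles/hour" % (name, rate))
```

A process that gains handles at a steady rate across reboots of its own service is the one to examine first with a leak-tracking tool.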
I agree that it's very possibly a resource leak.

On the other hand, a reinstall is not likely to solve the issue, since leaking software leaks because of a design flaw. If you identify a process that progressively hogs resources, you may want to consider a clustered failover implementation with scheduled downtime on each node, which will likely require replacing or upgrading that application.

For any critical service, it's almost mandatory to use some form of clustering or cloud computing.
Scheduled downtime is a tradition from pre-cloud application services; there is now something better, so use it.

In my experience with leaking services, the entire system reboots when it can no longer allocate resources to even the most critical Windows components. I cannot simply change the service because it is mandated by the corporation, so I only restart at scheduled intervals and wait for headquarters to send me newer (hopefully corrected) software. These things are most commonly associated with software built with old techniques and/or compilers, and the developers, if you have direct contact with them, most of the time don't want to acknowledge that their software is buggy. And if they do recognize a bug, they will only offer you their new version at its corresponding upgrade cost, with no guarantees.
Josef Al-Chacar, Systems Administrator (Author) commented:
Thank you both.

I will look into this tonight. I'll let you know what I find out. I have debugdiag.exe, which tests for leaky services, so I'll take a look.

It may be a while before I respond. I work the night shift.
Question has a verified solution.