peterjwest

Nonpaged Pool on 2003 Server is empty.

Hello,

We have recently begun to have problems with our 2003 Server DC. The server had been working fine, but over the past week or so we have had to forcibly reset it nearly every day.

The problem is first noticed when our Insight Management System reports that the server is no longer responding to pings. We then visit the server to find that the machine appears to be booted - but when we try to unlock the console it reports that it is unable to do so because insufficient resources are available. In the end the only ways to reboot the server are:

i) Forcefully reset using the power button.
ii) Use the ILO to remotely connect and force a warm reboot.

Upon rebooting the server and checking the Event Log we find that the following error has been logged many, many times prior to the machine becoming non-responsive:

Source: Srv
Category: None
Event ID: 2019
Type: Error
Description: The server was unable to allocate from the system nonpaged pool because the pool was empty.

As an example, the first of these errors was recorded at 02:36AM this morning. I then received an error from the Insight Manager at 02:43AM saying that the server was no longer responding. Looking further on through the event log we then see these errors:

Source: NAVAP
Category: None
Event ID: 1001
Type: Warning
Description: System memory is running very low.  Norton AntiVirus Realtime Protection may not be able to function properly.

Source: Application Popup
Category: None
Event ID: 333
Type: Error
Description: An I/O operation initiated by the Registry failed unrecoverably. The Registry could not read in, or write out, or flush, one of the files that contain the system's image of the Registry.

However, I believe that both of these errors are symptoms of the lack of memory on the system, so if we can resolve the nonpaged pool problem then we shouldn't see these messages again. The message from Srv repeats itself every 60 seconds, whilst the one raised by Application Popup occurs every 20 seconds or so.

Of course it seems that something is consuming the nonpaged pool, but the strange thing is that this issue always seems to occur at the same time in the middle of the night. Here are a few of the things we have done.

1. Disabled the Volume Shadow Copy and Microsoft Software Shadow Copy Provider Services.  This is in relation to a KB I read about backup software causing problems with the VSC Service which resulted in a memory leak which consumed the non-paged pool.

2. Disabled the backup and all ArcServe related Services (using BrightStor 11 IIRC).

3. Set up PerfMon to collect statistics from Memory\Pool Nonpaged Allocs and Memory\Pool Nonpaged Bytes. Also set up monitoring of Process\Pool Nonpaged Bytes. These counters when graphed in PerfMon don't show anything of interest - but I'm using selected processes, so if the process causing the problem is spawned at night then it could be that PerfMon is missing it.

4. Created a scheduled task to run PoolMon and dump the output to file every 15 minutes. It seems that a tag called THRE is taking up a significant amount of nonpaged pool space and it gradually increases in size throughout the day. Just prior to the problem this morning the tag had a byte size which equated to 202MB.
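
For anyone wanting to compare the PoolMon dumps from step 4 without reading them by eye, here is a rough Python sketch. It assumes each dump has rows laid out roughly as Tag, Type, Allocs, Frees, Diff, Bytes, Per Alloc, with optional parenthesised deltas; the real layout may differ between PoolMon versions. The two snapshots, the function names and the 256MB ceiling passed to the estimate are all illustrative assumptions, not values taken from this server.

```python
import re

# Made-up snapshots in an assumed PoolMon-like layout (illustrative values only).
MORNING = """\
 Tag  Type     Allocs            Frees            Diff   Bytes       Per Alloc
 THRE Nonp     100000 (   0)     20000 (   0)    80000  50000000 (     0)      625
 File Nonp     500000 (   0)    490000 (   0)    10000   1600000 (     0)      160
"""
NIGHT = """\
 Tag  Type     Allocs            Frees            Diff   Bytes       Per Alloc
 THRE Nonp     400000 (   0)     60000 (   0)   340000 212500000 (     0)      625
 File Nonp     510000 (   0)    499500 (   0)    10500   1680000 (     0)      160
"""


def parse_snapshot(text):
    """Return {tag: bytes} for the nonpaged rows of one snapshot."""
    usage = {}
    for line in text.splitlines():
        # Drop the parenthesised per-interval deltas, e.g. "(   0)".
        fields = re.sub(r"\([^)]*\)", " ", line).split()
        # Assumed column order: Tag Type Allocs Frees Diff Bytes PerAlloc
        if len(fields) == 7 and fields[1] == "Nonp" and fields[5].isdigit():
            usage[fields[0]] = int(fields[5])
    return usage


def fastest_growing(first, last):
    """Return (tag, growth_in_bytes) for the tag that grew most between snapshots."""
    growth = {tag: last[tag] - first.get(tag, 0) for tag in last}
    tag = max(growth, key=growth.get)
    return tag, growth[tag]


def hours_until_full(total_now, growth_bytes, hours_elapsed, limit_bytes):
    """Linear extrapolation of when the pool hits limit_bytes; None if not growing."""
    if growth_bytes <= 0:
        return None
    rate_per_hour = growth_bytes / hours_elapsed
    return (limit_bytes - total_now) / rate_per_hour


if __name__ == "__main__":
    first, last = parse_snapshot(MORNING), parse_snapshot(NIGHT)
    tag, growth = fastest_growing(first, last)
    # 256MB is an assumed nonpaged pool ceiling; check the real limit for the box.
    eta = hours_until_full(sum(last.values()), growth, 16, 256 * 1024 * 1024)
    print(tag, growth, eta)
```

The extrapolation is deliberately crude (straight line between two dumps), but a steady climb in one tag followed by exhaustion at a similar hour each night would be consistent with a leak that simply takes about a day to fill the pool.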

I'm running out of ideas so any suggestions which can be offered would be greatly appreciated.

Pete
sofestibeest

Hi,

It could be a memory leak.
Memory leaks can have this effect on any machine.

Check what updates or apps were last installed.
Perhaps an update is causing the problem.

Second, you can use msconfig and find the tag and set it to prevent starting during bootup.

Good luck

peterjwest

ASKER

I've ended up using MemTriage to diagnose the problem and it looks like the issue relates to a process called HPBPRO.EXE, which belongs to an HP printer driver.

We have now obtained a fix from the HP website. Initially we were seeing a lot of DCOM errors in the system log, which pointed to one of our technical staff who also had the driver installed locally. He has now removed the driver, and instead of seeing a long-lived instance of the process we see it appear for a short period and then vanish again.

We are now seeing that the process consumes very little memory and the problem with it creating huge numbers of threads has also ceased.
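
In case it helps anyone checking for the same thing, here is a rough Python sketch for spotting a process whose thread count keeps climbing. It assumes the counters were logged to CSV with a wildcard so that processes spawned overnight are included, for example with something like typeperf "\Process(*)\Thread Count" -si 900 -f CSV -o threads.csv, and that the file has the usual PDH-CSV shape of one header row of counter paths followed by one row per sample. The host name DC01, the timestamps and the numbers in the sample are invented for illustration.

```python
import csv
import io
import re

# Invented sample in an assumed PDH-CSV layout (host, times and values are illustrative).
SAMPLE_CSV = r'''"(PDH-CSV 4.0)","\\DC01\Process(_Total)\Thread Count","\\DC01\Process(HPBPRO)\Thread Count","\\DC01\Process(lsass)\Thread Count"
"01/01/2007 09:00:00.000","700","100","60"
"01/01/2007 09:15:00.000","2602","2000","62"
'''

# Pulls the process instance name out of a counter path.
INSTANCE = re.compile(r"\\Process\((?P<name>[^)]+)\)\\Thread Count$")


def to_float(value):
    """Convert one CSV cell to a float, or None for blank/unreadable samples."""
    try:
        return float(value)
    except ValueError:
        return None


def thread_growth(csv_text):
    """Return [(process, last - first thread count)] sorted by largest growth first."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, samples = rows[0], rows[1:]
    growth = {}
    for col, path in enumerate(header):
        match = INSTANCE.search(path)
        if not match or match.group("name") == "_Total":
            continue
        values = [to_float(row[col]) for row in samples if col < len(row)]
        values = [v for v in values if v is not None]
        if len(values) >= 2:
            growth[match.group("name")] = values[-1] - values[0]
    return sorted(growth.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, delta in thread_growth(SAMPLE_CSV):
        print(name, delta)
```

THRE is generally listed as the pool tag for kernel thread objects, so a process at the top of this list would be a reasonable first suspect when that tag is the one growing.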

I'll leave this question open for a while longer just to confirm the issue has definitely been resolved - but at the moment it's looking hopeful.
ASKER CERTIFIED SOLUTION
DarthMod