AnthonyWald

asked on

Microsoft Windows Server 2003 R2 Server crash - Error code 00000020, parameter1 00000000, parameter2 0000ffff, parameter3 00000000, parameter4 00000001

I have a Win 2k3 R2 box that keeps writing a crash minidump and rebooting. The server is running Terminal Services with Citrix Presentation Server 4.0.
Does anyone have a fix for this? We have already tested the RAM using Memtest86, which came back clean.
The minidump analysis is below:

Microsoft (R) Windows Debugger Version 6.9.0003.113 X86
Copyright (c) Microsoft Corporation. All rights reserved.


Loading Dump File [\\citrix1\u$\WINDOWS\Minidump\Mini020609-02.dmp]
Mini Kernel Dump File: Only registers and stack trace are available

Symbol search path is: C:\Windows\Symbols
Executable search path is:
Unable to load image ntoskrnl.exe, Win32 error 0n2
*** WARNING: Unable to verify timestamp for ntoskrnl.exe
Windows Server 2003 Kernel Version 3790 (Service Pack 2) MP (8 procs) Free x86 compatible
Product: Server, suite: TerminalServer
Kernel base = 0x80800000 PsLoadedModuleList = 0x808af9c8
Debug session time: Fri Feb  6 14:37:51.138 2009 (GMT+9)
System Uptime: 0 days 0:52:53.671
Unable to load image ntoskrnl.exe, Win32 error 0n2
*** WARNING: Unable to verify timestamp for ntoskrnl.exe
Loading Kernel Symbols
..........................................................................................................................
Loading User Symbols
Loading unloaded module list
...
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck 20, {0, ffff, 0, 1}

Probably caused by : memory_corruption ( nt!MiRemoveUnusedSegments+18 )

Followup: MachineOwner
---------

5: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

KERNEL_APC_PENDING_DURING_EXIT (20)
The key data item is the thread's APC disable count.
If this is non-zero, then this is the source of the problem.
The APC disable count is decremented each time a driver calls
KeEnterCriticalRegion, KeInitializeMutex, or FsRtlEnterFileSystem.  The APC
disable count is incremented each time a driver calls KeLeaveCriticalRegion,
KeReleaseMutex, or FsRtlExitFileSystem.  Since these calls should always be in
pairs, this value should be zero when a thread exits.  A negative value
indicates that a driver has disabled APC calls without re-enabling them.  A
positive value indicates that the reverse is true.
If you ever see this error, be very suspicious of all drivers installed on the
machine -- especially unusual or non-standard drivers.  Third party file
system redirectors are especially suspicious since they do not generally
receive the heavy duty testing that NTFS, FAT, RDR, etc receive.
This current IRQL should also be 0.  If it is not, that a driver's
cancelation routine can cause this bugcheck by returning at an elevated
IRQL.  Always attempt to note what you were doing/closing at the
time of the crash, and note all of the installed drivers at the time of
the crash.  This symptom is usually a severe bug in a third party
driver.
Arguments:
Arg1: 00000000, The address of the APC found pending during exit.
Arg2: 0000ffff, The thread's APC disable count
Arg3: 00000000, The current IRQL
Arg4: 00000001

Debugging Details:
------------------


BUGCHECK_STR:  0x20_NULLAPC_KAPC_NEGATIVE

CUSTOMER_CRASH_COUNT:  2

DEFAULT_BUCKET_ID:  DRIVER_FAULT_SERVER_MINIDUMP

PROCESS_NAME:  WINWORD.EXE

CURRENT_IRQL:  0

LAST_CONTROL_TRANSFER:  from 80967a84 to 8087c4a0

STACK_TEXT:  
b99e5c80 80967a84 00000020 00000000 0000ffff nt!MiRemoveUnusedSegments+0x18
b99e5d18 809206e4 00000000 00000000 869fe020 nt!ExpAllocateHandleTableEntry+0xc7
b99e5d30 8091f92c 869fe020 00000000 00000001 nt!CmpMapCmView+0x192
b99e5d54 80833bef 00000000 00000000 05f0ff78 nt!NtQueryInformationToken+0x11de
b99e5d64 7c8285ec badb0d00 05f0ff68 00000000 nt!MmAccessFault+0x7a0
WARNING: Frame IP not in any known module. Following frames may be wrong.
b99e5d78 00000000 00000000 00000000 00000000 0x7c8285ec


STACK_COMMAND:  kb

FOLLOWUP_IP:
nt!MiRemoveUnusedSegments+18
8087c4a0 5d              pop     ebp

SYMBOL_STACK_INDEX:  0

SYMBOL_NAME:  nt!MiRemoveUnusedSegments+18

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: nt

DEBUG_FLR_IMAGE_TIMESTAMP:  48a2bc85

IMAGE_NAME:  memory_corruption

FAILURE_BUCKET_ID:  0x20_NULLAPC_KAPC_NEGATIVE_nt!MiRemoveUnusedSegments+18

BUCKET_ID:  0x20_NULLAPC_KAPC_NEGATIVE_nt!MiRemoveUnusedSegments+18

Followup: MachineOwner
---------
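
For reference, Arg2 (0000ffff) in the analysis above can be read as a 16-bit two's-complement number. The snippet below is only a minimal sketch, assuming the thread's APC disable count is a signed 16-bit field; that width is an assumption, although the KAPC_NEGATIVE bucket name in the dump is consistent with it. The helper name `as_signed16` is made up for illustration.

```python
def as_signed16(value):
    """Interpret the low 16 bits of value as a two's-complement integer."""
    value &= 0xFFFF
    # If the sign bit (bit 15) is set, the value is negative.
    return value - 0x10000 if value & 0x8000 else value


arg2 = 0x0000FFFF  # Arg2 from the bugcheck: the thread's APC disable count
print(as_signed16(arg2))  # prints -1
```

Read this way the count is -1, which by the bugcheck text means APCs were disabled once without being re-enabled before the thread exited.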


Any answers would be greatly appreciated.
Thanks
Anthony

Abhay Pujari

Have you tried server repair?
AnthonyWald

ASKER

Hi abhvp,
I am not sure what you mean by server repair?
Boot the server from the CD, then proceed as for a normal installation. Accept the agreement, and setup will search for existing Windows installations. On that screen, press R. Which server are you using?
Also, the dump states that winword.exe is causing the problem. You can remove this program from the Recovery Console.
I didn't realise that was what you were suggesting. I understand the repair from CD, but it is not something I would actually do here. This is a Windows Server 2003 machine with Citrix running on it, and I would be very nervous about doing that to a production server. I would expect it to break the setup of various software on the server, and I have known it not to give the desired results in the past.
You did, however, give me an idea: I put the Citrix server into install mode and ran Detect and Repair within Microsoft Word. That may fix it.
Sounds good. Try it and post results here.
We still have the issue after the Detect and Repair I ran last week. We had another minidump and a reboot.

The minidump is identical to the one above.
We still have this issue.

Does anyone have any answers?
Have you checked memory?
We have already run a Memtest86 on the memory and it came back clean.
Well, then memory is not the problem. What events are logged in Event Viewer under the System log? Have you tried running System File Checker or even chkdsk?
I have run checkdisk and it came back with no errors.

I have also proved it is definitely not hardware related, as the physical machine has just been virtualised.
If it is not disk or memory, then try checking third-party drivers and any updates installed around the date this problem started. Your minidump is not pointing to anything specific, so we need to rule out each possibility in turn to find the root cause.
It does not look like it is drivers or updates. The error above talks about WINWORD.EXE.

I think I may have found the issue. The server has not crashed for 8 days straight. We started running a little script to log what processes everyone on the server was running. We found a small correlation between a certain person using the Citrix desktop on the server and when it crashed.
I totally reset and wiped her profile on the server and we have not had a crash for 8 days.

I am still keeping an eye on the server. If there are no crashes by this time next week, I will be happy that the problem is fixed.
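
For anyone wanting to try the same approach, here is a rough sketch of that kind of logging script. It is not the script we actually ran. It assumes Python is available on the server, that `tasklist /v /fo csv` is run on an English-locale system (so the CSV headers are "Image Name" and "User Name"), and that you note the crash times yourself. The function names and the 10-minute window are made up for illustration.

```python
import csv
import io
import subprocess
from collections import Counter
from datetime import datetime, timedelta


def parse_tasklist_csv(text, when):
    """Turn `tasklist /v /fo csv` output into (timestamp, user, image) rows."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        rows.append((when, rec["User Name"], rec["Image Name"]))
    return rows


def snapshot():
    """Run tasklist once and return the parsed rows (Windows only)."""
    out = subprocess.run(
        ["tasklist", "/v", "/fo", "csv"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_tasklist_csv(out, datetime.now())


def users_before_crashes(rows, crash_times, window=timedelta(minutes=10)):
    """For each user, count the crashes that had that user active shortly before."""
    counts = Counter()
    for crash in crash_times:
        # Collect each user once per crash, however many processes they ran.
        active = {user for (when, user, _image) in rows
                  if crash - window <= when <= crash}
        counts.update(active)
    return counts
```

Calling `snapshot()` every few minutes from a scheduled task and appending the rows to a file gives the history. `users_before_crashes` then shows which accounts were active before each reboot, although a high count is only a hint, not proof.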
ASKER CERTIFIED SOLUTION
AnthonyWald
