Server freeze when corrupt user profile attempts to load

We've got a farm of Citrix Presentation Server 4.0 servers running on W2K3 EE with no SP. After a power failure, it seems that several ntuser.dat files in the users' roaming profiles (stored on a NAS) have become corrupt (I've used regrecover to show that the subkeys in the user hives are scrambled, with only a few still readable). We've observed that on many occasions, when an affected user attempts to log on to a server, the server becomes unresponsive. No further RDP/ICA or RPC exec connections can be made, but it responds to ping and you can still browse the c$ share on it. A brief kernel dump analysis shows registry resource contention (winlogon, but no sub-process information). I was wondering if anybody has come across this before and if there is a way to determine which process is causing the server to freeze.
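
As a first triage step, the roaming profile share could be scanned for hives whose header is visibly damaged before those users log on. The following is only a rough sketch. It assumes the commonly described (reverse-engineered, not officially documented) layout of the hive base block: a "regf" signature at offset 0, two sequence numbers at offsets 4 and 8 that match after a clean flush, and an XOR checksum of the first 508 bytes stored at offset 508. The function names are hypothetical. A hive that passes can still have scrambled subkeys, as in the case described above, so this only catches a subset of corruption.

```python
import struct

HEADER_SIZE = 512  # only the first 512 bytes of the base block are inspected


def header_checksum(header):
    """XOR the first 508 bytes as 127 little-endian 32-bit words."""
    checksum = 0
    for (word,) in struct.iter_unpack("<I", header[:508]):
        checksum ^= word
    # Assumed special cases from public format descriptions.
    if checksum == 0xFFFFFFFF:
        return 0xFFFFFFFE
    if checksum == 0:
        return 1
    return checksum


def check_hive_header(header):
    """Return a list of problems found in a hive header (empty if none)."""
    if len(header) < HEADER_SIZE:
        return ["file shorter than 512 bytes"]
    problems = []
    if header[:4] != b"regf":
        problems.append("missing 'regf' signature")
    seq1, seq2 = struct.unpack_from("<II", header, 4)
    if seq1 != seq2:
        problems.append("sequence numbers differ")
    (stored,) = struct.unpack_from("<I", header, 508)
    if stored != header_checksum(header):
        problems.append("header checksum mismatch")
    return problems


def check_hive_file(path):
    """Read the header of one ntuser.dat file and check it."""
    with open(path, "rb") as f:
        return check_hive_header(f.read(HEADER_SIZE))
```

Calling check_hive_file on each ntuser.dat under the profile share (the path depends on your environment) gives a list of suspect profiles that can be reset before they are loaded on a production server.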

ee_auto Commented:
Question PAQ'd, 500 points refunded, and stored in the solution database.
If you know the profiles are corrupt, does creating a new profile resolve the issue?
croitoru (Author) Commented:
Hi, thanks - of course it does. The point here is root cause analysis and identifying which specific patch needs to be deployed to prevent a server falling over because of the corrupt profiles (which might become corrupt again) WITHOUT slamming a full SP into the environment.

Cláudio Rodrigues (Founder and CEO) Commented:
Your best answer is to simply move away from Roaming Profiles. It is a best practice on Citrix/TS to use a hybrid solution (Mandatory profiles combined with a mechanism to save/restore user settings/preferences, so you get the best of both worlds). There are many ways to achieve this, from the freeware FlexProfiles to third-party commercial add-ons like SimplifyProfiles, Managed Profiles, Sepago Profiles, etc.
Honestly, there is no way to fix your issue. It will happen again; it is just a matter of time.
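
To give an idea of what the save/restore part of a hybrid setup can look like, here is a minimal sketch that builds `reg export` and `reg import` command lines for a chosen set of per-user keys. It is not how FlexProfiles or any of the commercial products work internally. The key name, the home-drive path and the function names are made up for illustration, and the commands are only built, not executed.

```python
import ntpath


def build_export_commands(keys, dest_dir):
    """Build one 'reg export' command per key, for use at logoff."""
    commands = []
    for key in keys:
        # Flatten the key path into a file name, e.g. HKCU_Software_Example.reg
        filename = key.replace("\\", "_") + ".reg"
        commands.append(["reg", "export", key, ntpath.join(dest_dir, filename)])
    return commands


def build_import_commands(reg_files):
    """Build one 'reg import' command per saved file, for use at logon."""
    return [["reg", "import", path] for path in reg_files]


# Hypothetical key and destination, purely for illustration.
for cmd in build_export_commands([r"HKCU\Software\Example"], r"H:\settings"):
    print(" ".join(cmd))
```

A logoff script would run the export commands and a logon script the import commands, so the mandatory profile stays read-only while the selected settings follow the user.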

Claudio Rodrigues
Microsoft MVP
Windows Server - Terminal Services
Give Claudio the points

croitoru (Author) Commented:
Hi Claudio, thank you for your comment. I've been following your internet posts for many years and have always found them very useful. Unfortunately, this environment was not set up by me, and I am well aware of the downsides. The point of this exercise is to identify the faulting module (the one that is not handling the corrupt registry key) and go back to Citrix or MS with the fault, asking for a fix. I've attempted to reproduce the problem on SP2-patched W2K3 boxes and am unable to, hence I believe there is a pre-SP1 or SP2 fix for it.
I attach the trace of a full mem dump generated with the NMI button in the lab (after loading the corrupt profile and locking the box).
What I am after again, is a way to determine which module/dll/process locks up the box.
Many thanks
BugCheck 50, {888dfff3, 1, 8050c534, 0}
*** ERROR: Module load completed but symbols could not be loaded for CtxSbx.sys
*** ERROR: Module load completed but symbols could not be loaded for CtxAltStr.sys
Probably caused by : CtxSbx.sys ( CtxSbx+376a )
Followup: MachineOwner
WRITE_ADDRESS:  888dfff3 
!analyze -v
8050c534 ff1500f14d80    call    dword ptr [nt!_imp_KeReleaseQueuedSpinLock (804df100)]
PROCESS_NAME:  services.exe
TRAP_FRAME:  b970c7bc -- (.trap 0xffffffffb970c7bc)
ErrCode = 00000002
eax=f82dc0e1 ebx=00000000 ecx=000000e1 edx=00000003 esi=d6664578 edi=824df3b8
eip=8050c534 esp=b970c830 ebp=b970ca14 iopl=0         nv up ei pl nz na po nc
cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000             efl=00010202
8050c534 ff1500f14d80    call    dword ptr [nt!_imp_KeReleaseQueuedSpinLock (804df100)] ds:0023:804df100={hal!KeReleaseQueuedSpinLock (807490a0)}
Resetting default scope
LAST_CONTROL_TRANSFER:  from 8052fa2f to 80543ac9
b970c754 8052fa2f 00000050 888dfff3 00000001 nt!KeBugCheckEx+0x19
b970c7a4 804e2dfc 00000001 888dfff3 00000000 nt!MmAccessFault+0x796
b970c7a4 8050c534 00000001 888dfff3 00000000 nt!KiTrap0E+0xc8
b970ca14 f82dc7c1 b970ca30 817c85b0 817c8764 nt!CcSetActiveVacb+0x10a
b970cb84 804f0473 8266b020 817c85b0 817c85b0 Ntfs!NtfsFsdCleanup+0xcf
b970cb94 b9ba776a 00000000 82344648 b970cbe8 nt!IofCallDriver+0x3f
WARNING: Stack unwind information not available. Following frames may be wrong.
b970cba4 b9ba4621 82570020 817c85b0 804f0473 CtxSbx+0x376a
b970cbe8 ba12597c 82491ac0 007c85b0 804f0473 CtxSbx+0x621
b970cc6c 804f0473 82337ea8 817c85b0 817c85b0 CtxAltStr+0x397c
b970cc7c 8058edff 824df3a0 826ceca0 00000001 nt!IofCallDriver+0x3f
b970ccac 80586b89 8252c880 82337ea8 00020000 nt!IopCloseFile+0x27c
b970ccdc 80586ca6 8252c880 824df3b8 826ceca0 nt!ObpDecrementHandleCount+0x121
b970cd04 80586d24 d67def58 824df3b8 00000210 nt!ObpCloseHandleTableEntry+0x12f
b970cd4c 80586d87 00000210 00000001 804dfd24 nt!ObpCloseHandle+0x80
b970cd58 804dfd24 00000210 00000000 00000000 nt!NtClose+0x17
b970cd58 7ffe0304 00000210 00000000 00000000 nt!KiSystemService+0xd0
0119f230 77f4139c 77e48aec 00000210 757c57c6 SharedUserData!SystemCallStub+0x4
0119f234 77e48aec 00000210 757c57c6 00000210 ntdll!ZwClose+0xc
0119f23c 757c57c6 00000210 0017d210 77f50a87 kernel32!CloseHandle+0x55
0119f258 757cb1e4 0017d210 00000001 00000001 SCESRV!ScepGetNamedSecurityInfo+0x89
0119f2ac 757cc4c7 0017d210 016dc6e8 00000000 SCESRV!ScepGetNewSecurity+0x111
0119f740 757cc825 016db6f8 00000004 016dc6e8 SCESRV!ScepSetSecurityOverwriteExplicit+0x413
0119f778 757ccd4a 016d8568 00000001 00000ab0 SCESRV!ScepConfigureOneSubTreeFile+0x1e1
0119f7a8 757cce39 00000000 00000001 00000ab0 SCESRV!ScepConfigureObjectTree+0x18a
0119f7d8 757cce39 00000000 00130e60 00000ab0 SCESRV!ScepConfigureObjectTree+0x279
0119f808 757ba4fc 00000000 000a46c0 00000ab0 SCESRV!ScepConfigureObjectTree+0x279
0119f860 757c1463 00158018 00000040 00000001 SCESRV!ScepConfigureObjectSecurity+0x320
0119f8c8 757acd4f 01718128 001379b0 02050112 SCESRV!ScepConfigureSystem+0x37c
0119f928 77c57b40 00159958 01718128 00000000 SCESRV!SceRpcConfigureSystem+0x25d
0119f960 77ce50b8 757acae8 0119fb2c 00000009 RPCRT4!Invoke+0x30
0119fd48 77ce53c7 00000000 00000000 001599cc RPCRT4!NdrStubCall2+0x269
0119fd64 77c57557 001599cc 000a13c8 001599cc RPCRT4!NdrServerCall2+0x17
0119fd98 77c57419 757d8087 001599cc 0119fe38 RPCRT4!DispatchToStubInCNoAvrf+0x38
0119fdec 77c574d9 00000005 00000000 757e229c RPCRT4!RPC_INTERFACE::DispatchToStubWorker+0x112
0119fe10 77c5ef56 001599cc 00000000 757e229c RPCRT4!RPC_INTERFACE::DispatchToStub+0xa1
0119fe44 77c5f05e 00159990 01718118 00159958 RPCRT4!OSF_SCALL::DispatchHelper+0x144
0119fe58 77c5ee01 00000000 77c50005 00000001 RPCRT4!OSF_SCALL::DispatchRPCCall+0xfc
0119fe90 77c5eded 01718100 03000b38 00000000 RPCRT4!OSF_SCALL::ProcessReceivedPDU+0x5df
0119feb0 77c5edd7 01718100 00000b38 77e48774 RPCRT4!OSF_SCALL::BeginRpcCall+0x21e
0119ff10 77c5e2a0 00000000 01718100 00000b38 RPCRT4!OSF_SCONNECTION::ProcessReceiveComplete+0x473
0119ff20 77c5dd5e 0009dce0 0000000c 00000000 RPCRT4!ProcessConnectionServerReceivedEvent+0x20
0119ff8c 77c58379 77c88bb1 0009dce0 00000000 RPCRT4!LOADABLE_TRANSPORT::ProcessIOEvents+0x1b6
0119ff90 77c88bb1 0009dce0 00000000 00000000 RPCRT4!ProcessIOEventsWrapper+0x9
0119ffb0 77c88c97 0009d760 77e53bd4 000dfaa0 RPCRT4!BaseCachedThreadRoutine+0x9c
0119ffb8 77e53bd4 000dfaa0 00000000 00000000 RPCRT4!ThreadStartRoutine+0x17
0119ffec 00000000 77c88c80 000dfaa0 00000000 kernel32!BaseThreadStart+0x34
b9ba776a eb03            jmp     CtxSbx+0x376f (b9ba776f)
SYMBOL_NAME:  CtxSbx+376a
FOLLOWUP_NAME:  MachineOwner
IMAGE_NAME:  CtxSbx.sys
FAILURE_BUCKET_ID:  0x50_W_CtxSbx+376a
BUCKET_ID:  0x50_W_CtxSbx+376a
Followup: MachineOwner
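
For longer traces, the stack text can be post-processed to list the frames that belong to drivers not shipped with Windows, which are usually the first suspects. The sketch below parses frame lines in the five-hex-column layout shown above. The set of "OS" module names is an assumption covering only the modules that appear in this trace, and the function names are hypothetical.

```python
import re

# Modules treated as part of Windows. This list is an assumption for
# illustration and only covers the modules seen in the trace above.
OS_MODULES = {"nt", "hal", "ntfs", "ntdll", "kernel32",
              "rpcrt4", "scesrv", "shareduserdata"}

# ChildEBP, RetAddr, three argument columns, then module[!symbol][+0xoffset]
FRAME_RE = re.compile(
    r"^(?:[0-9a-f]{8}\s+){5}"
    r"(?P<module>[A-Za-z0-9_]+)"
    r"(?:!(?P<symbol>[^+\s]+))?"
    r"(?:\+0x(?P<offset>[0-9a-f]+))?$",
    re.IGNORECASE,
)


def parse_frames(text):
    """Return (module, symbol, offset) for each stack frame line in text."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if not match:
            continue  # skip warnings, register dumps and other non-frame lines
        offset = match.group("offset")
        frames.append((
            match.group("module"),
            match.group("symbol"),
            int(offset, 16) if offset else None,
        ))
    return frames


def third_party_frames(frames):
    """Keep only frames whose module is not in the assumed OS module list."""
    return [f for f in frames if f[0].lower() not in OS_MODULES]
```

Run against the trace above, the frames this keeps are the CtxSbx and CtxAltStr ones, which matches the module the debugger itself flagged as the probable cause.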


Cláudio Rodrigues (Founder and CEO) Commented:
The problem is Citrix-related, specifically in the AIE filter. This hotfix addressed this specific problem among other things:

Claudio Rodrigues
Microsoft MVP
Windows Server - Terminal Services
croitoru (Author) Commented:
Hi Claudio
Thank you for this. The hotfix is already installed. Citrix just came back with a suggestion about an internal pre-SP1 fix for W2K3 which describes the registry contention we're seeing; we're just about to get that confirmed by MS and in our test labs. Many thanks.
croitoru (Author) Commented:
It turns out to be a bug in W2K3. I thought MS got rid of profile corruption breaking TS servers in 2000. Obviously it is back. Upgrading to SP2 dealt with the problem.
Question has a verified solution.
