Extremely Urgent - Citrix Issue Winlogon 100% CPU.

History: Server has been in production for 3+ years. Yesterday (Friday) we uninstalled Adobe Acrobat and installed it again. After the system restart the issue below started. We are not sure whether this caused the issue.

Issue: Winlogon.exe uses 100% CPU and the system is unusable, even from startup.

What is installed: Windows 2000 SP4, Citrix XPA SP3 FR3

What we have tried:
-Virus scans
-Malware scanners: Ad-Aware, HijackThis, Spybot, etc.
-Uninstalled all sorts of programs, particularly programs that connect to the server such as Veritas. Uninstalled Adobe Acrobat.
-Unplugged Network cable, system runs fine. No issues with Winlogon.exe.
-Reconnected network cable, Winlogon immediately goes to 100% CPU.
-Disconnected the Internet from Cisco Ethernet 0 (disabling Internet-to-LAN traffic). System boots and works fine. No issues with Winlogon. Able to connect to the server with RDP and the Citrix client locally with no issues.
-Reconnected the Internet to Cisco Ethernet 0. The problem does not return until a connection from outside is attempted.
-Checked the router's (Cisco 2651) IP auditing from outside. No unusual activity or apparent attacks of any kind. All packets and bytes look normal.
-Deleted AltAdder from system. System runs fine.
-Changed the IP address of the Citrix server and created a new static NAT mapping on the Cisco (to totally isolate the connection to the Citrix server for my testing). I was able to connect to the Citrix server fine from inside, and fine from outside from my machine. System worked OK. Had 2 more people connect from outside, and on the third connection the Winlogon process immediately went to 100% CPU and the server ground to a halt. Did not observe the 3rd connection in Management Console.
-After continued testing, we have determined that the Winlogon process does not hit 100% CPU until an outside connection is attempted.

We are unable to determine the cause of this problem and need some assistance. This is a production Citrix server and downtime is a bad thing (isn't it always).

We are bringing up another Citrix server to see if it is an issue with the server itself. We are still in the middle of this process. Any advice would be greatly appreciated.

ekoeslingAsked:

ITDharamPresidentCommented:
Well, if the winlogon.exe process shows up as all lowercase, then you may in fact have a virus despite your scans.

There are also known printing problems that can cause the winlogon process to spike like that.  Any printing problems, or services stuck in 'starting'?  I doubt it, considering the behaviour when connecting to the internet.

Given that disconnecting internet access has such an obvious effect on your system, I'd redouble efforts to look for virus/malware.  Netsky.D is one that can cause similar behaviour.  

Maybe do a netstat -a when you have it disconnected, and it is running fine.  Then connect up the internet again, trigger the spike, and do another netstat -a.  Look for anything out of the ordinary.  
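
If you save the two netstat -a outputs to text, the comparison can be scripted. This is only a rough sketch: it assumes the usual Windows column layout (Proto, Local Address, Foreign Address, State), and the function names and sample host names are made up for illustration.

```python
def parse_netstat(text):
    """Collect (proto, local, foreign, state) tuples from `netstat -a` output.

    Assumes the usual Windows layout; header and blank lines are skipped.
    UDP rows have no State column, so state is left empty for them.
    """
    entries = set()
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2 or parts[0].upper() not in ("TCP", "UDP"):
            continue
        proto = parts[0].upper()
        local = parts[1]
        foreign = parts[2] if len(parts) > 2 else ""
        state = parts[3] if len(parts) > 3 else ""
        entries.add((proto, local, foreign, state))
    return entries


def new_entries(before_text, after_text):
    """Return rows present in the 'after' capture but not in the 'before' one."""
    return sorted(parse_netstat(after_text) - parse_netstat(before_text))


# Hypothetical captures, for illustration only (not from the affected server).
BEFORE = """
  Proto  Local Address     Foreign Address      State
  TCP    server1:1494      server1:0            LISTENING
"""
AFTER = BEFORE + "  TCP    server1:1494      203.0.113.5:4021     ESTABLISHED\n"

for row in new_entries(BEFORE, AFTER):
    print(row)
```

Anything that only shows up in the second capture (new listeners, unexpected foreign addresses) is what to look at first.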

Which version of Acrobat did you install?
jrc4728Commented:
Look in your registry and see if there is a key in there similar to this.
HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Run\"W1N32.DLL" = C:\WINDOWS\WINLOGON .exe  or something else that is starting winlogon.
That would indicate a virus. Have you tried at least two different virus scans? http://housecall.antivirus.com in safe mode has proven effective at finding what a lot of others miss.
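
If you export the Run key values (name and command line), a quick check like the one below can flag anything that launches something named like winlogon. It is a sketch under my own assumptions: the function name is made up, the second sample entry is hypothetical, and it relies on the fact that the real winlogon.exe is not started from the Run key, so any match deserves a closer look.

```python
def suspicious_run_entries(run_values):
    """Flag Run-key entries whose command line mentions winlogon.

    run_values: dict mapping value name -> command line string.
    Whitespace is squeezed out first so disguised names such as
    'WINLOGON .exe' are still caught.
    """
    flagged = []
    for name, command in run_values.items():
        squeezed = "".join(command.lower().split())
        if "winlogon" in squeezed:
            flagged.append((name, command))
    return sorted(flagged)


# The first entry mirrors the example key above; the second is hypothetical.
sample = {
    "W1N32.DLL": r"C:\WINDOWS\WINLOGON .exe",
    "SomeTool": r"C:\Program Files\SomeTool\tool.exe",
}

for name, command in suspicious_run_entries(sample):
    print(name, "->", command)
```

A clean result here does not rule out a virus; it only covers this one startup location.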

However, there are other causes. I ran into a similar situation with spoolsv.exe. See the article below.
http://support.microsoft.com/default.aspx?scid=kb;en-us;822834
You might try applying that patch.

Step #3, if those don't resolve it, would be to look for a service that is trying to start with credentials. Run MSCONFIG, disable all non-Microsoft services and remove everything from startup. If that solves the problem, add them back one at a time until the problem comes back.

Step #4 (I've been using Citrix for 10+ years and have had to do this at least 3 times) remove Citrix, reinstall SP on Windows and reinstall Citrix. Not as painful as it sounds.

Good luck.


ekoeslingAuthor Commented:
Unfortunately nothing worked, we ended up formatting the system. =)

Thanks all for the valiant efforts.

hanleybCommented:
I had the same problem last week with several of my newly installed Citrix servers.
My Servers:
Windows 2000 Server
Citrix Metaframe Xpe FR3
Latest Windows critical updates.

I was in a hurry last week and installed three new Citrix servers, exactly the same way I always have (I've got 10 other Citrix servers in this farm).  Anyway, part of my setup process is to fully patch the server with Windows OS patches before putting it into production. This caused the problem.
There is a known problem with one of the July Windows updates called: Microsoft Rollup Update 1 for Windows 2000. It causes very high CPU utilization when installed on a Citrix Metaframe XP server.  See this Citrix article for the hotfix:
http://support.citrix.com/kb/entry.jspa?externalID=CTX107051

Or you can make the change yourself in REGEDT32:
Go to HKLM\Software\Citrix\CtxHook\AppInit_Dlls\Smart Card Hook
Rename the file path name from scardhook.dll to scardhook.dll.old
Reboot

This registry fix instantly remedied the problem. I did this registry fix before they had the hotfix out, but assume that's all the hotfix does anyway.

Good luck

Brian

ITDharamPresidentCommented:
I recommend a force accept with points awarded to hanleyb
Tech GuyVP of TechnologyCommented:
I didn't see a definitive solution here, but allow me to share ours.

We had this problem for 2 months running on the free version of VMware Server. The server is a 64-bit quad-core blahblahblah Dell blade with 8GB of RAM (more than adequate).

The first go-around, the VM image was a clone of an existing server. After it began tanking out at 100% we decided to build a clean VM image, a VM from birth. Same issue: at about 5 Citrix users we'd be pegged at 100%.

We almost went with a bare metal instance, but instead we tried copying the Citrix VM images off of the SAN and running them from the VM server's local hard drives. Now we're topping 50 simultaneous users, no problem.

Moral of the story: in our situation, running the VMs from the SAN was the bottleneck, even though we are running several other production servers from the SAN without a problem.

I hope this info keeps someone else from going bald.