Solved

SBS2003 Server DOWN. High Interrupts 50% + CPU

Posted on 2009-07-11
Last Modified: 2012-05-07
Please help. This is the second weekend I have been working on this issue, and the server is down until I get it fixed. Thx in advance for your time and input.

The SBS2003 server has a problem with high CPU usage. Process Explorer shows "Interrupts" with CPU usage of 20% - 60%. It constantly bounces up and down. The server is very slow and sometimes unresponsive. Everything does work, that is, Exchange is up, emails go in and out, and the shared drive works, but the server is so stinking slow that it's all basically worthless.

Background:
Intel S3200SH Server, XEON 3065, 4GB MEM.
Currently 1x WD 500GB ABYS SATA drive.    Was RAID1 using onboard Intel Matrix RAID.

Server is about 18 months old, 20 users. This problem first surfaced about a month ago and seemed to be of little or no concern; the server was slower than normal, but I didn't pay attention as there was no apparent issue. However, a few weeks ago the server started locking up at 10:00pm, when the backup would kick off. Then the server would often BSOD during the day and at night; it would last anywhere from 1 to 30 hours, then BSOD at random. The error codes are 0x000000c2 and 0x000000c4. Nothing that really helps.
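
For reference, the two stop codes above do have documented names, and both are commonly associated with a misbehaving kernel-mode driver. A minimal lookup sketch in Python; the table covers only the two codes from this post, and the function name is made up for illustration:

```python
# Symbolic names for the two bug check (stop) codes reported in this thread.
BUG_CHECKS = {
    0xC2: "BAD_POOL_CALLER",
    0xC4: "DRIVER_VERIFIER_DETECTED_VIOLATION",
}


def describe_bug_check(code_text: str) -> str:
    """Return the symbolic name for a stop code written like '0x000000c2'."""
    code = int(code_text, 16)  # base 16 accepts the leading '0x'
    return BUG_CHECKS.get(code, "unknown (not in this small table)")


for text in ("0x000000c2", "0x000000c4"):
    print(text, "->", describe_bug_check(text))
```

Any other code simply falls through to the "unknown" string, since this is not meant as a full reference.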

I have updated server drivers, chipset, SATA controller, video, BIOS to version 47, stopped everything from starting up, disabled backup exec services, removed the Sony tape drive, removed the tape drive controller card, tried different hard drives, SATA cables, MEM test.

I have done the SBS backup and restore as the directions say, (on a separate HDD) when the first SBS disk 1 loads the server works great. Install SP2, reboot, run ntbackup and restore, reboot and WHAMMY server is SLLLOWWWW, high interrupts again.

What can I do? How can I fix this thing?
PLEASE HELP.
Thx again.  

High-Interrupts.png
Question by:Natetech
18 Comments
 
LVL 23

Expert Comment

by:ComputerTechie
ID: 24832991
I would run a memory check using memtest86. Next, I would also check for viruses using ComboFix from bleepingcomputer.com.

CT
 

Author Comment

by:Natetech
ID: 24833015
Thank you, however I have done both those things. I let memtest make 2 full passes. AV scans have come back clean.

Natetech - :(
 
LVL 23

Expert Comment

by:ComputerTechie
ID: 24833020
Change your Kaspersky Anti-Virus. I have seen it take servers out.

CT

Author Comment

by:Natetech
ID: 24833028
I know you see it running right now, but I have turned it off 100% in testing and it made no difference.

Natetech
 
LVL 23

Expert Comment

by:ComputerTechie
ID: 24833037
Try this: uninstall it, reboot, and see if it makes a difference.
Even though it is disabled, it will still have services running.

CT
 

Author Comment

by:Natetech
ID: 24833124
I uninstalled Kaspersky and restarted, and the issue remains. Seems like it's better though. Interrupts bounce from 5% to 20%. Server still lags, but it's better. There must still be a problem, which the AV amplified.

What else could be causing this interrupt issue?

Natetech
 
LVL 23

Expert Comment

by:ComputerTechie
ID: 24833134
Can you give me an updated screenshot and tell me what is running in the background?
What is attached to the server directly?

CT
 

Author Comment

by:Natetech
ID: 24833233
See Screen Shot.
External: Power, Key, Mouse, LCD, Network.
Internal: SATA CDRom, Floppy, WD 500 SATA.

Seems like any time the hard drive has activity is also when the interrupts increase. Maybe it's unrelated, but it happens at the same time. I have tried many different hard drives.

I have used Autoruns, and in past tests I have turned off everything I can think of, and the problem persists.
FYI: The problem does not occur in safe mode.

????
Natetech
Interrupts-2.png
 
LVL 23

Expert Comment

by:ComputerTechie
ID: 24833274
When you did the BIOS update, did you reset to defaults after the update?
I would try that, and then I would also try sfc /scannow.

I would also try running verifier.exe

Create Custom Settings
Select individual settings from list
Check Special Pool
Select drivers from a list
Click the provider heading to sort
Check everything that doesn't have Microsoft as the provider
Finish & reboot
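
The selection rule in those steps (tick every driver whose provider is not Microsoft) can be sketched in Python. This is only an illustration of the filter: the driver names and providers below are hypothetical placeholders, and the real list comes from the Verifier dialog itself.

```python
from typing import NamedTuple


class Driver(NamedTuple):
    name: str
    provider: str


def non_microsoft(drivers):
    """Keep only drivers whose provider is not Microsoft (case-insensitive)."""
    return [d for d in drivers if "microsoft" not in d.provider.lower()]


# Hypothetical sample list, for illustration only.
sample = [
    Driver("ntfs.sys", "Microsoft Corporation"),
    Driver("examplestor.sys", "Example Storage Vendor"),
    Driver("exampleav.sys", "Example AV Vendor"),
]

for d in non_microsoft(sample):
    print(d.name)
```

The idea is that third-party drivers are the likelier suspects, so they are the ones placed under verification.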

CT
 

Author Comment

by:Natetech
ID: 24834910
Wow, I have had a heck of a time with sfc. I have used it tons of times, but this time it wants the CD with SP2. I downloaded and installed SP2, so I have no CD like it wants. After searching online, I made an SBS Disk1 CD with SP2 slipstreamed in. The CD was made successfully.

I continued on: reboot, reset BIOS to defaults, reboot.
Run sfc /scannow.
It prompts for the CD, which is in the drive, and I hit retry. It reads the CD, seems like it's copying, then asks again. I've been doing this for hours. The progress bar is moving ever so slowly, but DUDE, I've never had this many issues with sfc. I don't know if it's actually working or not. When I put in different CDs I do get instant messages that it's the wrong one. So I dunno at this point. I have stopped sfc to run verifier.

Ran verifier, reboot. Server got stuck on a black screen trying to POST; nothing showed. I reset it 2 times, same issue. Pulled power, drained power, plugged power back in, and the server booted. Now it shows the Intel splash screen (which is new, must be because of the BIOS reset), and the server boots normally.
So far the Interrupts are blank, occasionally showing 0.77 and up to 3% CPU usage. I am testing now. And I have to re-enable many services.
Still could not finish sfc.
Any further thoughts would be appreciated and rewarded. What was the verifier supposed to have done? Did it do anything? I never saw any log or anything upon reboot.
Natetech

 
 
LVL 23

Accepted Solution

by:
ComputerTechie earned 500 total points
ID: 24835105
From what you told me about the BIOS, it sounds like the BIOS update did not finish until you reset it and cleared it.

Here is information on verifier: http://support.microsoft.com/kb/244617

I would wait and see if the system continues to be stable. I would reinstall SP2.

After that I would create a backup of the server, install your antivirus, and make sure it continues to run smoothly.

CT
 

Author Comment

by:Natetech
ID: 24835391
I re-installed SP2.
Currently the LAN computers can not see the server.

This was an issue before and after the SP2 re-install. They act like there is a firewall on the server. In testing on a spare box, I had run the SBS restore before I started this entire post and experienced the same issue. Somehow the firewall was on in SBS. (Yes, I know it's not supposed to be accessible in SBS.) But it was, and after disabling it the test box and test network worked fine.

On the current SBS server I can access \\servername from the server and it works fine. I can log in to http://servername/exchange and RWW just fine. I can access out from the server, but I can't do jack from the PC, the same PC that worked fine before. The PC can use the internet, but that's just DNS, I know.

On the current SBS server, when I attempt to right-click the LAN icon, Windows Firewall, i get an error pop up:
"Windows firewall cannot run because another program or service that might use the network address translation component (Ipnat.sys)"

The server only has 1 NIC.  I stopped RRAS, same issue.  The Windows Firewall Service is set to 'auto' yet not started, i start it, get an error:
"Could not start the windows firewall ............ Error 170: The requested resource is in use"

I ran: netsh winsock reset, rebooted the server (went and got McDonald's during the reboot, yum), same issue.
(Side note: the server uptime emails come through during the BOOT, whereas before they would not come until several min after the server was up.)

I removed the server from RRAS console. Reboot. Issue Persists.
Still cannot start the windows firewall service, and none of the workstations can see the server. :(

Natetech
 
LVL 23

Expert Comment

by:ComputerTechie
ID: 24835514
SBS uses Routing and Remote Access as its firewall.

I would go to Windows components and features, remove Windows Firewall, and reinstall Routing and Remote Access.

I agree with the quick McDonald's break too.

CT
 

Author Comment

by:Natetech
ID: 24836478
Removed windows firewall:           (option is not in windows components)
cmd, sc delete sharedaccess
Delete successful.

Ran RRAS wizard from todo list. Reboot.
Success. Workstations can access server, shares and exchange.

Monitoring and reporting does not work and Computer Browser will not start.
I re-ran the monitoring setup wizard, that works now.

Not sure how to fix Computer Browser; I don't know how to re-install that...?

Also, the server will no longer boot with the external USB drive connected. We use it for redundant backup and near-line storage. It's a Vantec NexStar USB hard drive enclosure; they work great. But the server will not hit BIOS while it is on. The second you turn it off, the BIOS appears and the server boots. You can turn it right back on and everything is OK. This side effect is a problem in production, though.
Any thoughts?

I am restoring user data now, then image, then install KAV again.

Natetech



 
LVL 23

Expert Comment

by:ComputerTechie
ID: 24836564
here try this fix: http://www.hightechdad.com/2007/05/09/how-to-fix-master-browser-mrxsmb-event-id-8003-errors/

I would check the BIOS settings to see whether the USB device is first on your boot list.
I have had this problem before and had to get a USB hub to fix it.

CT
 

Author Comment

by:Natetech
ID: 24836934
Installed KAV server, good.
Still need to install KAV for exchange and KAV admin console, but that will wait until another evening.

I just delivered the server to the client's office and tested a workstation; all seems good. Files sync'd, Outlook send/receive, shared drive, LOB, all seems good.

Server doesn't seem 100% perfect, but it's 90% better. I have decided it is sufficient to stay in operation as is.

My final thought is that the combination of the BIOS update, AV removal, and full SBS restore resulted in the final fix. It took 2 weekends to fully pinpoint and complete.

Thank you very much ComputerTechie.
Natetech
 
LVL 23

Expert Comment

by:ComputerTechie
ID: 24836957
Anytime, glad I could help. BTW, how did the USB drive boot issue turn out?

CT
 

Author Comment

by:Natetech
ID: 24841935
The Intel BIOS has a timeout setting for detecting USB devices that are a 'potential' boot device. The timeout was 60 sec. I never waited that long; I figured the server was not booting after about 30-45 sec.

I set it as low as I could, 10 sec. The only way to disable it is to disable USB devices (stupid). So the server holds for 10 sec, then the BIOS hits, POSTs, and the server boots as expected. Ahh, what's 10 sec on an 8 min SBS boot process.

I checked with the client again this morning. They are very pleased, and I am 80-90% confident in the server. With weekly Acronis images now added, I will be 100% confident.
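
To put that trade-off in numbers, here is a rough back-of-the-envelope sketch in Python. It uses only the figures from this comment; the variable names are just for illustration.

```python
usb_timeout_s = 10      # lowest USB detection timeout the BIOS allowed
old_timeout_s = 60      # timeout value before the change
boot_time_s = 8 * 60    # "8 min SBS boot process"

# Share of the boot time spent waiting on the USB detection timeout.
overhead = usb_timeout_s / boot_time_s
old_overhead = old_timeout_s / boot_time_s

print(f"Delay at 10 sec: {overhead:.1%} of boot time")
print(f"Delay at 60 sec: {old_overhead:.1%} of boot time")
```

So the 10 sec wait is roughly 2% on top of the boot, compared with 12.5% at the old 60 sec setting.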

Thankx.
Natetech


Question has a verified solution.
