studios
asked on
Server slips into condition with 2nd CPU at 100%, for no apparent reason.
We recently built a dual AMD Opteron Windows 2003 Standard server for a customer. After a period of bringing it into their network environment and loading applications, it has developed a bad habit. After some hours of operation (4 to 100), we see the second CPU at 100% utilization and the first CPU running at 10-25%. Everything is very sluggish. It takes a long time to log in, for screens to refresh, for it to provide the applications, and to serve the database. But it does muddle through. We can shut it down gracefully and reboot, and then it comes back as quick and strong as it should be.
So I am running Task Manager to see these CPU utilization conditions. If I look at the Processes tab and sort by CPU, it shows the System Idle Process at 95-99%, and no processes appear to be soaking up CPU horsepower. Yet on the Performance tab, the second CPU is pegged to the top, with occasional drops to 98%.
The server is Windows 2003 Standard, SP1, fully patched, including security patches. It is a server for 8 workstations, is the Active Directory domain controller, the CA eTrust Anti-Virus server, and the 2Point FaxServe (formerly AccPac FaxServe) server, and is running an instance of the MSDE SQL engine for an application called DocStar (a document imaging/storage/retrieval application). It has a Novell client loaded, and GroupWise. The event logs look clean. There are no obvious signs of hardware failure.
Hardware is a Gigabyte 7A8DRH motherboard with two Opteron 244 CPUs, 2 GB RAM, an Adaptec 2020ZCR card, and a SCSI RAID-5 array.
What can I do to figure out what is saturating the second CPU?
ASKER
I removed the eTrust Anti-Virus yesterday, and it has been well-behaved since.
eTrust wasn't behaving right, and I had a hunch it was somehow related to this problem.
Taking it out seems to have cleared up the problem. I don't know how I would have figured this out using logic.
I was running Sysinternals Process Explorer, which showed about 40% of the CPU going to hardware interrupts. How that links back to eTrust is unknown to me at this point.
ASKER
Yes, let's close it.
The real problem turned out to be a modem on the motherboard COM1 port. It was not eTrust as I had suspected at one point. Nor was it APC PowerChute, which was a suspect at another point.
When I removed the modem, this problem went away.
Wow, thanks for coming back and adding that - hopefully the next person that has the same problem will have a better time of resolving it!
-red
EE Cleanup Volunteer
ASKER CERTIFIED SOLUTION
Anyway, I would recommend running the performance logs to get more information about the CPU load.
High CPU load could be caused by an application that uses Java and is having trouble.
Check whether all services that are set to start automatically are running when the high CPU load occurs.
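If it helps, here is a rough sketch of how a performance log could be summarized once it has been saved as CSV (for example from Performance Logs and Alerts, or with something like `typeperf "\Processor(*)\% Interrupt Time" -si 15 -o cpu.csv`). It averages each counter column and lists the ones above a threshold, which would make a high interrupt load on one CPU stand out even when no process shows up in Task Manager. The function names, the `SERVER` host name, the 25% threshold, and every sample value below are invented for illustration; they are not taken from the server in this thread.

```python
import csv
import io
from statistics import mean

# Invented sample in the CSV layout that perfmon/typeperf exports:
# first column is the timestamp, the other columns are counters.
SAMPLE = r'''"(PDH-CSV 4.0)","\\SERVER\Processor(1)\% Processor Time","\\SERVER\Processor(1)\% Interrupt Time","\\SERVER\Processor(1)\% DPC Time"
"01/01/2006 10:00:00.000","99.0","41.0","2.0"
"01/01/2006 10:00:15.000","98.0","39.0"," "
"01/01/2006 10:00:30.000","100.0","40.0","4.0"
'''


def summarize_counters(csv_text):
    """Return {counter name: average value} for a counter log in CSV form."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    names = header[1:]  # skip the timestamp column
    samples = {name: [] for name in names}
    for row in reader:
        for name, cell in zip(names, row[1:]):
            try:
                samples[name].append(float(cell))
            except ValueError:
                continue  # blank or non-numeric sample, skip it
    return {name: mean(vals) for name, vals in samples.items() if vals}


def flag_busy(averages, threshold=25.0):
    """Return the counter names whose average is at or above the threshold."""
    return sorted(name for name, avg in averages.items() if avg >= threshold)


if __name__ == "__main__":
    averages = summarize_counters(SAMPLE)
    for name in flag_busy(averages):
        print(f"{name}: {averages[name]:.1f}%")
```

A high average on `% Interrupt Time` or `% DPC Time` for one processor points toward a device or driver rather than an application, which fits with the asker's finding that removing the modem on COM1 cleared the problem.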