Unable to determine cause of high CPU on system interrupts

Hello,
For 3 days now I have been trying to figure out why my system is always using 10-15% CPU for system interrupts.  I googled and tried every tool I know of - Windows Performance Toolkit, LatencyMon, Process Explorer, Process Monitor.  They all indicate that CPU 0 is spiking, which I already know, but none of them identifies exactly what is causing the issue.  I might need help to better understand what these tools are saying.

My computer specs are:
ASUS P8B-M motherboard
Intel E3-1230 v1 Xeon CPU (4C8T)
16 GB Crucial DDR-1333 ECC memory
ASUS Nvidia GTX 660 display card (PCI-e x16)
ASUS Xonar DGX sound card (PCI-e x1)
Also used onboard VGA (ASPEED, 16 MB)
Also used PNY Nvidia Quadro NVS 420 (PCI-e x16)
Intel SSD 2500 180 GB
Seagate Constellation 250 GB
ASUS Infineon TPM module
4-port USB 2.0 hub
LG Blu-ray drive

I have tried removing as many components as possible.  I swapped the GTX 660 video card for on-board video and the NVS 420.  I removed RAM, removed the sound card, removed the TPM and swapped the SSD with the Seagate.  Unplugged the USB hub.

I used the latest BIOS for the motherboard, which is 6702.  I rolled it back to 6003, which did not help.  I loaded optimal defaults in the BIOS.  I switched the SATA mode between AHCI and RAID.

I performed a fresh install of both Windows 7 64 and Windows 8.1 U1 64 with no additional software installed.  I formatted the hard drive before each install.  I also tried booting Windows into Safe Mode without networking.

Immediately after boot up and login, both have the same issue of showing 10-15% CPU which is the max for one thread in one core.  I understand that some .net assemblies run at first boot up but that is not the cause here, the CPU indefinitely stays at that usage level.

I tried turning off hyper-threading and turning off some of the cores.  That just spiked the CPU even higher.

I have updated to the latest drivers provided by ASUS.  Also tried Driver Easy to get all latest drivers.  Also tried disabling some components in device manager such as serial ports and the network ports.  Also tried unplugging the network cables.  I tried adjusting sound settings and my card doesn't have any enhancement modes that could use up CPU.

I scanned the system for viruses, malware, and rootkits.  I used ESET NOD32, Malwarebytes Anti-Malware, Malwarebytes Anti-Rootkit, and HitmanPro.  None of them found anything.

This issue is degrading performance and causing the system to run hotter.  But for the life of me, I cannot figure what the problem is.

bigeven2002Asked:
bigeven2002Author Commented:
Hi thanks for the reply.  The workstation is actually over 4 years old and is rack mounted.  The rack is on carpet but is on casters.  When I first built the computer this didn't happen so I have no idea how long this has been going on.

Due to its age I doubt the manufacturers would swap it out and to my understanding Intel has stopped technical support for their CPUs.  I'll still send them and asus inquiries though.

Static discharge may be a possibility.  I'll see if I can go about removing the static.
Christopher Jay WolffWiggle My Legs, OwnerCommented:
Hi again.

The only reason I mentioned static was as a possible cause of chip or discrete component failure on the board.  Since you feel the failure already exists, I wouldn't bother now to worry about static discharge, except to prevent further damage in the future.  It is when you touch something conductive with your fingers after building up a static charge by friction, say with your shoes rubbing across regular carpet.  Kinda like emitting a spark from your fingertip that would cause a chip to fail.  Companies sometimes install carpet with copper thread running through carpeting in labs where lots of people handle lots of boards.  Then they ground the carpet here and there so everyone is always discharged.  For you, you probably just need to be aware, and therefore careful.  So before touching the insides of your Asus, touch metal ground first like a furnace duct or the chassis of a device which should be grounded to the wall outlet ground.  Any sparks are then dissipated before you work on the Asus.

You got me looking at my Resource Manager and see my CPU utilization sitting at about 15% as you stated in your first paragraph.  I look at my cores and see a little difference.  Your first core looks like it sits at 80-90.  Am I getting this right?  That core just stays that way at idle?
DavidPresidentCommented:
It seems to me that the ONLY thing wrong is that the process monitor chews up a rather small percentage of freely available CPU power on one CPU.     The tools you are using are all CPU pigs in and of themselves.   Your diagnostics themselves trap and count all interrupts and threads and pretty much everything.

A solution, to paraphrase Groucho Marx:  "Dr., it hurts when I raise my arm this way, how do I stop it?"   (Dr. answers: "Don't raise your arm that way. That will be $10.")

In other words, the measurement tool you use is the cause of the CPU spike.   If you want to confirm it is not hardware, then boot the system to Linux on a live DVD so it is RAM-based.  Run the top program from the command line.  That program will report all processes and CPU overhead, including its own.   You will see similar results, but top will show that it chews up fewer cycles because it is command-line.
Christopher Jay WolffWiggle My Legs, OwnerCommented:
Hi dlethe-If that is true then wouldn't my quad core have 80-90% on core0 when running Resource Manager?  or if you're saying he's loading all that software at boot, then why wouldn't it use some of the other cores?  Why are his other cores down where you might expect?

Here's a link for us.  They discuss an experiment to force CPU usage high by letting it go to sleep.  The fix for that person was to set sleep to "never", while a gamer set his icons to different size.  Hmm.  The interesting one for me was the HDD going bad listed by someone else as number eight.

http://answers.microsoft.com/en-us/windows/forum/windows_xp-performance/why-my-cpu-usage-is-always-high-even-when-the/367c22b7-0357-4a1c-9be1-86fc9a146b54?auth=1
DavidPresidentCommented:
" ... If that is true then wouldn't my quad core have 80-90% on core0 when running Resource Manager"?
- Answer, because it won't.   It sleeps between each measurement, and it does not run at highest system priority.
bigeven2002Author Commented:
Thanks again for the replies.  Yes I always hold on to the power supply or other metal when handling the board.  I set the system to use only two cores and cpu was at 50% when looking in task manager so I doubt that itself would use an entire core.  I saw 15% usage with all cores because of hyper threading so one single threaded process won't ever exceed 15% CPU in a 4 core 8 thread setup.
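
The arithmetic behind those figures can be sketched as below. This is only an illustration, assuming the overall CPU figure in Task Manager is a plain average across logical processors; the function name is made up for the example.

```python
def max_single_thread_share(logical_cpus):
    """Largest share of total CPU (percent) that one fully busy thread can show."""
    if logical_cpus < 1:
        raise ValueError("need at least one logical CPU")
    return 100.0 / logical_cpus


# 8 logical CPUs is the 4C8T setup; the other rows are reduced configurations.
for label, n in [("8 logical CPUs (4C8T)", 8), ("2 logical CPUs", 2), ("1 logical CPU", 1)]:
    print(f"{label}: {max_single_thread_share(n):.1f}%")
```

With 8 logical CPUs the ceiling works out to 12.5%, which sits inside the 10-15% range reported at idle, and with 2 logical CPUs it is 50%.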
DavidPresidentCommented:
Run Windows Process Explorer. This gives full details on all jobs.
https://technet.microsoft.com/en-us/sysinternals/bb896653.aspx
Christopher Jay WolffWiggle My Legs, OwnerCommented:
That's a great suggestion.

You guys are beyond me, but here is a link that I think explains it to me.  Unless you of course tell me it's too different and doesn't apply here.  The conclusion seems to be that when idle process kicks in, there is not much else to measure so the visual graph goes way up as idle process continually checks for interrupts.  If idle is in core0 maybe that explains the graph.  Like saying the idle process is taking up more CPU because it is so idle.  But it is still a process so shows up as huge graph.

Am I way off here?  Here's the link.

http://www.tomshardware.com/forum/28594-63-performance-processes

Anyway, process explorer should show.
DavidPresidentCommented:
Exactly, Christopher --- Maybe I should phrase it this way.  If Windows resource explorer is the biggest consumer of resources ... then your system is not at all busy and you have nothing to worry about.  

A system with little to do other than to monitor resource usage will give bogus details.

So all you need to worry about is IF there are jobs that take significantly higher percentage of resources than perfmon.  If there are, and those applications run fast enough, in your opinion, then there is no problem to begin with.

If perfmon is the biggest user of resources, then your system is profoundly idle and all is well and good with the world.
JustInCaseCommented:
In my experience, the usual reasons for high CPU usage from system interrupts are:
1. Bad drivers.
2. The HDD transfer mode is PIO instead of DMA, or processes with a high HDD I/O rate.
3. USB hubs - try disabling them and see if anything changes.
4. Sound effects - right-click the volume icon near the clock - sound device - on the first screen choose the active device - right-click - Properties - Enhancements - disable all sound effects.
nobusCommented:
do a clean boot, and see if it still happens :
run msconfig, select startup tab, click disable all
reboot to test
**you can re-enable in groups or one by one, if the problem was gone
JustInCaseCommented:
Yes, I forgot that one:
startup items and scheduled tasks
 :)
bigeven2002Author Commented:
I seem to have lost control of my system. I have even unplugged all power from the motherboard including the CMOS battery and the 24 pin connector and it is still getting power somehow. I think I may have determined that something hardware related or even worse is wrong with the system.

[Image: motherboard showing power with no plug or battery connected]
JustInCaseCommented:
No, you did not lose control of your system.
 :)
LED lights can keep shining for a long time after power is cut, or some other cables (like VGA and HDMI - 5V) could provide power to keep those lights on. If you remove all cables, the LED lights will turn off.
I don't think it is a hardware problem.
nobusCommented:
normally, leds stay on not much longer than 20-30sec
you can also run from a live cd, and see the cpu utilisation then
here a Knoppix live cd : http://www.knopper.net/knoppix-mirrors/index-en.html
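
While booted from a live CD, one way to see which interrupt source is actually firing is to compare two snapshots of /proc/interrupts. The sketch below is a rough illustration, not a tested diagnostic for this board: it assumes the usual Linux layout of that file (a CPU header row, then one row per IRQ with per-CPU counts), and the function names are invented for the example.

```python
import time


def parse_interrupts(text):
    """Return {irq label: total count across all CPUs} from /proc/interrupts text."""
    lines = text.splitlines()
    if not lines:
        return {}
    cpu_count = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        if ":" not in line:
            continue
        irq, rest = line.split(":", 1)
        fields = rest.split()
        counts = []
        for field in fields[:cpu_count]:
            if not field.isdigit():
                break  # reached the description columns
            counts.append(int(field))
        label = " ".join([irq.strip()] + fields[len(counts):])
        totals[label] = sum(counts)
    return totals


def busiest_sources(before, after, top=5):
    """Rank interrupt sources by how much their count grew between two snapshots."""
    deltas = {k: after[k] - before.get(k, 0) for k in after}
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)[:top]


def sample(path="/proc/interrupts", interval=5.0):
    """Take two snapshots `interval` seconds apart and return the busiest sources."""
    with open(path) as f:
        before = parse_interrupts(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_interrupts(f.read())
    return busiest_sources(before, after)
```

Calling `sample()` in a Python shell on the live system would list the sources whose counters grew the most over five seconds; a device driver name next to a very large delta would be a candidate to investigate.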
bigeven2002Author Commented:
Ok phew.  The LEDs stayed on all night but I saw that I didn't unplug an external powered usb hub.  Once I removed that the lights went out.

I will try knoppix later today.

I tried windows bundled drivers and manufacturer drivers.  I turned off all sounds in control panel and disconnected all usb except keyboard and mouse.  I did try safe mode which should have removed all start up items but even in safe mode cpu thread was maxed out.
DavidPresidentCommented:
Author - you still have not provided any evidence of a process other than task manager which is using significant CPU resources.   Based on what you have posted so far, I see no problem here other than a misunderstanding about how task manager gets its data: by hooking into those very same system interrupts that you see causing the load you are concerned about.

Run MSFT process manager and look at every task.  If no task uses more resources than task manager. Give up ... there is no problem.
JustInCaseCommented:
On the other hand, I believe the screenshot shows enough. One core is at about 80-90% usage (and that single core is as powerful as some dual- or quad-core processors with all cores active; for comparison: AMD Phenom II X2 555, Intel Core2 Duo E7600, Intel Pentium B980, AMD A4-5300 APU, Intel Core2 Duo E8300...).
If we calculate 80% of a single logical core out of 8: (80 / 100) / 8 * 100 = 10% of total CPU power.
So that is significant CPU power wasted on interrupts.
On my laptop with a dual-core B980, system interrupts show 0% with peaks to 2%, so the average on my laptop is 1.1 - 1.4% (on a weak dual-core processor). In the screenshot above you can see a 9.5% average of total CPU power. I believe that is simply too much.
You can also see on the total CPU graph that the red line is the selected process; it is almost all that the CPU is doing.
[Screenshot from my laptop]
The same process here. You can barely see the red line.
 :)
nobusCommented:
it largely depends on what softw- and hardware is installed - and running
did you try a clean boot?  did it run better then?
JustInCaseCommented:
And, just in case, try disconnecting the keyboard and mouse to check whether they are causing problems.
:)
DavidPresidentCommented:
There is still ZERO indication of anything wrong. Your system has processor power management and speed step.  The CPU is saving you money by lowering the CPU frequency when there is nothing to do.   This causes numbers to appear high.

The O/S could very well be lowering your CPU speed to 33% or lower clock rate than it is capable of running.   That makes process manager appear to take 3X more CPU power.

Proof?  Turn off all power saving, turboboost (you want it in max performance), turn off speedstep, configure for maximum performance, turn off processor power management (PPM).    Then see the overhead of everything drop accordingly.   Make sure all cores are turned on.  Some of this has to be done in BIOS, some in the O/S.  Either way, this exercise will prove either:

1. I am wrong, and there IS an application bottleneck that does not show up.
2. There is no problem as i maintained, and you are misinterpreting how to read the numbers.  The cpu usage is high because the CPU cranked itself down to save power because you aren't asking it to do much.
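
The reasoning behind that test can be put into numbers. The sketch below is a simplified model of the claim only, not a statement of how Windows actually computes its figures: it assumes utilization is reported as busy cycles divided by the cycles available at the current clock, and every value in it is hypothetical.

```python
def apparent_utilization(busy_cycles_per_sec, current_clock_hz):
    """Percent of available cycles consumed at the current clock (capped at 100)."""
    return min(100.0, 100.0 * busy_cycles_per_sec / current_clock_hz)


# Hypothetical numbers chosen only for illustration.
WORK = 100e6          # cycles per second the monitoring overhead needs
FULL_CLOCK = 3.0e9    # nominal clock, Hz
THROTTLED = 1.0e9     # clock lowered to one third

at_full = apparent_utilization(WORK, FULL_CLOCK)
at_throttled = apparent_utilization(WORK, THROTTLED)
print(f"full clock:      {at_full:.1f}%")
print(f"throttled clock: {at_throttled:.1f}%")
print(f"ratio:           {at_throttled / at_full:.1f}x")
```

Under this model the same fixed workload looks three times larger at one third of the clock, so if the reported usage does not drop once power saving is disabled, the throttling explanation would not account for it.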
Christopher Jay WolffWiggle My Legs, OwnerCommented:
Hi.
I wanted to post this explanation of Perfmon's calculation from a Microsoft employee, hoping to clear up that part.  The link it's from is below.


Many people confuses what they see in Task Manager on the Processes tab in CPU column with Process\% Processor Time\Instance in Perfmon.  They are NOT the same counters.   There is NO counter in PerfMon that matches what you see in Task Manager on the Processes tab in CPU column.

Process\% Processor Time\Instance is NOT the amount of time that the CPU’s were busy.  It is the % of time that this instance charges against the Processor\% User time.


The theoretical Max for this counter is (# of processors * 100)

Assume the following:

A single CPU and we are looking a single point of time

(processor\%processor time) = 10%

(processor\%user time) = 8%

(processor\% privilege time) = 2%

(process\% processor time\your application) = 80%

You application is using 80% of the (processor\% user time) which is (8*.8)=6.4% of the CPU

If you have multiple processors they you will need to divide the (process\% processor time\your application) by the number of processors to determine what will be charged to % user time.

I know of no easy way to get how much CPU a process is using from perfmon.

Taskmgr utilizes a NtQuerySystemInformation call for this value.  This is a different method then what perfmon is using.

Cheers!
Bruce Adamczak

https://social.msdn.microsoft.com/Forums/windowsdesktop/en-US/0435e7c5-3cda-41a0-953e-7fa462fde03b/perfmon-process-processor-time-vs-task-managers-cpu-usage-for-monitoring-a-specific-user?forum=perfmon
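
The worked example in that quote can be reproduced in a few lines. This sketch only encodes the arithmetic as the quoted post describes it; the function and parameter names are invented here, and it is not a Perfmon API.

```python
def share_of_cpu(user_time_pct, process_time_pct, processors=1):
    """CPU share charged to a process, following the quoted post's arithmetic."""
    if processors < 1:
        raise ValueError("need at least one processor")
    # Divide the per-process counter by the processor count first,
    # then apply it as a fraction of Processor\% User Time.
    per_processor_pct = process_time_pct / processors
    return user_time_pct * per_processor_pct / 100.0


# Numbers from the quoted example: 8% user time, 80% process counter, one CPU.
print(f"{share_of_cpu(8, 80):.1f}% of the CPU")
```

For the single-CPU case in the quote this gives 6.4%, matching the (8*.8) figure in the post.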
Christopher Jay WolffWiggle My Legs, OwnerCommented:
Same guy in a TechNet post with slightly different wording where he says about perfmon

Process\% Processor Time\Instance is NOT the amount of time that the CPU’s were busy.  It is the % of time that this instance charges against the Processor\% User time.

from
https://social.technet.microsoft.com/forums/en-US/70a31eeb-d305-4606-a168-3feb63b6df32/how-do-perfmon-and-task-manager-calculate-a-process-cpu-usage
bigeven2002Author Commented:
Ok sorry it took so long.  Here is an update.  This is going to be several screenshots but I hope this answers all questions so far.

First, I wanted to show the comparison from one of my other systems which is a Dell Optiplex.  As you can see, there is 0% CPU usage and this is the same Windows 7 x64 and temperatures show normal and you can see neither CPU shows any load.

[Screenshots: Dell Task Manager, Dell Resource Monitor, Dell CPU temperature]

Next, now for my workstation, I ran Knoppix and opened a shell window and ran Top.  Here is the screenshot for that.  The load average in the upper right never drops below 0.25.

[Screenshot: Knoppix top]
Now, for Windows 7, below are the task manager screenshot and the temperature of CPU.  You can see that Core 1 is under constant heavy load and this is at idle causing elevated temperatures.

[Screenshots: Windows 7 Task Manager, Windows 7 CPU temperature]
For Windows 8, it showed similar results with first core under constant load at idle, and you can also see that the CPU is using full speed so speedstep is not active.  In fact I even saw it turbo boost to 3.5 GHz at one point even at idle.

[Screenshots: Windows 8 Task Manager, Windows 8 CPU usage, Windows 8 CPU temperature]
Then in the BIOS, you can see the CPU temp is lower and I also turned speed step off during troubleshooting.

[Screenshots: BIOS CPU temperature, SpeedStep off]
Lastly, for process explorer, below are screenshots of the interrupts usage and properties.

[Screenshots: Process Explorer CPU, Process Explorer interrupts properties]
DavidPresidentCommented:
Just for grins, did you also disable the windows system assessment tool ?  That does chew up cpu cycles needlessly  Start -> Task Scheduler -> Microsoft -> Windows -> Maintenance -> Disable WinSAT


(But system idle process of 99% means the system is idle 99% of the time, NOT as you imply the system is allocating 99% of CPU to running that process).
bigeven2002Author Commented:
I turned off WinSAT and rebooted.  No change.  My screenshots earlier were meant to show that even though system idle process showed 99%, the CPU still showed 10% usage in 4C8T mode.

Also, I decided to try pure single core mode (1 active core, no hyper thread, speed step off).  The system nearly choked and took 5-10 times longer to boot and load programs.  Here is a task manager screenshot of that.  The system idle process is 99%, yet the CPU usage is 75%.

[Screenshot: Task Manager with 1 core]
And just to show the temp for that 1 core while it inexplicably runs at near full power:

[Screenshot: CPU temperature with 1 core]
So it's like a permanent single core stress test going on.
JustInCaseCommented:
Why do you have two antiviruses active at the same time? :)
This is starting to look like a hardware problem to me. Is there any chance you could do a clean Windows install again, but with all devices removed from the motherboard, leaving only what is essential for the system to boot:
cpu
1 hdd
1 ram
1 optic device (if you install from DVD) or flash - both just until first restart.
and weakest GPU that you have
and nothing else but keyboard, mouse and display attached
(no front USB or anything)
For this I usually get motherboard out of case.
If right after install there is still CPU problem with minimum equipment and only windows drivers - most likely it is hardware problem (some pin on cpu socket damaged or faulty motherboard???).
bigeven2002Author Commented:
I agree it may be hardware related.  I opened a support ticket with Intel so I might hear back from them.  I already tried barebone approach with the hardware and it made no difference.

That was CPU, 1 stick ram, hdd. Dvd drive, keyboard, mouse, onboard video, no sound.  I also tried switching out power supply but that did not work either.

In my normal setup, I run antivirus and malware bytes.  Cyber world is too dangerous to only have one nowadays.  Most of my screenshots were before their installation though.
Christopher Jay WolffWiggle My Legs, OwnerCommented:
It does seem like the CPU.  While waiting for Intel, I was wondering if you've tried the Intel diag and monitoring tools.  Maybe that is the temp tool you're using.  At the Intel site

http://www.intel.com/support/utilitytools.htm

if you expand Diagnostic Tools and Monitoring Tools I thought it might be possible to learn something from one of these packages.  What do you think?
JustInCaseCommented:
I don't think the issue is related to the antiviruses. But in my experience, running 2 antiviruses at the same time is not good at all...
:)
Christopher Jay WolffWiggle My Legs, OwnerCommented:
Intel just got back to me and wants the output from their diagnostic tool which they provide the link to as

https://downloadcenter.intel.com/download/19792/Intel-Processor-Diagnostic-Tool-64-bit-

Maybe it is the same as one of the ones I provided earlier, but might as well use the link here that is from tech support I guess.
bigeven2002Author Commented:
Ok thanks.  I will check the utility links this evening to get the data they need.
Christopher Jay WolffWiggle My Legs, OwnerCommented:
Hi People.

bigeven2002
Thanks for diag report, Intel support emailed back and wants it run again a little differently as I mentioned in PM.  You now have the latest from Intel, and when you get next diag report to me I will forward again.  He has read the EE thread as of a couple days ago and thinks there is definitely something wrong.  Wants to investigate CPU fan model number also.
Christopher Jay WolffWiggle My Legs, OwnerCommented:
bigeven
thanks for quick turn-around on running diag again.  Your new results are already sent to Intel.
bigeven2002Author Commented:
Just to update everyone, we submitted a report to Intel on CPU testing with the Intel diagnostic utility.  The CPU went up to 4 GHz when testing, and this is not an unlocked processor and the BIOS was at defaults.

It appears my onboard VGA has malfunctioned, since when I remove the Nvidia adapter and set the VGA jumper to enable, the system beeps 4 times and there is no display signal.

I can probably lookup that error later but if anyone knows what that means let me know thanks.
nobusCommented:
seems you have had other hardware problems - possibly a bad mobo..
Christopher Jay WolffWiggle My Legs, OwnerCommented:
Hi again people.
Intel wondered about the motherboard, and I pointed out again that bigeven's original post showed the core problem while using on-board video, before on-board video became unavailable during this testing sequence.

Intel said Asus BIOS overclocks based on temperature and behavior, which would explain the variance in clocking.  My guess is the 4 years of overclocking has heat abused your Xeon a little and maybe caused failure.  4 years isn't too bad though, and if you can test Xeon with a different motherboard with a friend or local shop it would confirm.
bigeven2002Author Commented:
Thanks for the update.  I was completely unaware of the asus auto overclocking.  I'm going to try a different board as soon as possible.
Christopher Jay WolffWiggle My Legs, OwnerCommented:
I just noticed in the ResMon screen grab in your first post that core1 is parked.  When you can, could you find out how to take control of parking and unparking a core on your machine?  Maybe someone here knows.  I cannot look into it right now.  This link discusses how people are getting results like yours and are improving things by unparking.

http://www.overclock.net/t/1240512/cpu-core-parking-good-or-bad
bigeven2002Author Commented:
I found how to turn off the parking.  Unfortunately no change though.

[Screenshot: CPU parking off]
bigeven2002Author Commented:
OK so my final update.  I ended up replacing both the motherboard and CPU and reused everything else.  The replacement was successful.  The CPU now shows 0% at idle and 10-15 degrees lower on temperature.  This time around, I am leaving turbo boost off.

As to why I didn't just try replacing the board: it was a gamble to get just an LGA 1155 board that supports E3-1200 CPUs.  No local retailer has them, meaning I would have to buy online.  Considering that the CPU is likely damaged from the constant overclock and the motherboard has dead VGA, I think both components are ready for retirement.

Thanks to all who contributed and a big thanks to Christopher Jay Wolff for handling the ticket and support information from Intel.

bigeven2002Author Commented:
Replaced hardware.