westes


Windows 8.1 Enterprise Pauses Frequently

I have a Windows 8.1 Enterprise 64-bit installation with a perplexing performance issue. The system has 16 processor cores and is lightly loaded, at around 5% to 10% CPU usage. There is 32 GB of physical memory, but only about 8 to 10 GB is in use. There is very little network traffic. The problem is that a few hours after the system is cold booted, you can be using a browser - or any application, it does not matter which one - and the OS just freezes for up to 15 seconds. Keystrokes stop appearing in any data entry window. After the system unfreezes, the application performs normally, but then 10 to 30 seconds later you get another pause cycle.

At this point, I am unclear on whether this is a badly behaving device driver or a system application. Since applications run on individual processor cores, I do not understand how a single application could lock up the entire OS. My intuition is that some device driver that works with an application is behaving badly and blocking at a key point, thus locking out the entire OS.

How do I debug this?
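
A minimal sketch of one way to start: timestamp the freezes. The idea (an assumption, not something proposed in this thread) is a heartbeat loop that sleeps for a short, fixed interval and notes when a wake-up arrives much later than requested. If the whole OS stalls, the loop should wake late and the gap gets recorded; if the loop keeps ticking on time while the keyboard is dead, the stall is more likely confined to input or the UI. The function names, the 0.25 s interval and the 1 s tolerance are all hypothetical choices.

```python
import time


def record_ticks(duration, interval):
    """Sleep in short steps for `duration` seconds and return the
    monotonic clock reading taken at each wake-up."""
    ticks = [time.monotonic()]
    deadline = ticks[0] + duration
    while ticks[-1] < deadline:
        time.sleep(interval)
        ticks.append(time.monotonic())
    return ticks


def find_stalls(ticks, interval, tolerance):
    """Return (start, gap) pairs for every pair of consecutive ticks that
    are more than interval + tolerance seconds apart."""
    stalls = []
    for previous, current in zip(ticks, ticks[1:]):
        gap = current - previous
        if gap > interval + tolerance:
            stalls.append((previous, gap))
    return stalls
```

Calling `record_ticks(3600, 0.25)` and passing the result to `find_stalls(ticks, 0.25, 1.0)` would list each stall seen during an hour. Noting the wall-clock time at which the run began makes it possible to line the stalls up against Event Viewer entries.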
John

I think I would first try Memory (memtest86.exe) and Hard Drive (your manufacturer's hard drive test).

If all passes, consider a Windows 8.1 Refresh which keeps all data and basic Apps but makes you reinstall Software. I have done this - not too bad a chore.
westes

ASKER

I have run memtest and I have done CHKDSK on the hard drives and there are no issues there.

I have at least 100 hours invested in configuring this computer.   I will not start over from scratch.   I want to debug it if it can be debugged.
If memory and drive are OK (CHKDSK does not find all errors), then find ALL the drivers for the machine and reinstall them all.

Windows 8 refresh can be done in a day or somewhat less.

Also bear in mind that Windows 8 is a complete orphan like Vista, and strongly consider upgrading to Windows 10 Pro. If you do this, and given the machine, do a fresh install.
ASKER CERTIFIED SOLUTION
David Johnson, CD

This solution is only available to members.
westes

ASKER

David, I looked through a few days of System and Application event viewer messages, and there are no warnings or errors pertaining to the disks.

Your thinking is that some critical section of a Disk IO driver is pausing waiting for hardware to respond?   That's a good theory, but then why does the error stop on a reboot and only start up some time after reboot is complete?    It feels more like some driver - that is not active at startup - is getting activated or placed into the error mode.
Did you try updating all your Windows 8 computer drivers?
westes

ASKER

Yes, I run Windows Update frequently, and Dell's drivers for the T7600 have been stable for over a year. I could try to identify the actual Intel chipset on the board and apply Intel's latest chipset driver from their website. That usually makes good things happen on the systems where I try it.
Try that (Chipset)

Otherwise, if not hardware and not drivers, then it is software.

You might consider (if the machine will do it) biting the bullet and upgrading to Windows 10 as Windows 8's market share is dwindling fast.
westes

ASKER

John, I installed Intel's 2.6 software update utility.  It "scanned" the system for drivers and only found the network card.   I tried to update that and the install fails repeatedly.  Thankfully the old drivers stay behind and still work.   No option to update or confirm driver levels on the chipset.

The Intel software update utility is HORRIBLY written.   It looks almost toyish.   It's sad where the quality level of support software for workstations and PCs has fallen.

I don't see an upgrade to Windows 10 as addressing this issue.  I either have a failing disk or controller (as outlined by another poster), or I have a bottleneck inside one application like Google Chrome.   There needs to be a process for positively identifying where that bottleneck exists.  I can solve specific problems once I prove the problem exists.   If I upgrade I probably just take either one of those problems right along with me.   And to get from 8.1 to 10 will be weeks of planning and work for me, while probably still not resolving anything.

Microsoft software licensing is a nightmare. We already spent money to buy legal Windows 8.1 Enterprise licenses. Because we might call in on those once every three years, we did not buy "software assurance". Now it is time for that one call, and Microsoft gives us no paid options for support other than to spend $500/call or $2000 for a five-pack of calls. What happened to the old email-only support tickets at $99? I would rather put the $500 for that support call into buying new hardware that takes part of the load off that failing computer.

I feel more and more discouraged by personal computers and where this has all ended.   The future is clearly in the Cloud and avoiding whenever possible additional interface to Microsoft software.   I feel that Microsoft is living in a daydream where it still believes that it owns the computing universe, and their support and software licensing policies reflect that.   It's just too difficult to deal with this.   Instead of making it easy for people to stay with their architecture, they go out of their way to make it difficult to own and support their products.
You had said CHKDSK did not find errors. Did you try the Hard Drive manufacturer's disk test?  I would do that.

What make of Computer?  I usually use the manufacturer's update program. If I want Intel drivers I go to the site and enter the NIC or CPU designation. I have used Intel NIC drivers. I only use Lenovo Chipset drivers.

I have a ThinkPad X230 (now a spare) that came with Windows 8 / 8.1. I upgraded it to Windows 10 in the free period and the difference was remarkable (reduction of errors and better performance). I now use a ThinkPad X1 Carbon that came with Windows 10 Pro and it is working fine.
westes

ASKER

CHKDSK finds no errors.  No errors in EventViewer.   The RAID integrity tool works from BIOS and finds no issues with the RAID volume.

I have a Dell T7600 and of course I have the latest Dell drivers.   I did not find any recent updates.

Dell's support for this system has been awful.  The RAID controller originally had a Windows GUI tool that worked, but after some time simply did not come up (even after removing and re-installing) and Dell does not support it.   I even tried going to the OEM and getting the latest software from them, bypassing Dell, and that does not work either.  My best guess is there is some subtle security issue and something was tightened on the OS that they never tested against.

I will end up going with some lower power computers running Windows 10 and I will slowly migrate workload over to those Windows 10 platforms as time allows.   I suspect I may have too many browser windows open on my desktop, and Chrome simply cannot manage it well.
Browser windows normally only use memory. That does not normally cause me any issue.

I am running out of ideas except to say it keeps coming back to software (not hardware) based on everything you say.
westes

ASKER

I did try booting off of an external SATA drive instead of the internal hardware RAID, and I get the same slowdowns.   At this point I am starting to believe that it is a bottleneck inside of Chrome, but I would like to prove that.

"Re-installing Windows" is not a good solution to any problem, *until there is proof* that the underlying OS is corrupted.  It is not reasonable to commit someone to 100 hours of work to recreate an environment from scratch.
When your computer starts, open Task Manager and, under Processes, make sure Disk is listed as a column (if you only see CPU, Memory and Network, right-click a column header and click Disk). Click on the Disk column so the list sorts by disk utilization; I believe this is a disk issue. Let the system do its thing and watch the disk figures. You could also use the Performance tab of Task Manager. If the disk turns out to be busy during a pause, then at least you can find the process that is causing it.
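
The same ranking can be sketched offline from two snapshots, which helps if Task Manager itself is frozen during a pause. This is only an illustration: it assumes you have recorded cumulative per-process byte totals twice (for example from the "I/O read bytes" and "I/O write bytes" columns, which count all I/O rather than disk alone), and the function name and process figures are hypothetical.

```python
def rank_by_io(before, after, elapsed_seconds):
    """Rank processes by bytes per second of I/O between two snapshots.

    `before` and `after` map a process name to its cumulative byte total.
    A process missing from `before` is treated as having started at zero.
    """
    rates = []
    for name, total_after in after.items():
        delta = total_after - before.get(name, 0)
        rates.append((name, delta / elapsed_seconds))
    # Busiest process first, like sorting the Disk column in descending order.
    rates.sort(key=lambda item: item[1], reverse=True)
    return rates


# Hypothetical snapshots taken 10 seconds apart.
before = {"chrome.exe": 1_000_000, "svchost.exe": 500_000}
after = {"chrome.exe": 1_200_000, "svchost.exe": 4_500_000}
ranking = rank_by_io(before, after, elapsed_seconds=10)
```

With these made-up numbers `svchost.exe` would rank first, so it would be the first process to look at.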
westes

ASKER

Mohammed, on Windows 8.1 Task Manager, the tab with processes is named "Details".

When I go to "Select Columns" there are about 10 different disk related concepts, corresponding to bytes in and out and throughput issues.  Which of these would you like me to select?  

On the Performance tab, this is a global view of disk utilization and not per process.

If in fact the hardware is failing, how would this show up in either view?   The system would go to do an I/O, and the hardware would just take a long time to reply within the device driver itself.   All of that delay would apparently be invisible to either of the tabs you are recommending?   The Performance tab would see the disk as underutilized.   The Details tab would show very few IOs to the disk that is failing.   Where and how does failing hardware show up here?

And, as I stated earlier, I did make an image copy of the hardware RAID boot device to an eSATA drive and then booted from that drive (I also physically removed the hardware RAID drives so there was no possibility of a crossover or mixed-boot situation).   I got the same delays.  

I am finding that I can temporarily reset the system simply by exiting the browser and restarting all of its contexts from scratch.   So I'm still thinking this might be a problem within Chrome itself, but I don't know how to explore performance bottlenecks within Chrome.
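
On the question of where failing hardware would show up: a rough sketch of the arithmetic suggests looking at latency per transfer rather than throughput. A device that answers slowly completes few transfers, so bytes per second look idle while seconds per transfer climb. The inputs below are hypothetical counter samples; Performance Monitor exposes a comparable figure as PhysicalDisk\Avg. Disk sec/Transfer, and the Task Manager Performance tab shows it as average response time.

```python
def average_latency(transfers_before, transfers_after, busy_before, busy_after):
    """Average seconds per completed transfer between two samples.

    Returns None when nothing completed, since latency is undefined then.
    """
    completed = transfers_after - transfers_before
    if completed == 0:
        return None
    return (busy_after - busy_before) / completed


def throughput(bytes_before, bytes_after, elapsed_seconds):
    """Bytes per second moved between two samples."""
    return (bytes_after - bytes_before) / elapsed_seconds


# Hypothetical 15-second windows.
# Healthy disk: 3000 transfers completed using 6 seconds of service time.
healthy = average_latency(0, 3000, 0.0, 6.0)
# Stalling disk: only 3 transfers of 4096 bytes, each waiting about 5 seconds.
stalling = average_latency(0, 3, 0.0, 15.0)
stalling_rate = throughput(0, 3 * 4096, 15)
```

In this made-up case the stalling disk moves under 1 KB per second, which looks underutilized, yet each transfer takes seconds instead of milliseconds.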
What virus scanner are you using?
Consider uninstalling it and testing for a while. Be careful what you do while more or less unprotected.
westes

ASKER

I was using Webroot SecureAnywhere, which had been the source of various issues and I did disable it.   It doesn't seem to affect the slowdown that I described in this post.
I don't know SecureAnywhere. I've seen other AV software where just disabling it didn't help. We really had to uninstall it.
Did you test that with another keyboard and mouse?
Which kind are you using: wireless, USB or PS/2? If possible, test with a PS/2 mouse.
westes

ASKER

nobus, if there were a hardware failure on keyboard or mouse, wouldn't I see this in the System event viewer as excessive numbers of interrupts?  I did try substituting another USB keyboard and mouse, but no change in the behavior.   Unfortunately, I no longer have a PS2 keyboard or mouse.
It all depends on the kind of failure - no software is 100%, so I don't rely on it - I test it.
It may be worth looking for a PS/2 keyboard and mouse pair.
westes

ASKER

What I was hoping to get as the outcome of this question was a procedure for using performance monitor, in order to determine whether the slowdown was coming from a specific process on the operating system, or alternately was there something happening in the kernel (presumably a driver) that was not releasing control.  

If it is impossible to detect a hung driver - as might happen when hardware is failing and the driver is waiting for a hardware response that times out - then simply knowing that Windows has no way to detect such an event would also be a useful answer to this question.

The request to close this question states that there is not enough information to provide an answer.   Yet there are no questions that were asked that were not answered.  I tried to clarify in the first two paragraphs of this comment in more detail what I am trying to get from this question.

I was not abandoning the question.
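
One possible shape for the Performance Monitor procedure asked for above, offered as a sketch under assumptions rather than a definitive method: log the Processor counters "% DPC Time" and "% Interrupt Time" with `typeperf`, then scan the CSV for samples above a threshold. High values during a freeze would point at kernel or driver work instead of a user process. Two caveats: a driver that is blocked waiting on hardware burns no CPU and would not appear here, and on a 16-core machine one saturated core is only about 6% of `_Total`, so per-core instances are more telling. The function name, file name and threshold are hypothetical.

```python
import csv
import io

# Example capture, run from a Windows command prompt (file name is hypothetical):
#   typeperf "\Processor(*)\% DPC Time" "\Processor(*)\% Interrupt Time" -si 1 -sc 600 -f CSV -o kernel_time.csv


def find_spikes(csv_text, threshold):
    """Return (timestamp, counter_name, value) for every sample above threshold.

    Assumes typeperf-style CSV: a header row naming the counters, then one
    row per sample with the timestamp in the first column.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return []
    header, samples = rows[0], rows[1:]
    spikes = []
    for row in samples:
        if not row:
            continue  # skip blank lines
        timestamp = row[0]
        for name, cell in zip(header[1:], row[1:]):
            try:
                value = float(cell)
            except ValueError:
                continue  # blank or non-numeric sample
            if value > threshold:
                spikes.append((timestamp, name, value))
    return spikes
```

Reading the log with `open("kernel_time.csv").read()` and calling `find_spikes(text, 20.0)` would list the moments worth comparing against the observed pauses.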
westes

ASKER

The most productive line of investigation - which springs from this answer - is that I may have hardware issues and the hardware device driver may be the source of the system freezing.   I will continue to investigate that possibility.