KhaiPi

Windows 10 files corrupted & boot issues

This looks like one for the gurus and/or MVPs...

I have a very heavy-duty workstation (2 x Xeons, loads of RAM, 4 x GPUs) that I shut down every night and then boot first thing the next morning.

Curiously, this first "cold" boot can take as little as 20 seconds. However, 9 times out of 10, once I have logged in, Windows 10 is basically unusable: no Microsoft applications will launch (Office, Edge etc.), and many other applications try to start but freeze. Eventually I have to go into Task Manager and restart Explorer. This doesn't solve the problems, but it does let me do a "Restart".

This "warm" restart/boot can take up to 10 minutes, but 99 times out of 100, Windows will then function quite normally (or at least appear to) and be responsive, with all applications able to launch.

So, my first question is why does a cold boot happen so quickly but lead to an unusable state of the OS whereas a restart takes a very long time but leads to a very usable state of the OS?

In my attempts to find a solution, I found links to the sfc command and certain DISM commands.  Running these tells me that some Windows files are corrupted but I don't know how to "uncorrupt" them.

Running the command sfc /scannow produces the following message:

Windows Resource Protection found corrupt files but was unable to fix some
of them. Details are included in the CBS.Log windir\Logs\CBS\CBS.log. For
example C:\Windows\Logs\CBS\CBS.log. Note that logging is currently not
supported in offline servicing scenarios.

I have attached the CBS.log file but it means very little to me.
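
For anyone else staring at a CBS.log of this size: the entries written by sfc are tagged "[SR]", so filtering on that tag usually cuts the log down to something readable. Below is a minimal Python sketch of that filter, not an official tool. The function names are made up for illustration, the default path is the one quoted in the sfc message above, and the "cannot repair" match is an assumption about how unrepaired files are worded in the log.

```python
from pathlib import Path
from typing import Iterable, List

# Default log location, as quoted in the sfc message above.
CBS_LOG = Path(r"C:\Windows\Logs\CBS\CBS.log")


def sfc_entries(lines: Iterable[str]) -> List[str]:
    """Keep only the lines written by System File Checker (tagged "[SR]")."""
    return [line.rstrip("\n") for line in lines if "[SR]" in line]


def unrepaired(entries: Iterable[str]) -> List[str]:
    """Narrow [SR] entries to those that report a file sfc could not repair.

    Assumption: such entries contain the phrase "cannot repair".
    """
    return [entry for entry in entries if "cannot repair" in entry.lower()]


def read_sfc_entries(log_path: Path = CBS_LOG) -> List[str]:
    """Read the log from disk; errors="replace" tolerates stray bytes."""
    with log_path.open(encoding="utf-8", errors="replace") as handle:
        return sfc_entries(handle)
```

On the affected machine, printing `unrepaired(read_sfc_entries())` from an elevated prompt should list the files that sfc gave up on, which is a more useful thing to post than the whole log.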

Next, I ran the command Dism /Online /Cleanup-Image /CheckHealth and the output is:

Deployment Image Servicing and Management tool
Version: 10.0.10586.0

Image Version: 10.0.10586.0

The component store is repairable.
The operation completed successfully.

Next I ran the command Dism /Online /Cleanup-Image /ScanHealth and the output is exactly the same as the previous command.

Next I ran the command Dism /Online /Cleanup-Image /StartComponentCleanup and the output is:

Deployment Image Servicing and Management tool
Version: 10.0.10586.0

Image Version: 10.0.10586.0

[===========                20.0%                          ]
The operation completed successfully.

Finally, I ran the command Dism /Online /Cleanup-Image /RestoreHealth and the output is:

Deployment Image Servicing and Management tool
Version: 10.0.10586.0

Image Version: 10.0.10586.0

[==========================100.0%==========================]

Error: 0x800f081f

The source files could not be found.
Use the "Source" option to specify the location of the files that are required to restore the feature. For more information on specifying a source location, see http://go.microsoft.com/fwlink/?LinkId=243077.

The DISM log file can be found at C:\WINDOWS\Logs\DISM\dism.log

I have also attached the dism.log file but, again, it is all just gibberish to me.
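
The 0x800f081f message is asking for a repair source, and the usual way to supply one is to point /RestoreHealth at the install.wim on installation media. The Python sketch below only builds that command line so the pieces are visible. The drive letter E: and image index 1 are placeholders, not values taken from this machine, and the function name is invented for illustration. The media probably needs to match the installed build (10.0.10586 in the output above), which may be why an older DVD was not accepted.

```python
from typing import List


def restore_health_command(wim_path: str, index: int = 1,
                           limit_access: bool = True) -> List[str]:
    """Build a DISM /RestoreHealth command that repairs from a local image.

    wim_path and index are placeholders: point them at the install.wim on
    media that matches the installed Windows build.
    """
    if index < 1:
        raise ValueError("WIM image indexes start at 1")
    command = [
        "Dism", "/Online", "/Cleanup-Image", "/RestoreHealth",
        f"/Source:WIM:{wim_path}:{index}",
    ]
    if limit_access:
        # /LimitAccess stops DISM from contacting Windows Update as a source.
        command.append("/LimitAccess")
    return command


# Hypothetical example: installation media mounted as drive E:
print(" ".join(restore_health_command(r"E:\sources\install.wim")))
```

The printed line can be pasted into an elevated command prompt once the path and index have been adjusted to the actual media.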

So, not really understanding what these messages and logs were trying to tell me, I tried several ways to "repair" Windows 10, including inserting the original Windows 10 DVD and various options in Settings. In all cases, the "best" option that seems to be available to me is to basically reinstall the OS and lose all my applications, but possibly (at least) retain my settings.

Given that I have a large number of applications installed and the time it would take to reinstall the OS and then manually reinstall every application would most likely be far too long to be practical when I am trying to run a business off this machine at the same time, I really need help or suggestions as to:

1) What exactly is wrong with the OS?
2) Is there a way to fix the issues without having to either reinstall the OS or all the applications?

Thanks,

KP
CBS.log
dism.log
Alan Henderson

SFC is a great tool, but if it doesn't get results, it's worth running the free version of CCleaner:

https://www.piriform.com/ccleaner

and chkdsk

https://windowsinstructed.com/run-chkdsk-windows/

After that, uninstall any applications that you don't need.
ASKER CERTIFIED SOLUTION
Cliff Galiher
KhaiPi

ASKER

Thanks Alan. I ran chkdsk on drives C: and D:, as well as CCleaner (which I already run regularly), and neither tool reported any issues.

If Cliff's excellent advice doesn't help, I fear that you'll have to bite the bullet and reinstall Windows.

:o(
Does the slow boot happen in safe mode?

Possibly a bad piece of hardware. Try removing non-essential add-on cards one at a time. Possibly you have an excessive page file or page file corruption. You could try setting the page file to manual with, say, a 2048 MB size. If that doesn't work, turn off hibernation. I agree with the other techs: update the drivers for your add-on cards. Try the UPHClean program from Microsoft. My two cents.
Ditto rindi, the problem is seemingly in expanding the hiberfile. Have you checked if all drivers are up-to-date?
Do Windows Events show any error or warning?
You say:
"Given that I have a large number of applications installed and the time it would take to reinstall the OS and then manually reinstall every application would most likely be far too long to be practical when I am trying to run a business off this machine at the same time, I really need help or suggestions as to:"

In such a case, you should have an image backup of the drive, so you can restore it quickly with all applications installed.

Now, how long ago was the upgrade to Windows 10? If less than a month, you can roll back to Windows 7 and then decide how to proceed.
If you get a 2nd hard disk and do a fresh install of Windows 10 on it (with the old disk disconnected), you should be able to quickly diagnose whether it's a hardware or software issue. That way, you can quickly revert to the old "work" disk when you need to during this testing phase.

If it is hardware, you can then test further by selectively removing hardware (RAM and GPUs, primarily) until your fault is pinpointed.  "Luckily", your fault is very repeatable.  

If it is a software issue, Belarc is very useful to audit your old environment - get serials, etc.  
http://www.belarc.com/free_download.html
From the fault as described, I would suspect a driver issue. Is anything flagged in Device Manager, or are there entries in the Event Log? If you're desperate not to reinstall, then getting updated versions of all your drivers and reinstalling them would be an idea.
KhaiPi

ASKER

OK, thanks to everyone who offered solutions.

I have waited to accept any solution(s) because I wanted to make sure first that the problem had been actually resolved.

What I have found is that Cliff Galiher's suggestion of disabling "fast boot" seems to have (mostly) resolved the issue.  I ran the PowerCfg tool to do this.
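
The exact PowerCfg command isn't quoted above. One common way to do it, assumed here, is to turn hibernation off altogether, since Fast Startup depends on the hibernation file. The Python sketch below wraps that single command; the function names are made up for illustration. Note that this also removes the Hibernate option itself, which may or may not be acceptable on a given machine.

```python
import platform
import subprocess
from typing import List


def disable_fast_startup_command() -> List[str]:
    """Command that turns hibernation off, which also disables Fast Startup.

    Assumption: this is one common approach, not necessarily the exact
    command that was run in this thread.
    """
    return ["powercfg", "/hibernate", "off"]


def run_if_windows(command: List[str]) -> bool:
    """Run the command only on Windows; return True if it was executed."""
    if platform.system() != "Windows":
        return False
    # Needs an elevated (administrator) prompt to succeed.
    subprocess.run(command, check=True)
    return True
```

Calling `run_if_windows(disable_fast_startup_command())` from an elevated prompt should leave the next shutdown as a full shutdown rather than a hybrid one. Running `powercfg /hibernate on` reverses it.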

Since then, every cold boot has taken just as long as a restart and, almost every time, the OS was immediately responsive and behaving itself.

The caveat is that, on rare occasions, when it boots up there is no internet access and "Troubleshooting" tells me that "Certain protocols required for internet access are missing from this machine" and then selecting the option to fix this problem fails.  And yet, when I restart, or on most other occasions when I do a cold boot, I have no problems with internet access, which suggests to me that those "missing protocols" are not really "missing".

If someone can explain why this might be happening and ideally find a way to fix it, then I will select Cliff's solution as the official answer (or perhaps as one of multiple answers if this internet access issue is related). If it is determined to be unrelated, then Cliff will get all the points.

Cheers,

KP
I have also had trouble with drivers that stopped working after a Windows 10 upgrade.
It looks like Windows 10 changes a lot of drivers, and has caused many problems by doing so.
I found it necessary to do a fresh install.

If you care to see my case: https://www.experts-exchange.com/questions/28941608/No-internet-after-Windows10-update-and-restore-to-win7.html
Windows will not (cannot) load some parts of the TCP/IP stack if a NIC driver fails to load. As such, the error, as it were, is accurate. The protocol is missing and cannot be fixed/loaded, because a dependent driver/service failed. Rebooting allows the dependent driver to load, and then the protocol loads, and thus is no longer missing. The two issues may very well be related. You may have a NIC with bad drivers, making it unstable, prone to crashing, and prone to load failures during power up or resume.
KhaiPi

ASKER

Thanks guys.

The problem ultimately turned out to be due to multiple hardware failures including the motherboard itself and 1 of the 8 DIMMs.

The advice given by the two people I have awarded points to helped me the most in finally getting to the root cause.

Cheers,

KP