Ultramarathonman
BSOD every hour
I have a Windows Server 2008 R2 w/ SP1 server that has started to BSOD every hour. Attached is a photo of the blue screen. Here are the details.
When the BSOD happens, the memory dump cannot be written to disk, because (I believe) the disk is no longer available.
After the BSOD you have to do a cold boot, because on a warm boot the boot RAID mirror is not found.
The Adaptec controller drivers were downloaded from the manufacturer (SuperMicro) and updated, but that didn't help.
Several hotfixes pertaining to storport.sys were installed, but that didn't help.
I opened a ticket with Microsoft and spent 6 hours on the phone with them, to no avail.
The Microsoft tech used a program called "NotMyFault" or something like that to crash the server deliberately and force a BSOD, to see whether a dump file would be created. It did create the dump file successfully. In my opinion that supports my earlier thought that the problem may be the onboard controller ceasing to function, something like a memory leak, although I have already installed the hotfix pertaining to memory leaks and storport.sys.
Anything I try requires me to wait out the 60-62 minute window before I know whether it worked.
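Since every test depends on that 60-62 minute window, it may help to confirm from recorded crash times that the interval really is that regular. The following is only a rough sketch in Python: the function names and the sample timestamps are made up for illustration and do not come from this server.

```python
from datetime import datetime, timedelta


def crash_intervals(crash_times):
    """Return the minutes between consecutive crash timestamps."""
    ordered = sorted(crash_times)
    return [(b - a).total_seconds() / 60 for a, b in zip(ordered, ordered[1:])]


def looks_periodic(intervals, low=60.0, high=62.0):
    """True if there is at least one interval and all fall within [low, high] minutes."""
    return bool(intervals) and all(low <= m <= high for m in intervals)


# Hypothetical crash times (placeholder date), for illustration only.
start = datetime(2000, 1, 1, 8, 0)
sample = [start, start + timedelta(minutes=61), start + timedelta(minutes=122)]

intervals = crash_intervals(sample)
print(intervals)
print(looks_periodic(intervals))
```

If the real intervals turn out to be this tight, something running on a timer looks more likely than a random fault, though that is only a hint and not a diagnosis.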
ASKER
Here is the file
photo--6-.JPG
If the RAID is disappearing from the BIOS, there is a good chance the card is failing. It happens, but it also means there is a good chance that if you boot the system into Ubuntu it will fail there too.
This is consistent with the controller failing. Go into the Adaptec BIOS and make sure that the RAID1 is still optimal and isn't in some degraded mode.
ASKER
BIOS shows RAID1 is still healthy and happy.
I'd boot a unix distribution and see if the system continues to die. This could be software or hardware, and when you have such a message, it could really be either. Just knowing where NOT to look makes life so much easier.
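One way to tell afterwards whether (and when) the machine died under the other OS is to leave a small heartbeat logger running. This is just a sketch in Python. The file name `heartbeat.log` and the 60-second interval are arbitrary assumptions, and on a live CD the log would need to sit on persistent storage such as a USB stick to survive a crash.

```python
import os
import time
from datetime import datetime


def write_heartbeat(path):
    """Append the current time to `path` and force the write to disk."""
    with open(path, "a") as f:
        f.write(datetime.now().isoformat() + "\n")
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to the device now


def run(path="heartbeat.log", interval_s=60):
    """Write a heartbeat every `interval_s` seconds until the system stops."""
    while True:
        write_heartbeat(path)
        time.sleep(interval_s)
```

Start it with `run()` and leave it going. After a crash, the last timestamp in the file shows roughly when the system stopped. If the writes themselves start failing with an I/O error, that would also point toward the storage path.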
ASKER
The Ubuntu live CD loaded and ran stable for several hours. I then installed Server 2008 R2 on a new disk, and it has now run for 18 hours straight. Not sure if it was a problem with one of the SSD drives or a Windows problem. Working now though. Thanks for the suggestion.