Andrew N. Kowtalo
asked on
VM keeps crashing, rebooting, and coming back online.
Hi All.
This particular VM we are running is reporting HD controller failures. Since the host has many VMs running on it, is there a way to determine whether the failure is the physical hard disk on the server or something else? Here is the error.
The driver detected a controller error on \Device\Harddisk0\DR0.
The driver detected a controller error on \Device\Harddisk2\DR1
The driver detected a controller error on \Device\Harddisk2\DR2.
+ System
- Provider
[ Name] disk
- EventID 11
[ Qualifiers] 49156
Level 2
Task 0
Keywords 0x80000000000000
- TimeCreated
[ SystemTime] 2020-03-30T15:00:03.147781900Z
EventRecordID 2202132
Channel System
Computer TMMSage.tmmontante.local
Security
- EventData
\Device\Harddisk0\DR0
1004800001000000000000000B 0004C00301 0000000000 0000000000 0000000000 0000000000 0000BA2000 0000000000 FFFFFFFF06 0000005800 0722000000 00E4200000 0201200000 000000F000 0000000000 0000000000 0000000000 0000000000 0000000000 0080A78E16 00E0FFFF00 0000000000 0000000000 0000000000 0000000000 0000000000 0000000000 0000000000 0000000000 0000000000 0000000000 0000000000
-------------------------- ---------- ---------- ---------- ---------- ---------- ----
Binary data:
In Words
0000: 00800410 00000001 00000000 C004000B
0010: 00000103 00000000 00000000 00000000
0020: 00000000 00000000 000020BA 00000000
0030: FFFFFFFF 00000006 22070058 00000000
0040: 000020E4 00200102 00000000 000000F0
0050: 00000000 00000000 00000000 00000000
0060: 00000000 00000000 168EA780 FFFFE000
0070: 00000000 00000000 00000000 00000000
0080: 00000000 00000000 00000000 00000000
0090: 00000000 00000000 00000000 00000000
00a0: 00000000 00000000
In Bytes
0000: 10 04 80 00 01 00 00 00 .......
0008: 00 00 00 00 0B 00 04 C0 .......À
0010: 03 01 00 00 00 00 00 00 ........
0018: 00 00 00 00 00 00 00 00 ........
0020: 00 00 00 00 00 00 00 00 ........
0028: BA 20 00 00 00 00 00 00 º ......
0030: FF FF FF FF 06 00 00 00 ÿÿÿÿ....
0038: 58 00 07 22 00 00 00 00 X.."....
0040: E4 20 00 00 02 01 20 00 ä .... .
0048: 00 00 00 00 F0 00 00 00 ....ð...
0050: 00 00 00 00 00 00 00 00 ........
0058: 00 00 00 00 00 00 00 00 ........
0060: 00 00 00 00 00 00 00 00 ........
0068: 80 A7 8E 16 00 E0 FF FF §..àÿÿ
0070: 00 00 00 00 00 00 00 00 ........
0078: 00 00 00 00 00 00 00 00 ........
0080: 00 00 00 00 00 00 00 00 ........
0088: 00 00 00 00 00 00 00 00 ........
0090: 00 00 00 00 00 00 00 00 ........
0098: 00 00 00 00 00 00 00 00 ........
00a0: 00 00 00 00 00 00 00 00 ........
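As a rough aid for reading the dump above, here is a minimal Python sketch that decodes the 32-bit value at offset 0x0C. It assumes the binary data follows the WDK `IO_ERROR_LOG_PACKET` layout, where that offset holds `ErrorCode`; that layout is an assumption, not something stated in the event itself. The helper name `dump_to_bytes` is made up for illustration.

```python
import struct

# First three "In Bytes" rows copied from the event above.
DUMP = """\
0000: 10 04 80 00 01 00 00 00
0008: 00 00 00 00 0B 00 04 C0
0010: 03 01 00 00 00 00 00 00
"""

def dump_to_bytes(dump: str) -> bytes:
    """Turn 'offset: xx xx ...' rows into one bytes object (hypothetical helper)."""
    out = bytearray()
    for row in dump.splitlines():
        _, _, hex_part = row.partition(":")
        out += bytes.fromhex(hex_part)  # fromhex ignores the spaces
    return bytes(out)

data = dump_to_bytes(DUMP)

# Little-endian DWORD at offset 0x0C (assumed to be ErrorCode).
(error_code,) = struct.unpack_from("<I", data, 0x0C)
event_id = error_code & 0xFFFF   # low 16 bits
qualifiers = error_code >> 16    # high 16 bits

print(f"ErrorCode=0x{error_code:08X} EventID={event_id} Qualifiers={qualifiers}")
```

The low half matches EventID 11 and the high half matches Qualifiers 49156 shown in the event, so the dump is consistent with the "controller error" message. It does not say by itself whether the cause is a physical disk or the storage path.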
We are not sure whether there is anything we can see within the VM itself or, if it is a physical failure, what the cause is. It's currently mid workday, so we have no way to shut this server down right now, but it keeps crashing, rebooting, and coming back up, kicking multiple users off Sage. This is running the older 2012 R2. If you need anything else, let me know.
Any assistance is appreciated.
ASKER
Hi Seth, let me speak with the engineer and I will get back to you shortly.
ASKER
Sam, it's a SAN with spinning disks running 4 VMs in an RDS environment. We are not sure about the I/O load. A lot of users are running Sage along with Timberscan in the environment as well. Where could we see whether an overload is happening?
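One way to look for overload is to log disk latency on the host with the built-in `typeperf` tool, for example `typeperf "\PhysicalDisk(_Total)\Avg. Disk sec/Transfer" -si 5 -sc 60 -f CSV -o disk.csv`, and then scan the output for slow samples. The Python sketch below does the scanning part. The sample rows, the host name `HOST`, and the 25 ms threshold are illustrative assumptions, not values from this environment.

```python
import csv
import io

# Hypothetical sample in the CSV shape typeperf writes; values are made up.
SAMPLE = r'''"(PDH-CSV 4.0)","\\HOST\PhysicalDisk(_Total)\Avg. Disk sec/Transfer"
"03/30/2020 14:59:55.000","0.004100"
"03/30/2020 15:00:00.000","0.183000"
"03/30/2020 15:00:05.000","0.051200"
'''

THRESHOLD_S = 0.025  # assumed rule-of-thumb limit (25 ms per transfer)

def slow_samples(csv_text, threshold=THRESHOLD_S):
    """Return (timestamp, latency_seconds) for samples above the threshold."""
    rows = csv.reader(io.StringIO(csv_text))
    next(rows)  # skip the header row
    slow = []
    for row in rows:
        if len(row) < 2:
            continue
        try:
            latency = float(row[1])
        except ValueError:
            continue  # skip samples with no numeric value
        if latency > threshold:
            slow.append((row[0], latency))
    return slow

for stamp, latency in slow_samples(SAMPLE):
    print(f"{stamp}: {latency * 1000:.1f} ms")
```

If the slow samples line up with the crash times, that points toward I/O load rather than a failing disk, though it does not rule a hardware fault out.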
Can you move the VM to another host or move the storage to another location? This can be done without shutting down the VM.
ASKER
Dave, I will check with our level 3s whether that can be done.
So the host is running Windows Server 2012 Hyper-V? Can you run the manufacturer's disk tools or a SMART tool that will tell you? Guessing from inside the VM is pretty tough.
ASKER
Hi Aaron,
One thing we noticed was that the Datto backup agent was running at the same time as everyone else was working inside the environment. We paused the local backup agent and the server has stabilized, so it may be that something is overloading it. I think running that tool may be the next step. We're going to allow 24 hours to see if the server stays stable. Perhaps scheduling Datto backups after hours will fix the issue; however, incremental backups will not happen then. This is a real problem.
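To check the backup theory against the logs rather than by feel, the Event 11 timestamps can be compared with the backup job windows. A minimal Python sketch follows; the first event time is the one from the log in the question, while the second event and the backup window are hypothetical placeholders to be replaced with real values from the Datto job history.

```python
from datetime import datetime

# First entry is the SystemTime from the event above (UTC, truncated to seconds).
# Second entry is a hypothetical event added only for contrast.
EVENTS = [
    datetime(2020, 3, 30, 15, 0, 3),
    datetime(2020, 3, 30, 17, 30, 0),
]

# Hypothetical backup window (start, end) in the same time zone as EVENTS.
BACKUP_WINDOWS = [
    (datetime(2020, 3, 30, 14, 55, 0), datetime(2020, 3, 30, 15, 20, 0)),
]

def events_during_backups(events, windows):
    """Return the events whose timestamp falls inside any backup window."""
    return [
        e for e in events
        if any(start <= e <= end for start, end in windows)
    ]

hits = events_during_backups(EVENTS, BACKUP_WINDOWS)
print(f"{len(hits)} of {len(EVENTS)} disk errors occurred during a backup")
```

If most errors fall inside backup windows, that supports moving or throttling the backups. Keep the time zones consistent, since the event log reports SystemTime in UTC.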
Is the Datto agent running inside the VM or on the host?
ASKER
I believe both, because the host holds data alongside the VMs that are running.
Well, backups are very disk intensive, but you definitely found the culprit. Perhaps Datto can look at your config and see if they can tune it down.
ASKER CERTIFIED SOLUTION
Maybe there is a storage timeout caused by the overload.