Andrew N. Kowtalo


VM keeps crashing, rebooting, and coming back online

Hi All.  

This particular VM we are running is reporting HD controller failures. Since it is one of many VMs running off the same box, is there a way to determine whether the failure is the physical hard disk on the server or something else reporting? Here is the error.

The driver detected a controller error on \Device\Harddisk0\DR0.
The driver detected a controller error on \Device\Harddisk2\DR1
The driver detected a controller error on \Device\Harddisk2\DR2.


+ System

  - Provider

   [ Name]  disk
 
  - EventID 11

   [ Qualifiers]  49156
 
   Level 2
 
   Task 0
 
   Keywords 0x80000000000000
 
  - TimeCreated

   [ SystemTime]  2020-03-30T15:00:03.147781900Z
 
   EventRecordID 2202132
 
   Channel System
 
   Computer TMMSage.tmmontante.local
 
   Security
 

- EventData

   \Device\Harddisk0\DR0
   1004800001000000000000000B0004C0030100000000000000000000000000000000000000000000BA20000000000000FFFFFFFF060000005800072200000000E42000000201200000000000F000000000000000000000000000000000000000000000000000000080A78E1600E0FFFF0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000


--------------------------------------------------------------------------------

Binary data:


In Words

0000: 00800410 00000001 00000000 C004000B
0010: 00000103 00000000 00000000 00000000
0020: 00000000 00000000 000020BA 00000000
0030: FFFFFFFF 00000006 22070058 00000000
0040: 000020E4 00200102 00000000 000000F0
0050: 00000000 00000000 00000000 00000000
0060: 00000000 00000000 168EA780 FFFFE000
0070: 00000000 00000000 00000000 00000000
0080: 00000000 00000000 00000000 00000000
0090: 00000000 00000000 00000000 00000000
00a0: 00000000 00000000  


In Bytes

0000: 10 04 80 00 01 00 00 00   ..€.....
0008: 00 00 00 00 0B 00 04 C0   .......À
0010: 03 01 00 00 00 00 00 00   ........
0018: 00 00 00 00 00 00 00 00   ........
0020: 00 00 00 00 00 00 00 00   ........
0028: BA 20 00 00 00 00 00 00   º ......
0030: FF FF FF FF 06 00 00 00   ÿÿÿÿ....
0038: 58 00 07 22 00 00 00 00   X.."....
0040: E4 20 00 00 02 01 20 00   ä .... .
0048: 00 00 00 00 F0 00 00 00   ....ð...
0050: 00 00 00 00 00 00 00 00   ........
0058: 00 00 00 00 00 00 00 00   ........
0060: 00 00 00 00 00 00 00 00   ........
0068: 80 A7 8E 16 00 E0 FF FF   €§Ž..àÿÿ
0070: 00 00 00 00 00 00 00 00   ........
0078: 00 00 00 00 00 00 00 00   ........
0080: 00 00 00 00 00 00 00 00   ........
0088: 00 00 00 00 00 00 00 00   ........
0090: 00 00 00 00 00 00 00 00   ........
0098: 00 00 00 00 00 00 00 00   ........
00a0: 00 00 00 00 00 00 00 00   ........


Not sure if there is anything that we can see within the VM itself or, if it is a physical failure, what the cause is. Currently it's mid-workday so we have no way to shut this server down right now, but it keeps crashing, rebooting, then coming back up, kicking multiple users off Sage. This is running the older 2012 R2. If you need anything else, let me know.

Any assistance is appreciated
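
For anyone trying to read the binary data in the event above: the sketch below unpacks the first 40 bytes of the dump, assuming they follow the layout of the `IO_ERROR_LOG_PACKET` structure from the Windows driver headers. That layout and the field names are an assumption on my part, not something stated in the post. It only shows that the low 16 bits of the error code line up with the EventID 11 in the log; by itself it cannot tell whether the fault is in the guest, the host, or the SAN.

```python
import struct

# First 0x28 bytes of the "In Bytes" dump from the event above.
HEADER_HEX = """
10 04 80 00 01 00 00 00
00 00 00 00 0B 00 04 C0
03 01 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
"""

# Assumed field order (IO_ERROR_LOG_PACKET up to, but not including, DumpData).
FIELDS = (
    "MajorFunctionCode", "RetryCount", "DumpDataSize", "NumberOfStrings",
    "StringOffset", "EventCategory", "ErrorCode", "UniqueErrorValue",
    "FinalStatus", "SequenceNumber", "IoControlCode", "DeviceOffset",
)
# Little-endian; "2x" skips the two alignment bytes before ErrorCode.
LAYOUT = struct.Struct("<BBHHHH2xIIIIIq")


def decode_header(hex_text):
    """Unpack the fixed-size header from a whitespace-separated hex dump."""
    raw = bytes.fromhex("".join(hex_text.split()))
    return dict(zip(FIELDS, LAYOUT.unpack_from(raw)))


header = decode_header(HEADER_HEX)
for name, value in header.items():
    print(f"{name:18} 0x{value:X}")

# The low 16 bits of ErrorCode are the event ID shown in the log.
print("Event ID from ErrorCode:", header["ErrorCode"] & 0xFFFF)
```

The rest of the dump (from offset 0x28 onward) is driver-specific data and is not interpreted here.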
Seth Simmons

what do you have for storage?  spinning disk?  raid?  how many other guests are running on the host?  is there a lot of i/o load from other guests?
maybe there is some storage timeout because of being overloaded
Andrew N. Kowtalo

ASKER

Hi Seth let me speak with the engineer I will get back to you shortly.
Sam, it's a SAN with spinning disks running 4 VMs in an RDS environment. We are not sure about the I/O load. There are a lot of users running Sage along with Timberscan within the environment as well. Where could we see if an overload is happening?
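
One way to see whether an overload is happening is to log the disk latency counters on the host and inside the guest (for example `\PhysicalDisk(*)\Avg. Disk sec/Transfer`, collected to CSV with `typeperf` or Performance Monitor) and look for spikes while the crashes occur. The sketch below is a minimal, hypothetical helper that scans such a CSV export. The function name, the 25 ms threshold, and the sample rows are illustrative assumptions, not measurements from this server.

```python
import csv
import io


def flag_slow_samples(csv_text, threshold_s=0.025):
    """Return (timestamp, counter, seconds) for every sample above threshold_s.

    Expects a header row of counter names, then one row per sample with the
    timestamp in the first column.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return []
    header, samples = rows[0], rows[1:]
    slow = []
    for row in samples:
        if not row:
            continue
        timestamp = row[0]
        for counter, cell in zip(header[1:], row[1:]):
            try:
                value = float(cell)
            except ValueError:
                continue  # blank or non-numeric cell, skip it
            if value > threshold_s:
                slow.append((timestamp, counter, value))
    return slow


# Made-up sample rows, only to show the expected shape of the input.
SAMPLE = r'''"(PDH-CSV 4.0)","\\HOST\PhysicalDisk(0 C:)\Avg. Disk sec/Transfer"
"03/30/2020 11:00:00.000","0.004"
"03/30/2020 11:00:05.000","0.310"
"03/30/2020 11:00:10.000"," "
'''

for timestamp, counter, seconds in flag_slow_samples(SAMPLE):
    print(f"{timestamp}  {counter}  {seconds * 1000:.0f} ms")
```

If the slow samples cluster around the backup window or around the Event ID 11 timestamps, that points toward storage contention rather than a failing disk, though it would not rule out hardware on its own.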
can you move the vm to another host or move the storage to another location?  This can be done without shutting down the vm
Dave I will check with our level 3s if that can be done.   
so the host is running windows server 2012 hyper-v? can you run the manufacturer's disk tools or some SMART tool on the host that will tell you? guessing from inside the VM is pretty tough.
Hi Aaron,

One thing we noticed was that the Datto backup agent was running at the same time as everyone else was working inside the environment. We paused the local backup agent and the server has stabilized, so it may be that something is overloading it. I think running that tool may be the next step. We're going to allow 24 hours to see if the server stays stable. Perhaps scheduling Datto backups after hours will fix the issue; however, incremental backups would then not happen. This is a real problem.
is Datto agent running inside the vm or on the host?
I believe both.   Because the host has data along with the VM's that are running.
well backups are super disk intensive, but you definitely found the culprit. perhaps datto can look at your config and see if they can tune it down or something.
ASKER CERTIFIED SOLUTION
Andrew N. Kowtalo
