Rick_Penney
asked on
Hyper-V server intermittently unreachable via RDP, Dameware etc.
We have an HP DL360 Gen9 running Hyper-V on Windows Server 2012 R2. Roughly once or twice a month the server becomes unreachable via remote software (RDP or Dameware).
PRTG monitoring shows all sensors as red, but the server still responds to ping.
During these occurrences, all VM guests remain fully accessible and function normally.
Using the server's iLO remote console, we can see the Welcome screen where you would press Ctrl+Alt+Delete, but we are unable to send any remote hot keys or send Ctrl+Alt+Delete from the iLO menus.
The only option we have is to remote onto the VM guests and power them down, and then reboot the physical server via the iLO.
We have another two identical servers at other sites that are on the same latest SPP and configured the same way, and they are unaffected, so drivers/firmware don't seem to be the issue.
PRTG monitoring usually first flags the issue at around 03:00, and the only thing doing any work at that time is Veritas Backup Exec.
The Windows event log for that time frame lists Event ID 3: Filter Manager failed to attach to volume \Device\Harddisk2\DR183. The final status was 0xC3A001C.
After a reboot, once the server comes back up, everything works, but we still get the filtering error.
The Event ID 3 may be a red herring, as we also get the event on the other two working servers.
regards, Rick
If the unresponsiveness starts at 03:00, when Backup Exec (BUE) runs, then that correlation points to the culprit.
Veeam is free for 10 VMs or less. I suggest going that route if possible.
ASKER
Thanks Philip, we did previously search the Veritas forums with no joy, but I may delete the backup jobs and create some new ones from scratch. I will also log a ticket with Veritas.
Thanks for the heads up ref Veeam, I'll look into that tomorrow
Very much appreciated
Rick
One of the first things to do is to change the schedule. If the lock-up follows the schedule change, things are pretty clear.
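One way to judge whether the lock-ups follow the backup job, before and after moving it, is to compare the times PRTG first flagged the host against the backup window. This is only a minimal sketch in Python: it assumes the outage timestamps are copied out of PRTG by hand, and the function names, sample timestamps and the 03:00 to 05:00 window are hypothetical, not taken from this server's logs.

```python
from datetime import datetime, time


def in_window(ts: datetime, start: time, end: time) -> bool:
    """Return True if ts falls inside the daily window [start, end)."""
    t = ts.time()
    if start <= end:
        return start <= t < end
    # Window wraps past midnight, e.g. 23:00 to 02:00.
    return t >= start or t < end


def share_in_window(outages, start, end) -> float:
    """Fraction of outage timestamps that fall inside the window."""
    if not outages:
        return 0.0
    hits = sum(1 for ts in outages if in_window(ts, start, end))
    return hits / len(outages)


# Hypothetical sample data, for illustration only.
outages = [
    datetime(2019, 1, 14, 3, 5),
    datetime(2019, 2, 9, 3, 12),
    datetime(2019, 3, 2, 14, 40),
]
backup_start, backup_end = time(3, 0), time(5, 0)
print(share_in_window(outages, backup_start, backup_end))
```

If the share stays high after the job is moved to a new window (and drops for the old one), that supports the backup job as the trigger. With only one or two lock-ups a month it will take a while before the numbers mean much.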
ASKER
Yes, makes sense, good shout.
kind regards
Rick
Double and triple check that you are running the latest NIC firmware and drivers from HPE. The Microsoft NIC driver WILL cause these problems.
Also be sure that VMQ is disabled on 1Gb NICs.
https://www.dell.com/support/article/us/en/04/sln132131/windows-server-slow-network-performance-on-hyper-v-virtual-machines-with-virtual-machine-queue-vmq-enabled?lang=en
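For anyone wanting to check the VMQ state across several hosts, here is a rough Python sketch that wraps the `Get-NetAdapterVmq` and `Disable-NetAdapterVmq` PowerShell cmdlets. It assumes Python 3.7+ on the Hyper-V host and an elevated prompt; the helper names and the sample adapter names (`NIC1`, `NIC2`) are hypothetical. It only prints the disable commands rather than running them, so nothing changes until you run them yourself.

```python
import json
import subprocess

# PowerShell query: one JSON record per adapter with its VMQ state.
QUERY = "Get-NetAdapterVmq | Select-Object Name, Enabled | ConvertTo-Json"


def parse_vmq(json_text):
    """Return the names of adapters that still have VMQ enabled."""
    data = json.loads(json_text) if json_text.strip() else []
    if isinstance(data, dict):
        # ConvertTo-Json emits a bare object when there is only one adapter.
        data = [data]
    return [a["Name"] for a in data if a.get("Enabled")]


def disable_command(name):
    """Build the PowerShell command that turns VMQ off for one adapter."""
    return 'Disable-NetAdapterVmq -Name "{}"'.format(name)


def query_host():
    """Run the query on the local host (Windows only, elevated prompt)."""
    out = subprocess.run(
        ["powershell.exe", "-NoProfile", "-Command", QUERY],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_vmq(out)


if __name__ == "__main__":
    # Hypothetical sample output, so the sketch also runs off-host.
    sample = '[{"Name": "NIC1", "Enabled": true}, {"Name": "NIC2", "Enabled": false}]'
    for nic in parse_vmq(sample):
        print(disable_command(nic))
```

On a real host, swap `parse_vmq(sample)` for `query_host()` and review the printed commands before running them in PowerShell.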
ASKER
Thanks guys for all your comments. @kevinhsieh, the NIC drivers (HPE Ethernet 1Gb 4-port 331i) are from HP's latest SPP but are dated 01/08/2018. I'll see if I can find a later driver.
With ref to VMQ, it was already set to disabled on the NICs; not a setting I was aware of, so thanks for the info.
I'll report back in a couple of weeks ref the drivers and rescheduling Backup Exec. I'll also take a look into Veeam.
kindest regards
VMQ and more in my articles here:
Some Hyper-V Hardware and Software Best Practices
Practical Hyper-V Performance Expectations
ASKER
You're very kind, thanks for all your help with this. I'll check the links out today.
Not sure if I can mark your comments as "My Solution" or joint solution as I initially set this for the first Expert who replied. I'll give it a go.
ASKER
I guess I won't know for a month, so fingers crossed.
kind regards
Rick