Rick_Penney

Hyper-V server intermittently unreachable via RDP, Dameware, etc.

We have an HP DL360 Gen9 running Hyper-V on Windows Server 2012 R2. Approximately once or twice a month, the server becomes unreachable via remote software (RDP or Dameware).
PRTG monitoring shows all sensors as red. The server still responds to ping.
During this occurrence, all VM Guests remain fully accessible and function normally.
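
To narrow down which host services stop answering while ping still works, a small TCP probe can be run from another machine during an outage. This is only a minimal sketch: the function names are hypothetical, and the ports are the standard defaults for RDP (3389), SMB (445) and WinRM over HTTP (5985), which may differ on a given server.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def probe(host: str, ports: dict) -> dict:
    """Map each service name to whether its TCP port accepted a connection."""
    return {name: tcp_port_open(host, port) for name, port in ports.items()}


# Assumed default ports; adjust to the actual configuration.
DEFAULT_PORTS = {"RDP": 3389, "SMB": 445, "WinRM": 5985}
```

Calling `probe("hyperv-host", DEFAULT_PORTS)` (with a placeholder host name) returns one True/False entry per service. If every port is closed while ICMP still answers, that would suggest the host OS itself is hung rather than a single service.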

Using the server's iLO remote console, we can see the Welcome screen where you would press Ctrl+Alt+Delete, but we are unable to send any remote hot keys or send Ctrl+Alt+Delete from the iLO menus.

The only option we have is to remote onto the VM guests and power them down, and then reboot the physical server via the iLO.

We have another two identical servers at other sites that are on the same latest SPP and configured the same way, and they are unaffected, so drivers/firmware don't seem to be the issue.

PRTG monitoring usually first flags an issue at around 03:00. The only thing doing any work at that time is Veritas Backup Exec.
The Windows event log for that time frame lists Event ID 3: Filter manager failed to attach to volume \Device\Harddisk2\DR183. The final status was 0xC3A001C

After a reboot, once the server comes back up, everything works, but we still get the Filter Manager error.

The Event ID 3 mentioned above may be a red herring, as we also get the event on the other two working servers.

regards, Rick
ASKER CERTIFIED SOLUTION
Michael Pfister
Rick_Penney

ASKER

Thanks Michael, very much appreciated. I'll apply the changes now and schedule a reboot tonight.
I guess I won't know for a month, so fingers crossed.
kind regards
Rick
If the unresponsiveness starts at 0300 when BUE runs then there's a correlation there that points to the culprit.

Veeam is free for 10 VMs or fewer. I suggest going that route if possible.
Thanks Philip, we did previously search the Veritas forums with no joy, but I may delete the backup jobs and create some new ones from scratch. I will also log a ticket with Veritas.
Thanks for the heads-up regarding Veeam, I'll look into that tomorrow.
Very much appreciated
Rick
One of the first things to do is to change the schedule. If the lock-up follows the schedule change, things are pretty clear.
Yes, makes sense. Good shout.
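
As a rough way to check whether lock-ups track the backup schedule, the times at which monitoring first went red can be compared against the job window. The sketch below is illustrative only: the function names are hypothetical, and the timestamps would have to come from the real PRTG history.

```python
from datetime import datetime, time


def in_window(ts: datetime, start: time, end: time) -> bool:
    """Return True if ts falls inside the daily window [start, end)."""
    t = ts.time()
    if start <= end:
        return start <= t < end
    # Window wraps past midnight, e.g. 23:00 to 01:00.
    return t >= start or t < end


def share_in_window(outages: list, start: time, end: time) -> float:
    """Fraction of outage timestamps that fall inside the backup window."""
    if not outages:
        return 0.0
    hits = sum(in_window(ts, start, end) for ts in outages)
    return hits / len(outages)
```

If nearly all outages fall inside the window before the schedule change and then move with the new window afterwards, the backup job is the likely trigger. If they stay at 03:00, something else running at that time is worth a look.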
kind regards
Rick
Double and triple check that you are running the latest NIC firmware and drivers from HPE. The Microsoft NIC driver WILL cause these problems.

Also be sure that VMQ is disabled on 1Gb NICs.
https://www.dell.com/support/article/us/en/04/sln132131/windows-server-slow-network-performance-on-hyper-v-virtual-machines-with-virtual-machine-queue-vmq-enabled?lang=en
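
To audit the VMQ setting across several hosts, the output of the PowerShell cmdlet `Get-NetAdapterVmq` can be exported as JSON (for example with `Get-NetAdapterVmq | Select-Object Name, Enabled | ConvertTo-Json`) and checked with a short script. This is a sketch under the assumption that the export contains `Name` and `Enabled` fields as above; the function name is hypothetical.

```python
import json


def adapters_with_vmq_enabled(json_text: str) -> list:
    """Return the names of adapters whose exported VMQ state is enabled."""
    data = json.loads(json_text)
    # ConvertTo-Json emits a bare object, not a list, for a single adapter.
    if isinstance(data, dict):
        data = [data]
    return [adapter["Name"] for adapter in data if adapter.get("Enabled")]
```

An empty result means no adapter in the export has VMQ enabled. Any 1Gb NIC that does show up can then be switched off on the host itself.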
Thanks, guys, for all your comments. @kevinhsieh, the NIC drivers (HPE Ethernet 1Gb 4-port 331i) are from HP's latest SPP but are dated 01/08/2018. I'll see if I can find a later driver.
Regarding VMQ, it was already set to disabled on the NICs. It's not a setting I was aware of, so thanks for the info.
I'll report back in a couple of weeks regarding the drivers and rescheduling Backup Exec. I'll also take a look into Veeam.
kindest regards
You're very kind, thanks for all your help with this. I'll check the links out today.
Not sure if I can mark your comments as "My Solution" or joint solution as I initially set this for the first Expert who replied. I'll give it a go.