bullfrog264

Hyper-V Host 2012 R2 Crashes on Cisco UCS B series blade

I am experiencing instability on one of my Hyper-V hosts running 2012 R2 on a Cisco UCS B200 M4 (firmware 2.2.3d).  It accesses an SMB 3.0 (CIFS) share on a NetApp 8040 in Clustered mode and also has access to an iSCSI CSV volume on the same NetApp 8040.  Normally I don't see any stability issues, although the performance is much slower than I think it should be, which is a whole separate post.  I first noticed the issue when performing a Citrix deployment: the mirrored SQL servers kept failing over to the mirror in our DR site for no apparent reason.  The problem became apparent when I tried to copy a large file from one folder to another on the same VHDX.

1.  I opened an RDP session to VM1, which is a 2012 R2 guest OS with all of its VHDX files residing on the same volume (SMB share).
2.  From that remote desktop session, I copied two VHD files from e:\Vhds to e:\backup on the same server.
3.  Shortly into the file copy, the RDP session dropped.  Upon further investigation I discovered the host had failed and all VMs had been failed over to the other host in the cluster.  I checked network connectivity to the host, but all network interfaces on the host were down.
4.  I opened the KVM on Cisco UCS and verified the server was running, but all of the network adapters were showing disconnected or network unavailable.  I was forced to reboot the host, and it came up properly.
5.  I reviewed the Windows logs but didn't see any error indicating why the NICs had become disconnected.  UCS showed that ALL of the VICs associated with the blade became disconnected at the same time.
6.  I verified I have the correct Cisco drivers installed on the host for my firmware version.
7.  I was able to repeat the issue on all of my Hyper-V hosts in the blade chassis when the storage was on SMB.
8.  I moved the storage to iSCSI and the issue went away, although the performance was not satisfactory.
9.  I have a very similar setup at my DR site, and while the performance again is not great, I am unable to repeat the issue there.
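For anyone wanting to repeat the copy in step 2 in a measurable way, here is a minimal sketch in Python. It is not what was run in the original test (that was a plain file copy over RDP); it only copies one file in chunks and reports throughput, so the same run can be compared with the storage on SMB and on iSCSI. The function name `timed_copy` and the file names in the usage comment are hypothetical.

```python
import os
import time


def timed_copy(src, dst, chunk_size=1024 * 1024):
    """Copy src to dst in chunks; return (bytes_copied, seconds, MiB_per_second)."""
    copied = 0
    start = time.perf_counter()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
            copied += len(chunk)
        # Force the data out to the volume so the timing includes the real write.
        fout.flush()
        os.fsync(fout.fileno())
    elapsed = time.perf_counter() - start
    rate = (copied / (1024 * 1024)) / elapsed if elapsed > 0 else float("inf")
    return copied, elapsed, rate


# Example call (file names are placeholders, not from the original post):
# copied, seconds, mib_s = timed_copy(r"e:\Vhds\example.vhd", r"e:\backup\example.vhd")
# print(f"{copied} bytes in {seconds:.1f} s ({mib_s:.1f} MiB/s)")
```

Running this inside the guest against a large file should generate the same kind of same-volume read/write load as the manual copy, with a number attached to it.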

Both sites are running the same firmware and the same patch level on the hosts.  I am fairly certain my performance issues will be tracked down to a misconfiguration, but I don't believe the host crashing is.  I just don't know where to go from this point.  I am preparing to open a case with NetApp and Cisco.  I verified the configuration for the SMB share on the NetApp is identical in both locations and follows their best practices for Hyper-V SMB shares.  I haven't been able to find anything on Google or the Cisco support forums yet.  I would appreciate any suggestions.
Philip Elder

Is there a DMP file in C:\Windows or C:\Windows\MiniDump ?
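One way to check this quickly is sketched below in Python. It lists any .dmp files in the given folders along with their modification times and flags those written within a chosen window of the incident. The function name `find_dumps` and the 60-minute default window are assumptions for illustration; the folders in the usage comment are the ones named above.

```python
import datetime
import os


def find_dumps(dirs, incident, window_minutes=60):
    """Return (path, modified_time, near_incident) for each .dmp file, oldest first."""
    window = datetime.timedelta(minutes=window_minutes)
    results = []
    for d in dirs:
        if not os.path.isdir(d):
            continue  # skip folders that do not exist on this host
        for name in os.listdir(d):
            if not name.lower().endswith(".dmp"):
                continue
            path = os.path.join(d, name)
            mtime = datetime.datetime.fromtimestamp(os.path.getmtime(path))
            # Flag dumps written close to the time the host dropped off the network.
            results.append((path, mtime, abs(mtime - incident) <= window))
    return sorted(results, key=lambda r: r[1])


# Example call (substitute the actual incident time in local time):
# incident = datetime.datetime(year, month, day, hour, minute)
# for path, mtime, near in find_dumps([r"C:\Windows", r"C:\Windows\MiniDump"], incident):
#     print(mtime, "NEAR INCIDENT" if near else "", path)
```

If nothing is flagged, the host most likely did not bugcheck at the time of the outage.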
bullfrog264

ASKER

There was a minidump and dump file but they didn't correspond with the date or time of the incident.
I suggest reseating the node. Make sure it is properly seated in its slot. Slim chance there.

If the "physical" ports to the node are going offline is there a configuration issue going on in the chassis networking setup?
SOLUTION
bullfrog264

ASKER CERTIFIED SOLUTION
I finally resolved the issue by disabling settings based on Cisco TAC recommendations.