Hyper-V Host 2012 R2 Crashes on Cisco UCS B series blade

I am experiencing instability with one of my Hyper-V hosts running 2012 R2 on a Cisco UCS B200 M4 (firmware 2.2.3d).  It accesses an SMB 3.0 (CIFS) share on a NetApp 8040 in clustered mode and also has access to an iSCSI CSV volume on the same NetApp 8040.  Normally I don't see any stability issues, although the performance is much slower than I think it should be, which is a whole separate post.  I first noticed the issue while performing a Citrix deployment: the mirrored SQL servers kept failing over to the mirror in our DR site for no apparent reason.  The problem became apparent when trying to copy a large file from one folder to another on the same VHDX.

1.  I opened an RDP session to VM1, which is a 2012 R2 guest OS with all of its VHDX files residing on the same volume (SMB share).
2.  I copied two VHD files from e:\Vhds to e:\backup on the same server from a remote desktop session.
3.  Shortly into the file copy the RDP session dropped.  Upon further investigation I discovered the host had failed and all VMs had been failed over to the other host in the cluster.  I checked network connectivity to the host but all network interfaces on the host were down.
4.  I opened the KVM on Cisco UCS and verified the server was running but all of the network adapters were showing disconnected or network unavailable.  I was forced to reboot the host and it came up properly.
5.  I reviewed the Windows logs but didn't see any error indicating why the NICs had become disconnected.  UCS showed that ALL of the VICs associated with the blade became disconnected at the same time.
6.  I verified I have the correct Cisco Drivers installed on the host for my firmware version.
7.  I was able to repeat the issue on all of my Hyper-V hosts in the blade chassis when the storage was on SMB.
8.  I moved the storage to iSCSI and the issue went away, although the performance was not satisfactory.
9.  I have a very similar setup at my DR site, and while the performance there is again not great, I am unable to repeat the issue.

Both sites are running the same firmware and the same patch level on the hosts.  I am fairly certain my performance issues will be tracked down to a misconfiguration, but I don't believe the host crash will be.  I just don't know where to go from this point.  I am preparing to open a case with NetApp and Cisco.  I verified that the configuration for the SMB share on the NetApp is identical in both locations and follows their best practices for Hyper-V SMB shares.  I haven't been able to find anything on Google or the Cisco support forums yet.  I would appreciate any suggestions.
Asked by bullfrog264
Philip Elder (Technical Architect - HA/Compute/Storage) Commented:
Is there a DMP file in C:\Windows or C:\Windows\MiniDump?
bullfrog264 (Author) Commented:
There were a minidump and a dump file, but neither corresponded with the date or time of the incident.
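For anyone repeating this check, here is a rough Python sketch of it, not something taken from the thread. It lists `.dmp` files in the two folders mentioned above and keeps those modified within a window around the incident. The 30-minute window and the command-line timestamp argument are assumptions for illustration only.

```python
import sys
from datetime import datetime, timedelta
from pathlib import Path

# Folders named in the thread; adjust if the dump location was changed.
DUMP_DIRS = [Path(r"C:\Windows"), Path(r"C:\Windows\Minidump")]


def find_dumps(directories):
    """Return (path, modified time) pairs for .dmp files, oldest first."""
    found = []
    for directory in directories:
        if not directory.is_dir():
            continue  # folder missing on this host, nothing to list
        for entry in directory.iterdir():
            if entry.is_file() and entry.suffix.lower() == ".dmp":
                modified = datetime.fromtimestamp(entry.stat().st_mtime)
                found.append((entry, modified))
    return sorted(found, key=lambda item: item[1])


def dumps_near(dumps, incident, window=timedelta(minutes=30)):
    """Keep only dumps modified within `window` of the incident time."""
    return [(p, t) for p, t in dumps if abs(t - incident) <= window]


if __name__ == "__main__":
    # Hypothetical usage: python find_dumps.py 2015-03-01T14:30:00
    incident_time = datetime.fromisoformat(sys.argv[1])
    all_dumps = find_dumps(DUMP_DIRS)
    for path, modified in all_dumps:
        print(f"{modified:%Y-%m-%d %H:%M:%S}  {path}")
    matches = dumps_near(all_dumps, incident_time)
    print(f"{len(matches)} dump(s) within the window of {incident_time}")
```

If nothing falls inside the window, as in this case, the host most likely did not bugcheck at the time of the outage, which fits the KVM showing the server still running with its adapters disconnected.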
Philip Elder (Technical Architect - HA/Compute/Storage) Commented:
I suggest reseating the node and making sure it is properly seated in its slot, though there is only a slim chance that is the cause.

If the "physical" ports to the node are going offline, is there a configuration issue in the chassis networking setup?
bullfrog264 (Author) Commented:
I finally opened a case with Cisco TAC after discussing the issue with my NetApp sales engineer.  Cisco TAC told me it was a driver issue: an incompatibility with the TCP Segmentation Offload option on the Service Profile for the host.  I temporarily disabled TCP Segmentation Offload, which did resolve the issue.  I then upgraded the driver to a later version and tested again with TCP Segmentation Offload enabled, but had the same issue.  For now I have a workaround, but I am not satisfied.  I would think TCP Segmentation Offload benefits CPU performance enough that I would not want it disabled.
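To put the CPU concern in rough numbers, here is a back-of-the-envelope Python sketch that is not from Cisco or the thread. It compares how many TCP segments the host stack must build itself with the offload disabled against how many large buffers it hands to the adapter with it enabled. The 1460-byte MSS (1500-byte MTU, no TCP options), the 64 KB offload buffer, and the 10 GiB file size are all assumptions chosen for illustration.

```python
def ceil_div(a, b):
    """Integer ceiling division without floating point."""
    return -(-a // b)


def segments_needed(payload_bytes, mss=1460):
    """Number of TCP segments required to carry payload_bytes."""
    return ceil_div(payload_bytes, mss)


def host_work(file_bytes, offload_buffer=64 * 1024, mss=1460):
    """Return (units built by the host with offload, without offload).

    With segmentation offload the host passes large buffers to the NIC,
    which splits them; without it the host builds every segment itself.
    """
    with_offload = ceil_div(file_bytes, offload_buffer)
    without_offload = segments_needed(file_bytes, mss)
    return with_offload, without_offload


if __name__ == "__main__":
    file_bytes = 10 * 1024**3  # hypothetical 10 GiB copy
    with_tso, without_tso = host_work(file_bytes)
    print(f"segments per 64 KB buffer: {segments_needed(64 * 1024)}")
    print(f"host-built units with offload:    {with_tso}")
    print(f"host-built units without offload: {without_tso}")
```

Under these assumptions each 64 KB buffer becomes 45 segments, so disabling the offload moves that per-segment work from the adapter back onto the host CPU. Whether this is noticeable on a given host depends on the workload and has to be measured.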
Philip Elder (Technical Architect - HA/Compute/Storage) Commented:
A firmware and driver update would probably be the key to resolving the issue. That's up to the manufacturer to make happen though.
bullfrog264 (Author) Commented:
I finally resolved the issue by disabling settings based on Cisco TAC recommendations.