Connectivity "blips" on a Server 2012 R2 Terminal Server

Asked by Brandon Bazemore
Tags: Windows Server 2012, Dell

Hi there! We've been troubleshooting this issue for months without much success, so we wanted to put it out there to people smarter than us. ;)

Environment:
Dell C6100 XS23 (4 nodes)
The nodes host one file server, whose share holds the users' redirected folders, and two terminal server session hosts.

Issue:
Users will notice (multiple times a day) that:
-Desktop will flicker (icons go away and back quickly)
-File shares will have to be refreshed to see updates (so if another user saves a file, it won't just "appear" for the rest of the users viewing that folder)
-Chrome (which is using redirected data folders) will crash.

The only logged errors we see are delayed write failures. Combined with the symptoms above, that suggests to us an intermittent loss of connectivity between the session hosts and the file server (a simple Windows file server, with no clustering, DFS, or anything else). Easy, right?
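
One way to test the connectivity theory independently of the event log would be to time small writes to the share from a session host and record any stalls. The following is only a minimal sketch under stated assumptions: the function names (`probe_share`, `find_stalls`, `run_probe`), the 4 KB payload, and the one-second threshold are all illustrative choices, not anything from our environment.

```python
import os
import time
import uuid


def probe_share(directory, payload=b"x" * 4096):
    """Write, flush, read back, and delete a small file; return elapsed seconds."""
    # Hypothetical probe file name; a random suffix avoids collisions between runs.
    path = os.path.join(directory, f"probe-{uuid.uuid4().hex}.tmp")
    start = time.perf_counter()
    try:
        with open(path, "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())  # ask the OS to commit the write, not just cache it
        with open(path, "rb") as handle:
            if handle.read() != payload:
                raise OSError("read-back mismatch")
    finally:
        if os.path.exists(path):
            os.remove(path)
    return time.perf_counter() - start


def find_stalls(samples, threshold_seconds=1.0):
    """Return the (timestamp, elapsed) samples that failed or exceeded the threshold."""
    return [(stamp, elapsed) for stamp, elapsed in samples
            if elapsed is None or elapsed > threshold_seconds]


def run_probe(directory, interval_seconds=5.0, iterations=10):
    """Probe the directory repeatedly; a failed probe is recorded as elapsed=None."""
    samples = []
    for _ in range(iterations):
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        try:
            elapsed = probe_share(directory)
        except OSError:
            elapsed = None
        samples.append((stamp, elapsed))
        time.sleep(interval_seconds)
    return samples
```

Pointing `run_probe` at a test folder on the file share (by UNC path) and passing the result to `find_stalls` would give timestamps to compare against the flicker reports and the delayed write events.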

Here is what we've tried:
-turning off VMQ on the base NIC, the Hyper-V switch, and the host switch
-turning off SR-IOV on all
-running both the file server and one of the session hosts on the same physical node while running the other session host on a separate node.  No change here, which eliminated most of our network suspicions, as those two should literally be communicating only over the Hyper-V vSwitch.
-disabling LACP and using a single connection
-upgrading firmware on the Broadcom NICs (there was a problem on previous firmware, but it was a FULL disconnect requiring reboot).
-Looked at KB 2842111 - we don't see any massive number of handles, so didn't apply the fix.
-Looked at KB 2878182 - we don't see any non-responsive threads, so didn't apply.
-The actual event ID of the delayed write error is:
Event ID 50 / Source: MUP or mrxsmb (some of both)
  " {Delayed Write Failed} Windows was unable to save all the data for the file \;H:00000000387e736f\data\Users\iamauserexample .. nts\Chrome_Settings\CrashpadMetrics-active.pma. The data has been lost. This error may be caused by a failure of your computer hardware or network connection. Please try to save this file elsewhere."
-no corresponding errors are logged on the file server at the same times
-no corresponding errors on the Hyper-V host
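
For anyone wanting to repeat the "no corresponding errors" check while allowing for clock skew between machines, the comparison can be sketched as below. This is an assumption-laden illustration, not how we actually did it: it assumes the event timestamps have already been exported as one `YYYY-MM-DD HH:MM:SS` string per line, and the helper names and the 30-second window are hypothetical.

```python
from datetime import datetime, timedelta


def parse_times(lines, fmt="%Y-%m-%d %H:%M:%S"):
    """Parse one timestamp per line, skipping blank lines (format is an assumption)."""
    return [datetime.strptime(line.strip(), fmt) for line in lines if line.strip()]


def unmatched_events(client_times, other_times, window_seconds=30):
    """Return client timestamps with no event on the other machine within the window."""
    window = timedelta(seconds=window_seconds)
    return [stamp for stamp in client_times
            if not any(abs(stamp - other) <= window for other in other_times)]
```

If every Event ID 50 timestamp from a session host comes back unmatched against both the file server and the host, that is consistent with what we observed by eye.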


Interestingly, they previously had a Server 2008 instance which did not have the same issue. Combined with all of the above, this leads us to believe it may be an issue with Server 2012 R2.