Solved

Win 2008 R2 File Server Latency Issues

Posted on 2014-01-07
Last Modified: 2014-01-08
First, let me say that this issue has yet to produce an event in Event Viewer.

We have a single Windows 2008 R2 file server (a VM on ESXi 5.1 in a redundant VSA environment) that has a problem.  It has only happened 7 times since May 2013, but it is a huge concern because it halts the entire company.  It also doesn't affect any of the other 8 VMs we have in the VSA configuration or the 7 VMs in vCenter (not in VSA).

The issue is that the file server slowly comes to a halt.  The latency starts off affecting only a few users and then escalates to the point where the server is unresponsive at the console, but will still service remote requests after 3-5 minutes.  It takes anywhere from 3-5 hours to go from the first symptoms to bringing the company to a grinding halt. (The file server is very important to us.)

A reboot of the server immediately fixes the problem. However, we also have folder redirection turned on (stored on this server) for AppData Roaming, Desktop, and Favorites, so a reboot of the file server also requires a reboot of all the users' workstations, about 60 or so.

The file server has only File Services and FSRM installed, but the problem was occurring before FSRM was added.

Management wants an explanation and a resolution, and I basically have no idea where to start.  There are no logs, no events, and we currently have no third-party monitoring tool that would record these incidents for review.

VMware ESXi reports no unusual service request times for disk I/O, network, CPU, or RAM usage on the machine during these times.
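Since nothing is currently recording these incidents, one cheap stopgap is a small probe run from a workstation on a schedule, which times a file round trip against the share and appends the result to a CSV. The sketch below is only an illustration under assumptions: the function names, the probe file name, and the 4 KB payload are all made up for the example, and the read may be served from the client cache rather than the server.

```python
import csv
import os
import time
from datetime import datetime


def probe_share(directory, payload=b"x" * 4096):
    """Time one small write/read/delete round trip in `directory`, in seconds."""
    # Hypothetical probe file name; the PID keeps parallel probes apart.
    path = os.path.join(directory, "latency_probe_%d.tmp" % os.getpid())
    start = time.perf_counter()
    with open(path, "wb") as handle:
        handle.write(payload)
        handle.flush()
        os.fsync(handle.fileno())  # ask the OS to flush instead of buffering
    with open(path, "rb") as handle:
        handle.read()
    os.remove(path)
    return time.perf_counter() - start


def log_probe(directory, log_path):
    """Append a timestamped latency sample to a CSV log and return it."""
    elapsed = probe_share(directory)
    with open(log_path, "a", newline="") as handle:
        csv.writer(handle).writerow([datetime.now().isoformat(), "%.4f" % elapsed])
    return elapsed
```

Calling `log_probe` every minute or so against a folder on the file server would at least give a timeline showing when the slowdown starts, which could then be lined up against snapshot and backup schedules.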

Time of day has been anywhere during working hours, morning, afternoon, and right before leaving.

In addition, if I wanted to build a new file server from scratch, could I create a new VM, attach the existing vmdk files to it as disks, and boot into Windows with all the permissions and drive space intact, without having to perform a restore of any kind? (I suspect Windows will want to format those disks in Disk Management, but I'm not sure.)

Any suggestions on where to move next?
Question by:PriorityResearch
7 Comments
 
LVL 3

Accepted Solution

by:
WiReDWolf earned 500 total points
ID: 39763741
I've seen this behaviour when the Volume Shadow Copy Service (VSS) hangs while trying to take a snapshot.  Do you take snapshots of your data during the day?
 

Author Comment

by:PriorityResearch
ID: 39763783
I believe AppAssure uses VSS to take snapshots every hour.  Our AppAssure server resides on a separate host with separate disk I/O, but it is still within the vCenter environment.

AppAssure has been present since we installed the VSA environment.

What tends to cause the hang?

Any ideas on how to turn on logging to see if that is the issue, or on preventive steps to stop it from rearing its ugly head again?

Edit: Does Previous Versions also use VSS?
 
LVL 3

Expert Comment

by:WiReDWolf
ID: 39763915
Previous Versions also uses VSS, and I've found that VSS doesn't play nice with multiple partners.

I have a terminal server that exhibits identical behaviour to yours.  Every so often it will develop a resource leak and eventually choke itself out to the point that the server is not functional.  Because the leak is gradual, the server never logs anything until it gets to the point that it can't log anything.  A reboot solves the problem until it happens again, which can be anywhere from a day to a couple of months later.

It's annoying for us to have this server drop off every once in a while but it doesn't sound like it affects you as much as it does us.  

A suggestion would be to install some monitors to keep an eye on your resources.  If you develop a leak it would be good to be able to stop it before it takes the server down.  As an MSP I have my own tools to use but I'm pretty sure you can configure the Windows monitors.  If not I'll help you find some third party software.
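To make "stop it before it takes the server down" a bit more concrete: if a resource counter is sampled periodically, a straight-line fit over the samples shows whether it is creeping upward and roughly how long until it reaches a chosen ceiling. This is only a sketch under assumptions. The function names and the `limit` value are hypothetical, and which counter to feed it (handle count, nonpaged pool, committed memory, or something else) is a guess until the leak is actually identified.

```python
def growth_per_hour(samples):
    """Least-squares slope of (hours, value) samples, in value units per hour."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("samples must span more than one point in time")
    return num / den


def hours_until_limit(samples, limit):
    """Estimate hours until the fitted trend reaches `limit`.

    Returns None when the counter is flat or falling, i.e. no leak trend.
    """
    slope = growth_per_hour(samples)
    if slope <= 0:
        return None
    _, latest_value = samples[-1]
    return max(0.0, (limit - latest_value) / slope)
```

For example, samples of `[(0, 100), (1, 110), (2, 120), (3, 130)]` grow by 10 units per hour, so with a `limit` of 180 the estimate is about 5 hours of headroom. A linear fit is a deliberate simplification; a real leak may grow in steps or accelerate.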

Based on your detailed description of the issue I think the problem and solution lies with this one VM and it has nothing to do with being a VM.
 

Author Comment

by:PriorityResearch
ID: 39763960
I'd wholeheartedly agree with you on your last statement. I've requested SolarWinds in the past but have never been able to make the ROI seem logical to the powers that be. What third-party tools do you use, or would you suggest? Windows performance monitors are local, and I'd really prefer something with central reporting, as we'd monitor more than just that one machine.
 
LVL 30

Expert Comment

by:pgm554
ID: 39764143
Agree, it sounds like a classic memory leak.

See:

http://technet.microsoft.com/en-us/sysinternals/bb896653.aspx
 

Author Comment

by:PriorityResearch
ID: 39765391
Thank you for the support so far.

I've checked in on a few things and noticed that for some reason my Previous Versions schedule had two triggers.

The first, which was intentional, was every hour from 7am to 7pm daily.

The second was daily at noon.

Perhaps the combination of the two triggers was occasionally causing a hang?  If removing the duplicate does not resolve the issue, I will try disabling Previous Versions entirely. (I don't want to, because recovering most deleted or accidentally overwritten files is much faster that way than mounting a restore point and searching for them in AppAssure.) Going to mark this as solved.
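As a quick sanity check on where the two triggers collide, the schedules can be expanded into start times and intersected. This is just an illustrative sketch: the helper names are made up, and it assumes the hourly trigger fires on the hour and includes both the 7am and 7pm runs.

```python
from datetime import time


def hourly_times(start_hour, end_hour):
    """Start times for an on-the-hour trigger, inclusive of both end hours."""
    return {time(hour, 0) for hour in range(start_hour, end_hour + 1)}


def overlapping_triggers(*schedules):
    """Return the start times that appear in more than one schedule, sorted."""
    seen, clashes = set(), set()
    for schedule in schedules:
        for start in set(schedule):
            if start in seen:
                clashes.add(start)
            seen.add(start)
    return sorted(clashes)


hourly = hourly_times(7, 19)   # 7am to 7pm, every hour
daily_noon = {time(12, 0)}     # the second, unintended trigger
print(overlapping_triggers(hourly, daily_noon))  # [datetime.time(12, 0)]
```

So the only doubled-up snapshot is the noon one. The same check could be extended with the hourly backup snapshot times once those are known, since that is a third consumer of VSS on the same volumes.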
 
LVL 3

Expert Comment

by:WiReDWolf
ID: 39766353
Thanks.

I would try disabling Previous Versions for a couple of weeks.  If the problem seems resolved, then you have probably found the cause.

I've used the Zenith platform and found that the VSS support in the backup software was what caused the problem far more often.  The Zenith backup back end is StorageCraft.  Since I disabled the VSS support in the backup software, I think I've only had a couple of outages due to resource leaks.

Almost anything can be set up with SNMP traps that can be centrally managed for monitoring.  I've found it to be a pain to set up, but there are plenty of third-party tools.  SolarWinds is one of them, but I'm sure there are more.  Again, I have much of this built into my MSP software, so I haven't really spent a lot of time looking.


Question has a verified solution.
