Solved

File Server Performance Issues

Posted on 2013-01-24
Last Modified: 2013-04-24
Greetings,

I have a 64-bit Windows Server 2008 Standard file server that is giving me some grief.  Basically, after a variable amount of time (days to weeks), my users will start noticing that they cannot save files or log into our Terminal Servers.  The common denominator is that the file server hosts the roaming profiles and the HOME and shared directories.  In total, the server has just over 5.5 million files.  It is virtual (VMware).  We have about 450 users, with 300-400 being concurrent at peak times.  The roaming profiles are only used for Terminal Server sessions (Server 2003).  Upon reboot, everything begins to function normally.  The only additional programs on the server are Trend OfficeScan (antivirus) and Diskeeper Undelete (salvage program).  Both have been in place for several months.  This problem is recent (it started after a physical move of our server environment in November 2012).  However, there don't seem to be any obvious changes or errors.  There are no Event Viewer errors to shed any light.  I've heard (and read) that 4 million files can be a threshold for certain server performance issues, especially backups.  However, it is unlikely that we have had any significant changes in file count over the last 9 months.  The server is allocated 8 CPUs and 12GB RAM.  I have 5 allocated drives (although one can be deleted at any time).

I know that the symptoms are vague, so I'm just trying to get some fresh brainstorming or previous experiences out there.  Is it a bad design to have everything on one server?  We have 3.6 million files in our HOME directories alone.  As the profiles are only hit during logon and logoff of the Terminal Servers, I wouldn't expect the impact to be significant.  The shared and HOME directories are accessed by all users all the time.  Is there some kind of formula I should be following regarding resources versus users versus files?  I've been trying to inquire into best practices, but the advice seems to be all over the place.  The NIC is never more than 50% utilized (during backups overnight) and CPU utilization is normally less than 10%, while memory utilization is 50-90%.  The memory utilization concerned me, but upon reading more into Server 2008, it appeared to be normal.
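
For anyone wanting to see where file counts like the ones above are concentrated, here is a minimal sketch in Python. It assumes Python is available on a machine that can read the volume; the function name and the idea of grouping by top-level folder are illustrative and not part of the original setup.

```python
import os
from collections import Counter


def count_files_by_top_level(root):
    """Count files under each immediate subdirectory of `root`.

    Files that sit directly in `root` are counted under the key ".".
    """
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        # The first path component identifies the top-level folder.
        top = "." if rel == "." else rel.split(os.sep)[0]
        counts[top] += len(filenames)
    return counts
```

Calling `count_files_by_top_level` on a drive root (for example a hypothetical `E:\` holding the HOME directories) and printing `counts.most_common(20)` would show which folders hold most of the files. Walking millions of files takes a while and generates load, so it is probably best run off-hours.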

Anyhow, I understand that this is pretty vague, but I do appreciate any insight that anyone can provide.  Obviously, having to reboot this server during business hours is not acceptable.

Thanks,

Jeremy
Question by:Jer
11 Comments
 
LVL 14

Assisted Solution

by:RickEpnet
RickEpnet earned 100 total points
ID: 38816565
You said it is virtual. Have you added any new servers to the host here lately? How much memory and what is the CPU you have given this VM. What do the performance monitors say in VMware as to the CPU and Disk access?
 
LVL 119
ID: 38816567
8 vCPU seems excessive. Have you overcommitted CPUs?

very few servers require more than 2vCPU!

excessive vCPU can cause performance issues in the VM.

also, what is the underlying datastore, the VM is stored on?
 
LVL 3

Author Comment

by:Jer
ID: 38820549
Rick - We've added 2-3 servers to our virtual environment.  Nothing significant in size.  CPU usage is pretty regular on a daily basis, with values ranging from 250-3000 MHz.  Memory usage is 4-25%.  Virtual Disk Rate is 0-60,000 KBps.  Virtual Disk Requests are 0-1000.  The Network rate is 0-800 Mbps.  Network Packets received are 0-22,500,000, transmitted are 0-8,000,000.

Han - It is quite possible that we've overcommitted vCPU.  We're working with a 3rd-party that supports our VMware environment, while we support the actual servers and the connectivity.  We just moved to virtual last year, so I still have plenty of noob moments with it.  As I couldn't get any clear answers on best practices at the time of migration from physical to virtual (this server was a fresh build, not P2V), I maintained the specs of our physical environment.  Hence, the 2 sockets and 8 CPU.  As sockets only matter for select applications, I could easily reduce this server to 1 socket and 2-4 vCPU.  Do you know of formulas/guidelines to follow?  I'm not sure I understand your question about the datastore.  What is it that you want to know about the datastore?

Thanks for any input.

Jeremy
 
LVL 119
ID: 38820586
what disks have been configured for your ESXi server?

RAID 5, RAID 10, how many, type, SATA, SAS?

Yes, reduce to 1 cpu, check performance, increase if necessary.
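
One rough way to put a number on overcommitment is the ratio of allocated vCPUs to physical cores on a host. The sketch below is only an illustration: the VM names, their vCPU counts, and the core count are hypothetical and not taken from this environment, and no particular ratio is implied to be a safe limit.

```python
def vcpu_to_core_ratio(vcpus_per_vm, physical_cores):
    """Return total allocated vCPUs divided by the host's physical cores."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return sum(vcpus_per_vm.values()) / physical_cores


# Hypothetical inventory for a single host (names and counts are made up).
vms = {"fileserver": 8, "ts01": 4, "ts02": 4, "dc01": 2}

ratio = vcpu_to_core_ratio(vms, physical_cores=8)
print(f"{ratio:.2f} vCPUs per physical core")  # prints "2.25 vCPUs per physical core"
```

Re-running the calculation with the file server reduced to 1 or 2 vCPUs shows how much a single oversized VM contributes to the total.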
 
LVL 14

Expert Comment

by:RickEpnet
ID: 38820739
Is your storage local or an iSCSI or FC SAN? I agree 100% reduce the vCPUs.
 
LVL 3

Author Comment

by:Jer
ID: 38820799
The hosts are (3) Dell PowerEdge R710 (8 core, 96GB RAM) servers attached to a NetApp 3210 SAN with 2 SAS 24x45GB shelf, RAID DP.  This particular server interacts with 3 datastores (SYS_NoRep, Data_Rep, and Data_NoRep).  In general, the hosts 'seem' to be rather underutilized.
 
LVL 119
ID: 38820832
Okay, well you should have enough IOPS on the disk, although because the I/O is virtualised, it will not perform as well as your filer.

So, why are you not using your filer as a NAS with CIFS shares? Why use a virtual Windows Server with an FC or iSCSI LUN? Windows will perform poorly compared to your SAN.

I would recommend the migration of roaming profiles, home drives, group data from your VM server to NetApp CIFS shares. Benefit from SAN Snapshots, Previous versions for end users to perform restores, DeDupe on the volume to reduce space, you can get a Trend Micro plugin for the SAN to do anti virus on the NetApp.

Performance will be superior.

Best practice: dump Windows in favour of your NAS with CIFS. We migrate clients' file servers to NetApp SANs, at the same time using iSCSI for VMware and NFS for Unix and Linux clusters, all in the same box: yes, CIFS, NFS and iSCSI.
 
LVL 3

Author Comment

by:Jer
ID: 38828928
We're FC SAN.  While I can appreciate your position on the CIFS shares, the fact is that we are using a Windows file server at this point and I need to know if there is something configured wrong.  My understanding is that our current virtual Windows server should handle its current load without issue.  While I agree that your recommendation could certainly have positive results, it is not something that we are going to do without further understanding any other impact, such as nightly backups and such.  As our environment continues to evolve, there are certainly many aspects that we'll be updating to current best practices.  However, as mentioned, the big issue here is trying to address the current environment before the server stops responding again.  

As of right now, the next step in troubleshooting is moving the drive containing the HOME dirs over to a new Server 2008 R2 machine.  Unfortunately, this means addressing 450+ user profiles (including Terminal Server tab) without any definitive proof that it will address anything.  I'm really hoping to find something obviously wrong with the file server or virtual environment.  

We are looking to address the vCPU allocation.
 
LVL 119

Accepted Solution

by:
Andrew Hancock (VMware vExpert / EE MVE^2) earned 400 total points
ID: 38828939
Reduce the vCPU count and use the vmxnet3 network interface. Microsoft has never got roaming profiles right; they still cause issues at login and logoff.

You may want to consider folder redirection, reducing the size of profiles, or the use of ProfileUnity by Liquidware Labs.
 
LVL 3

Author Comment

by:Jer
ID: 38944234
It looks like one of the contributing factors may have been our Undelete program.  I'm working with Diskeeper to see if something is amiss.  I've had it disabled for 3+ weeks and the server has been stable.  Still monitoring.  Will look into other suggestions.
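
While a suspect component is disabled and the server is being watched, a small probe that times a file save can help catch the "cannot save files" symptom early. This is a minimal sketch, not what was actually used here: the function names, the 4 KB payload, and the 2-second threshold are all assumptions chosen for illustration.

```python
import os
import time
import uuid


def probe_write_latency(directory, payload=b"x" * 4096):
    """Write, flush, and delete a small temp file; return elapsed seconds."""
    path = os.path.join(directory, f"probe-{uuid.uuid4().hex}.tmp")
    start = time.perf_counter()
    try:
        with open(path, "wb") as fh:
            fh.write(payload)
            fh.flush()
            os.fsync(fh.fileno())  # ask the OS to push the data to storage
    finally:
        # Clean up even if the write failed partway through.
        if os.path.exists(path):
            os.remove(path)
    return time.perf_counter() - start


def check_share(directory, slow_after=2.0):
    """Return a one-line status string for a single probe of `directory`."""
    try:
        elapsed = probe_write_latency(directory)
    except OSError as exc:
        return f"FAILED: {exc}"
    status = "SLOW" if elapsed > slow_after else "ok"
    return f"{status}: {elapsed:.3f}s"
```

Running `check_share` against a share path (for example a hypothetical `\\fileserver\shared\probe`) from a scheduled task every few minutes, and appending the result with a timestamp to a log, would give a record of when saves begin to slow down or fail.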
 
LVL 3

Author Comment

by:Jer
ID: 39109657
Greetings.  The problem was isolated to the use of Undelete.  I do appreciate all the other suggestions, as I'm always looking to identify best practices.  Thanks.
Question has a verified solution.
