Solved

File Server Performance Issues

Posted on 2013-01-24
Last Modified: 2013-04-24
Greetings,

I have a 64-bit Windows Server 2008 Standard file server that is giving me some grief.  Basically, after a variable amount of time (days to weeks), my users start noticing that they cannot save files or log into our Terminal Servers.  The common denominator is that the file server hosts the roaming profiles and the HOME and shared directories.  In total, the server has just over 5.5 million files.  It is virtual (VMware).  We have about 450 users, with 300-400 concurrent at peak times.  The roaming profiles are only used for Terminal Server sessions (Server 2003).  Upon reboot, everything begins to function normally.

The only additional programs on the server are Trend OfficeScan (antivirus) and Diskeeper Undelete (salvage program).  Both have been in place for several months.  This problem is recent (following a physical move of our server environment in November 2012).  However, there don't seem to be any obvious changes or errors.  There are no Event Viewer errors to shed any light.  I've heard (and read) that 4 million files can be a threshold for certain server performance issues, especially backups.  However, it is unlikely that we have had any significant changes in file count over the last 9 months.  The server is allocated 8 CPU and 12GB RAM.  I have 5 allocated drives (although one can be deleted at any time).

I know that the symptoms are vague, so I'm just trying to get some fresh brainstorming or previous experiences out there.  Is it a bad design to have everything on one server?  We have 3.6 million files in our HOME directories alone.  As the profiles are only hit during logon and logoff of the Terminal Servers, I wouldn't expect the impact to be significant.  The shared and HOME directories are accessed by all users all the time.  Is there some kind of formula I should be following regarding resources versus users versus files?  I've been trying to inquire into best practices, but the advice seems to be all over the place.  The NIC is never more than 50% utilized (during backups overnight) and CPU utilization is normally less than 10%, while memory utilization is 50-90%.  The memory utilization concerned me, but upon reading more into Server 2008, it appeared to be normal.

Anyhow, I understand that this is pretty vague, but I do appreciate any insight that anyone can provide.  Obviously, having to reboot this server during business hours is not acceptable.

Thanks,

Jeremy
Question by:Jer
11 Comments
 
LVL 14

Assisted Solution

by:RickEpnet
RickEpnet earned 100 total points
ID: 38816565
You said it is virtual. Have you added any new servers to the host lately? How much memory and CPU have you given this VM? What do the performance monitors in VMware say about CPU and disk access?
 
LVL 118
ID: 38816567
8 vCPU seems excessive. Have you overcommitted CPUs?

Very few servers require more than 2 vCPU!

Excessive vCPU can cause performance issues in the VM.

Also, what is the underlying datastore the VM is stored on?
 
LVL 3

Author Comment

by:Jer
ID: 38820549
Rick - We've added 2-3 servers to our virtual environment.  Nothing significant in size.  CPU usage is pretty regular on a daily basis, with values ranging from 250-3000 MHz.  Memory usage is 4-25%.  Virtual Disk Rate is 0-60,000 KBps.  Virtual Disk Requests are 0-1000.  The Network rate is 0-800 Mbps.  Network Packets received are 0-22,500,000, transmitted are 0-8,000,000.

Han - It is quite possible that we've overcommitted vCPU.  We're working with a 3rd party that supports our VMware environment, while we support the actual servers and the connectivity.  We just moved to virtual last year, so I still have plenty of noob moments with it.  As I couldn't get any clear answers on best practices at the time of migration from physical to virtual (this server was a fresh build, not P2V), I maintained the specs of our physical environment.  Hence, the 2 sockets and 8 CPU.  As sockets only matter for select applications, I could easily reduce this server to 1 socket and 2-4 vCPU.  Do you know of formulas/guidelines to follow?  I'm not sure I understand your question about the datastore.  What is it that you want to know about the datastore?

Thanks for any input.

Jeremy
 
LVL 118
ID: 38820586
What disks have been configured for your ESXi server?

RAID 5 or RAID 10? How many disks, and what type: SATA or SAS?

Yes, reduce to 1 CPU, check performance, and increase if necessary.
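On the question of a formula: nothing in this thread gives an official figure, but one common way to reason about overcommitment is the ratio of allocated vCPUs to physical cores on a host. The sketch below is only an illustration. The VM names and vCPU counts are hypothetical (only the 8 vCPU file server comes from this thread), and the 8-core host is an assumption for the example.

```python
def vcpu_overcommit_ratio(vcpus_per_vm, physical_cores):
    """Return total allocated vCPUs divided by the host's physical cores."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return sum(vcpus_per_vm) / physical_cores


# Hypothetical allocation on a single 8-core host.
# Only the 8 vCPU file server figure is taken from the thread.
vms = {"fileserver": 8, "ts1": 4, "ts2": 4, "dc1": 2}
ratio = vcpu_overcommit_ratio(vms.values(), physical_cores=8)
print(f"before: {ratio:.2f} vCPUs per physical core")

# Same host after cutting the file server down to 2 vCPU.
vms["fileserver"] = 2
reduced = vcpu_overcommit_ratio(vms.values(), physical_cores=8)
print(f"after:  {reduced:.2f} vCPUs per physical core")
```

The ratio is a starting point, not a rule: comparing it before and after a change, alongside the host's own CPU performance counters, shows whether reducing vCPUs actually relieved contention.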
 
LVL 14

Expert Comment

by:RickEpnet
ID: 38820739
Is your storage local, or an iSCSI or FC SAN? I agree 100%: reduce the vCPUs.
 
LVL 3

Author Comment

by:Jer
ID: 38820799
The hosts are (3) Dell PowerEdge R710 (8 core, 96GB RAM) servers attached to a NetApp 3210 SAN with 2 SAS 24x45GB shelves, RAID DP.  This particular server interacts with 3 datastores (SYS_NoRep, Data_Rep, and Data_NoRep).  In general, the hosts 'seem' to be rather underutilized.
 
LVL 118
ID: 38820832
Okay, well, you should have enough IOPS on the disk, although because the I/O is virtualised, it will not perform as well as your filer.

So why are you not using your filer as a NAS with CIFS shares? Why use a virtual Windows Server with an FC or iSCSI LUN? Windows will perform poorly compared to your SAN.

I would recommend migrating roaming profiles, home drives, and group data from your VM server to NetApp CIFS shares. You benefit from SAN snapshots, Previous Versions for end users to perform restores, and dedupe on the volume to reduce space, and you can get a Trend Micro plugin for the SAN to do antivirus on the NetApp.

Performance will be superior.

Best practice: dump Windows in favour of your NAS with CIFS. We migrate clients' file servers to NetApp SANs, at the same time using iSCSI for VMware and NFS for Unix and Linux clusters, in the same box. Yes: CIFS, NFS and iSCSI.
 
LVL 3

Author Comment

by:Jer
ID: 38828928
We're FC SAN.  While I can appreciate your position on the CIFS shares, the fact is that we are using a Windows file server at this point and I need to know if there is something configured wrong.  My understanding is that our current virtual Windows server should handle its current load without issue.  While I agree that your recommendation could certainly have positive results, it is not something that we are going to do without further understanding any other impact, such as nightly backups and such.  As our environment continues to evolve, there are certainly many aspects that we'll be updating to current best practices.  However, as mentioned, the big issue here is trying to address the current environment before the server stops responding again.  

As of right now, the next step in troubleshooting is moving the drive containing the HOME dirs over to a new Server 2008 R2 machine.  Unfortunately, this means addressing 450+ user profiles (including Terminal Server tab) without any definitive proof that it will address anything.  I'm really hoping to find something obviously wrong with the file server or virtual environment.  

We are looking to address the vCPU allocation.
 
LVL 118

Accepted Solution

by:
Andrew Hancock (VMware vExpert / EE MVE) earned 400 total points
ID: 38828939
Reduce the CPU count and use the vmxnet3 network interface. Microsoft has never got roaming profiles right; they still cause issues at login and logoff.

You may want to consider folder redirection, reducing the size of profiles, or the use of ProfileUnity by Liquidware Labs.
 
LVL 3

Author Comment

by:Jer
ID: 38944234
It looks like one of the contributing factors may have been our Undelete program.  I'm working with Diskeeper to see if something is amiss.  I've had it disabled for 3+ weeks and the server has been stable.  Still monitoring.  Will look into other suggestions.
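For anyone wanting to make this kind of monitoring concrete, here is a minimal sketch of a write probe, since the original symptom was users being unable to save files. It is an illustration only and not something used in this thread: the function name `probe_write` is made up, and the local temp directory stands in for a real share path.

```python
import os
import tempfile
import time


def probe_write(directory, payload=b"probe"):
    """Write, sync, and delete a small file in `directory`.

    Returns the elapsed time in seconds; raises OSError if the write fails.
    """
    start = time.perf_counter()
    fd, path = tempfile.mkstemp(prefix="probe_", dir=directory)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())  # force the data out to storage
    finally:
        os.remove(path)  # always clean up the probe file
    return time.perf_counter() - start


# The temp directory is a stand-in; point this at the share being watched.
elapsed = probe_write(tempfile.gettempdir())
print(f"write probe took {elapsed * 1000:.1f} ms")
```

Run on a schedule against the HOME or shared directory, a probe like this would log either a rising latency or an outright `OSError` before users start reporting failed saves.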
 
LVL 3

Author Comment

by:Jer
ID: 39109657
Greetings.  The problem was isolated to the use of Undelete.  I do appreciate all the other suggestions, as I'm always looking to identify best practices.  Thanks.

Question has a verified solution.