Solved

File Server Performance Issues

Posted on 2013-01-24
Last Modified: 2013-04-24
Greetings,

I have a 64-bit Windows Server 2008 Standard file server that is giving me some grief. Basically, after a variable amount of time (days to weeks), my users start noticing that they cannot save files or log into our Terminal Servers. The common denominator is that the file server hosts the roaming profiles and the HOME and shared directories. In total, the server has just over 5.5 million files. It is virtual (VMware). We have about 450 users, with 300-400 concurrent at peak times. The roaming profiles are only used for Terminal Server sessions (Server 2003). Upon reboot, everything begins to function normally. The only additional programs on the server are Trend OfficeScan (antivirus) and Diskeeper Undelete (salvage program). Both have been in place for several months. This problem is recent (it started after a physical move of our server environment in November 2012). However, there don't seem to be any obvious changes or errors, and there are no Event Viewer errors to shed any light. I've heard (and read) that 4 million files can be a threshold for certain server performance issues, especially backups. However, it is unlikely that we have had any significant change in file count over the last 9 months. The server is allocated 8 CPUs and 12GB RAM. I have 5 allocated drives (although one can be deleted at any time).

I know that the symptoms are vague, so I'm just trying to get some fresh brainstorming or previous experiences out there. Is it a bad design to have everything on one server? We have 3.6 million files in our HOME directories alone. As the profiles are only hit during logon and logoff of the Terminal Servers, I wouldn't expect the impact to be significant. The shared and HOME directories are accessed by all users all the time. Is there some kind of formula I should be following regarding resources versus users versus files? I've been trying to inquire into best practices, but the advice seems to be all over the place. The NIC is never more than 50% utilized (during backups overnight) and CPU utilization is normally less than 10%, while memory utilization is 50-90%. The memory utilization concerned me, but upon reading more about Server 2008, it appeared to be normal.
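For anyone wanting to reproduce per-share file counts like the ones quoted above, here is a minimal Python sketch. It is only an illustration: the function names are made up for this example, and it counts every non-directory entry that `os.walk` reports, so it may differ slightly from what a backup product or Explorer shows.

```python
import os
from pathlib import Path


def count_files(root: Path) -> int:
    """Count non-directory entries under root, including all subdirectories."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total


def counts_by_top_level(share_root: Path) -> dict[str, int]:
    """Map each immediate subdirectory of share_root to its file count."""
    with os.scandir(share_root) as entries:
        return {
            entry.name: count_files(Path(entry.path))
            for entry in entries
            if entry.is_dir(follow_symlinks=False)
        }
```

Calling `counts_by_top_level(Path(r"D:\HOME"))` (a hypothetical path, not one from this environment) would return one count per user folder, which makes it easier to see whether a handful of directories account for most of the files. Walking several million files takes a while, so it is probably best run outside peak hours.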

Anyhow, I understand that this is pretty vague, but I do appreciate any insight that anyone can provide. Obviously, having to reboot this server during business hours is not acceptable.

Thanks,

Jeremy
Question by:Jer
11 Comments
 
LVL 14

Assisted Solution

by:RickEpnet
RickEpnet earned 100 total points
ID: 38816565
You said it is virtual. Have you added any new servers to the host lately? How much memory and how many CPUs have you given this VM? What do the performance monitors in VMware say about CPU and disk access?
 
LVL 119
ID: 38816567
8 vCPU seems excessive. You have not overcommitted CPUs, have you?

Very few servers require more than 2 vCPU!

Excessive vCPU can cause performance issues in the VM.

Also, what is the underlying datastore that the VM is stored on?
 
LVL 3

Author Comment

by:Jer
ID: 38820549
Rick - We've added 2-3 servers to our virtual environment.  Nothing significant in size.  CPU usage is pretty regular on a daily basis, with values ranging from 250-3000 MHz.  Memory usage is 4-25%.  Virtual Disk Rate is 0-60,000 KBps.  Virtual Disk Requests are 0-1000.  The Network rate is 0-800 Mbps.  Network Packets received are 0-22,500,000, transmitted are 0-8,000,000.

Han - It is quite possible that we've overcommitted vCPU. We're working with a 3rd party that supports our VMware environment, while we support the actual servers and the connectivity. We just moved to virtual last year, so I still have plenty of noob moments with it. As I couldn't get any clear answers on best practices at the time of migration from physical to virtual (this server was a fresh build, not P2V), I maintained the specs of our physical environment. Hence the 2 sockets and 8 CPUs. As sockets only matter for select applications, I could easily reduce this server to 1 socket and 2-4 vCPU. Do you know of formulas/guidelines to follow? I'm not sure I understand your question about the datastore. What is it that you want to know about the datastore?

Thanks for any input.

Jeremy
 
LVL 119
ID: 38820586
What disks have been configured for your ESXi server?

RAID 5 or RAID 10? How many disks, and what type (SATA, SAS)?

Yes, reduce to 1 vCPU, check performance, and increase if necessary.
 
LVL 14

Expert Comment

by:RickEpnet
ID: 38820739
Is your storage local, or an iSCSI or FC SAN? I agree 100%: reduce the vCPUs.
 
LVL 3

Author Comment

by:Jer
ID: 38820799
The hosts are (3) Dell PowerEdge R710 (8 core, 96GB RAM) servers attached to a NetApp 3210 SAN with 2 SAS 24x45GB shelves, RAID DP. This particular server interacts with 3 datastores (SYS_NoRep, Data_Rep, and Data_NoRep). In general, the hosts 'seem' to be rather underutilized.
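As a rough way to put a number on the vCPU overcommit question raised earlier in the thread, the ratio of allocated vCPUs to physical cores can be worked out with a few lines of Python. This is only a sketch: the three hosts with 8 cores each and the 8 vCPU file server come from this thread, while every other VM in the inventory is a made-up placeholder that would need to be replaced with the real list from vCenter.

```python
from dataclasses import dataclass


@dataclass
class Vm:
    name: str
    vcpus: int


def vcpu_to_core_ratio(vms: list[Vm], hosts: int, cores_per_host: int) -> float:
    """Return total allocated vCPUs divided by total physical cores."""
    total_vcpus = sum(vm.vcpus for vm in vms)
    total_cores = hosts * cores_per_host
    return total_vcpus / total_cores


# Hypothetical inventory: only the 8 vCPU file server is from the thread.
inventory = [
    Vm("fileserver", 8),
    Vm("ts01", 4),  # assumed placeholder
    Vm("ts02", 4),  # assumed placeholder
    Vm("dc01", 2),  # assumed placeholder
]

ratio = vcpu_to_core_ratio(inventory, hosts=3, cores_per_host=8)
print(f"vCPU:core ratio = {ratio:.2f}:1")
```

The ratio on its own does not say whether a VM is oversized. It only shows how much the hosts are being asked to schedule, so it is best read alongside the per-VM performance charts in VMware.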
 
LVL 119
ID: 38820832
Okay, well, you should have enough IOPS on the disks, although because the I/O is virtualised it will not perform as well as your filer.

So why are you not using your filer as a NAS with CIFS shares? Why use a virtual Windows server with an FC or iSCSI LUN? Windows will perform poorly compared to your SAN.

I would recommend migrating roaming profiles, home drives, and group data from your VM server to NetApp CIFS shares. You would benefit from SAN snapshots, Previous Versions so end users can perform their own restores, and dedupe on the volume to reduce space. You can also get a Trend Micro plugin for the SAN to do antivirus scanning on the NetApp.

Performance will be superior.

Best practice: dump Windows in favour of your NAS with CIFS. We migrate clients' file servers to NetApp SANs, at the same time using iSCSI for VMware and NFS for Unix and Linux clusters, all in the same box. Yes: CIFS, NFS and iSCSI.
 
LVL 3

Author Comment

by:Jer
ID: 38828928
We're FC SAN.  While I can appreciate your position on the CIFS shares, the fact is that we are using a Windows file server at this point and I need to know if there is something configured wrong.  My understanding is that our current virtual Windows server should handle its current load without issue.  While I agree that your recommendation could certainly have positive results, it is not something that we are going to do without further understanding any other impact, such as nightly backups and such.  As our environment continues to evolve, there are certainly many aspects that we'll be updating to current best practices.  However, as mentioned, the big issue here is trying to address the current environment before the server stops responding again.  

As of right now, the next step in troubleshooting is moving the drive containing the HOME dirs over to a new Server 2008 R2 machine.  Unfortunately, this means addressing 450+ user profiles (including Terminal Server tab) without any definitive proof that it will address anything.  I'm really hoping to find something obviously wrong with the file server or virtual environment.  

We are looking to address the vCPU allocation.
 
LVL 119

Accepted Solution

by:
Andrew Hancock (VMware vExpert / EE MVE^2) earned 400 total points
ID: 38828939
Reduce the vCPUs and use the vmxnet3 network interface. Microsoft has never got roaming profiles right; they still cause issues at login and logoff.

You may want to consider folder redirection, reducing the size of profiles, or the use of ProfileUnity by Liquidware Labs.
 
LVL 3

Author Comment

by:Jer
ID: 38944234
It looks like one of the contributing factors may have been our Undelete program.  I'm working with Diskeeper to see if something is amiss.  I've had it disabled for 3+ weeks and the server has been stable.  Still monitoring.  Will look into other suggestions.
 
LVL 3

Author Comment

by:Jer
ID: 39109657
Greetings. The problem was isolated to the use of Undelete. I do appreciate all the other suggestions, as I'm always looking to identify best practices. Thanks.
