Question

Server slow - very high disk activity

Asked by: quadraspleen

We have a client who has a real issue with their server. The spec is:

Supermicro tower with hot-swap bays
Supermicro X7DVL-E motherboard
Intel Xeon 5405 CPU
3GB ECC Kingston DDR2 RAM
4x WD-10EACS 1TB HDD in RAID-5 with hot-spare
Single RAID-5 container partitioned into two drives: C: 50GB; D: 1.77TB
On-board Intel ESB2 RAID controller with latest 8.5 drivers from Intel
Latest build and updates for SBS 2003 R2

20 or so client machines. Reasonably large Exchange MDB (16GB). Lots of data transfer back and forth, with some quite large files being transferred to and from the data drive.

Gigabit Ethernet network with structured cabling.

Issues:

1. The server is very slow and unresponsive at totally random times. When logged into it over RDP, I can click something and it may open straight away, or it may take up to 1 minute. This will happen over and over again in the same session. All the while, Task Manager shows nothing going on that might cause it.

2. Users are reporting "lock-ups" and "lock-outs" when accessing the drives across the network. This can happen two or three times in a day or not for two weeks.

3. The disk light on the front of the server is on _all the time_ as if the disk is really churning badly.

4. When using Matrix Storage Manager (which frequently crashes on opening and is very slow to use) to view the RAID, it says all is OK with the RAID, but when you right click to enable disk cache and hard-drive data cache, you can enable them but, crucially, when you next open MSM, they will be disabled again. The same thing applies when enabling both options at device manager/hardware level.

5. The server sometimes takes ages to log us in over RDP. This morning, the user reported no access to the users and it took us over 10 minutes to get a log-in screen.

6. Software and Process Explorer report bizarre amounts of resources being used by random programs: Backup Exec Remote Agent maxing out on I/O; SQL Server from SBS monitoring doing the same; ESET HTTP server (updates AV clients on the network) maxes out in memory and I/O. Crucially, however, when the server is doing its slow thing and we stop one or more of these services, it temporarily gets better.

7. The machine is _always_ paging to HDD. It seems to be maxed out on physical RAM all the time. It's only 32-bit, so we can only stick 3GB in it.

We have set up Performance Monitor counters on the HDD and we are seeing reasonably normal I/O most of the time. The avg. disk queue length is now 1.5 and the average idle time is now 95%, but at other times these will be very different.
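
As a rough illustration of how readings like these might be screened, here is a minimal Python sketch. It assumes the counters have been exported as samples of timestamp, Avg. Disk Queue Length and % Idle Time. The `DiskSample` type, the spindle count, both thresholds and the second sample are hypothetical, and "about 2 outstanding requests per spindle" is only a common rule of thumb, not a hard limit.

```python
from dataclasses import dataclass


@dataclass
class DiskSample:
    timestamp: str
    avg_queue_length: float  # "Avg. Disk Queue Length" counter
    idle_percent: float      # "% Idle Time" counter


def flag_saturated(samples, spindles=3, queue_per_spindle=2.0, min_idle=20.0):
    """Return the samples that look like a saturated disk subsystem.

    A sample is flagged when the queue is longer than the rule-of-thumb
    limit for the array, or when the disk is almost never idle.
    """
    queue_limit = spindles * queue_per_spindle
    return [
        s for s in samples
        if s.avg_queue_length > queue_limit or s.idle_percent < min_idle
    ]


samples = [
    DiskSample("quiet period", 1.5, 95.0),  # figures quoted in the question
    DiskSample("lock-up", 40.0, 2.0),       # hypothetical bad sample
]

for sample in flag_saturated(samples):
    print(f"{sample.timestamp}: queue={sample.avg_queue_length}, idle={sample.idle_percent}%")
```

Logging over a whole day and looking only at the flagged samples makes it easier to line up the bad periods with the user-reported lock-ups.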

Scanned properly for viruses and spyware. RAM has been tested.

My gut feeling is we have a duff RAID controller but I'd really like an expert to have an opinion on this.
Please don't post a reply with basic stuff in it, no matter how helpful you think you are being! We now need someone who _really_ knows what they are talking about.


Asked On: 2009-07-10 at 03:02:19 (ID: 24559483)
Tags: server slow hard-disk RAID
Topics: Computer Servers, Computer Hard Drives, SBS Small Business Server
Participating Experts: 3
Points: 0
Comments: 12

Answers

 

by: DCMBS, posted on 2009-07-10 at 03:57:34 (ID: 24821958)

I had a very similar issue some time ago. It turned out that the RAID 5 was unable to cope with the high level of update transactions generated by the users, due to the overhead of calculating the redundancy info during write operations. This company was a data processing company and updated lots of Access databases stored on the server. Our solution at that time was to restructure their disks as two RAID 1 arrays. This had a dramatic effect on the performance, as the write performance improved several fold. Something similar could be happening here.
Incidentally, I thought the max RAM for 32-bit Windows is 4GB.
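
The write overhead described above can be put into rough numbers. A small random write on RAID 5 typically costs four disk operations (read old data, read old parity, write new data, write new parity), while a mirrored write costs two. The Python sketch below is a back-of-envelope estimate only: the 75 IOPS per disk figure and the 50% write mix are assumptions for illustration, not measurements from this server.

```python
# Disk operations needed per host write for common RAID levels.
WRITE_PENALTY = {"raid0": 1, "raid1": 2, "raid5": 4, "raid10": 2}


def effective_iops(disks, iops_per_disk, write_fraction, level):
    """Estimate the host-visible random IOPS of an array.

    Each host read costs one disk operation and each host write costs
    WRITE_PENALTY[level] disk operations, so the raw disk IOPS is divided
    by the average cost of one host I/O.
    """
    raw = disks * iops_per_disk
    cost_per_io = (1 - write_fraction) + write_fraction * WRITE_PENALTY[level]
    return raw / cost_per_io


# Assumed figures: 75 IOPS per disk, half of all I/O is writes.
raid5 = effective_iops(disks=3, iops_per_disk=75, write_fraction=0.5, level="raid5")
two_mirrors = 2 * effective_iops(disks=2, iops_per_disk=75, write_fraction=0.5, level="raid1")

print(f"3-disk RAID 5:     ~{raid5:.0f} IOPS")
print(f"2x 2-disk RAID 1:  ~{two_mirrors:.0f} IOPS")
```

The gap widens as the write fraction rises, which is why a write-heavy workload tends to suffer most on RAID 5.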

 

by: quadraspleen, posted on 2009-07-10 at 04:06:28 (ID: 24821999)

Hi there,

Very useful suggestion - thanks for that. We had had a similar idea of taking up a dedicated RAID controller and creating a new container and doing some acronis magic, but we weren't sure. How did you actually identify that that was the issue? Did you have any logging software? What gave you the final piece of the puzzle?

The max RAM for 32-bit is limited to 2GB but can address up to 4 with the /3GB switch. I have been advised not to use that switch on SBS/Exchange boxes. Perhaps this is wrong?

 

by: DCMBS, posted on 2009-07-10 at 04:35:11 (ID: 24822145)

Yeah, we had a lot of trouble diagnosing it. The main symptoms we had were that when the server locked up the disk queue went through the roof, and as the queue came down the server started responding. We tried initially just putting in a single large IDE disk and getting that to impersonate the RAID 5. This just blew the RAID 5 away for performance, so we came up with the idea of the RAID 1s.

All our servers have 4GB of RAM.  I only use the /3GB of RAM on database servers where I need as much RAM as I can get for the Database engine.  It can have a negative effect as it limits the O/S to just 1GB.

 

by: DCMBS, posted on 2009-07-10 at 04:54:16 (ID: 24822266)

A bit more info about memory.

You are basically right when you say that the /3GB switch should not be used on SBS machines. However, the machine can have up to 4GB installed. It will divide this into 2GB kernel memory and 2GB application memory.

http://www.brianmadden.com/blogs/brianmadden/archive/2004/02/19/the-4gb-windows-memory-limit-what-does-it-really-mean.aspx
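
To make the split concrete: a 32-bit process has a 4GB virtual address space (2^32 bytes), which Windows divides by default into 2GB for the application and 2GB for the kernel, and the /3GB switch shifts this to 3GB and 1GB. The short Python sketch below just does that arithmetic; the function name is made up for illustration.

```python
GIB = 2 ** 30  # bytes in one gigabyte (binary)


def address_space_split(three_gb_switch=False):
    """Return the 32-bit virtual address space split, in GB."""
    total = 2 ** 32  # everything a 32-bit pointer can address
    user = 3 * GIB if three_gb_switch else 2 * GIB
    return {
        "total": total / GIB,
        "application": user / GIB,
        "kernel": (total - user) / GIB,
    }


print(address_space_split())                      # default split
print(address_space_split(three_gb_switch=True))  # with /3GB
```

With /3GB the kernel is left with a single gigabyte, which is the negative effect mentioned earlier in the thread.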

 

by: Mestha, posted on 2009-07-10 at 08:17:38 (ID: 24824185)

Your problem is a single RAID 5 array.
RAID 5 isn't very fast to begin with.
Then you have the additional problem that Exchange is a very highly transactional database. Every time something happens, Exchange writes to two locations at the same time. Add that to the poor performance of a RAID 5 array and you have a system that is thrashing itself into the ground.

Throwing memory at the problem is not going to help, because you have a major disk bottleneck. Realistically, only a complete redesign of the storage structure is going to improve matters. Anything else is basically "tinkering" with the edges.

Simon.

 

by: quadraspleen, posted on 2009-07-10 at 11:27:28 (ID: 24826063)

Thanks for the replies.

Simon, I have many customers with very similar setups i.e more than 10 users with SBS 2K3 and a RAID-5; some of them have way older and lower spec servers (older disks, too) than this one with nowhere near the issues we are seeing here. Not even close.

I hear what you are saying, but having spoken to Mr. Supermicro tech guru today, he seems to think that the ESB2 doesn't like SATA-2 drives in RAID-5 format and has advised me to jumper them to restrict the transfer rate to SATA-1 (which would seem a retrograde step, perhaps, but if it fixes it...). He says he has seen this exact same spec and issues, with the fix just mentioned sorting it out. If it doesn't, we will be adding a dedicated RAID card and setting up a RAID-1+0 and ghosting the partitions.

I will keep the Q posted, if anyone has anything else to add, please feel free. Thanks for the replies, again.

 

by: Mestha, posted on 2009-07-10 at 14:17:45 (ID: 24827510)

The fact that you have other servers running in the same configuration doesn't mean it is the right thing to do. I call that the drink-driver's excuse.
The simple fact is that Exchange is very hard on its storage, and a single RAID array will always be a major bottleneck. If you have four disks then I would have two mirror arrays so the database and logs can be split.

Simon.

 

by: robocat, posted on 2009-07-11 at 04:51:14 (ID: 24830099)


First, when running a high I/O server, you should always use SAS disks, because SATA is not suitable for such environments. SATA is never recommended for Exchange environments, even for small ones.

Second, you should always run a raid-5 controller with write back cache enabled. Check if your raid controller has cache memory on board and a (working) battery backup. If your raid controller lacks any of these, get a controller that does.
This will speed up write performance (and general server performance) *significantly*.

You should also look at the memory that the individual processes are using. Heavy paging will always kill your machine. Disable individual processes if needed to avoid paging as much as possible, perhaps you're trying to do too much on a single server.


 

by: quadraspleen, posted on 2009-07-13 at 08:25:41 (ID: 24840625)

Hello again and thanks for the extra comments.

This is NOT an Exchange issue! Nor is it a RAID issue. I hear and agree with everything that is being said about RAID, but in this application it is not the issue. The server is not under enough load for it to be the RAID type failing us. It does it with no clients connected to the server, right after a clean reboot and with Exchange services disabled. I also know we should, in a perfect world, be using SAS disks, but that is not the configuration we have, nor will we be changing it.

As I said earlier, and Simon, I do hear what you are saying, but when we have a configuration whereby changing it would mean serious disruption to the end-user, not to mention a cash implication, and also two very similar companies, who have a similar numbers of users, with a very similar MDB size and similar numbers of emails going through the business each day, with _identical_ hardware and one server does it and the other doesn't, I can be pretty sure it is not the users/environment causing it, however suitable or not it may be. It has nothing to do with drink-driving, nor is it an excuse. It's just the way it is.

We have faulty hardware. We have had this confirmed to us by Supermicro and Intel, who have advised us to replace one of the drives (which, it turns out, is faulty) and then, with regard to Exchange, to do what we were going to do anyway, which was to stick a dedicated RAID card in there (with battery and memory, Robocat!) and create a separate drive/container for the Exchange DB and temp/log files. We really don't want to be migrating their whole system to ghosted drives unless we absolutely have to, as it will be very disruptive to all concerned. We will leave the system and data on the RAID-5 and migrate Exchange alone to the new container and see how it goes. If the system is still mullered, we shall re-evaluate.

Thanks for everyone's time.

 

by: quadraspleen, posted on 2009-07-31 at 14:08:32 (ID: 24992468)

Well an update on this: We have now changed the board and all the SATA cables and we still have an issue. The drive in port0 on the controller is the one that is churning, so we have now replaced that drive and are rebuilding the RAID. I will post results as and when...

 

by: DCMBS, posted on 2009-08-07 at 10:11:59 (ID: 25044923)

This should be PAQed. Quadraspleen should post his resolution and accept it as the answer.  There is useful info here.

 

by: quadraspleen, posted on 2009-08-07 at 14:24:07 (ID: 25047074)

We had a faulty RAID member disk. After replacing the drive on port3 with new hardware, and a 6 day RAID rebuild, the server now flies along. Disk access is now at least three times faster than it was, there are no lockups or freezes, even with the full complement of eighteen users, and Backup Exec now runs a backup of over 1TB of data in little over seven hours at 1600Mb/s. This compares to over sixteen hours (750Mb/s) previously.
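
As a sanity check on the backup figures, the following Python sketch works out how much data each quoted rate would move under different readings of the unit. Treating the numbers as MB/min is an assumption (Backup Exec job logs normally report the job rate in MB/min), and the function name is hypothetical.

```python
def data_moved_mb(rate, unit, hours):
    """Return the megabytes moved at `rate` for `hours`, given the unit."""
    seconds = hours * 3600
    if unit == "MB/min":
        return rate * seconds / 60
    if unit == "MB/s":
        return rate * seconds
    if unit == "Mb/s":  # megabits per second, 8 bits per byte
        return rate / 8 * seconds
    raise ValueError(f"unknown unit: {unit}")


for unit in ("MB/min", "Mb/s", "MB/s"):
    after = data_moved_mb(1600, unit, 7) / 1_000_000    # TB, new drive
    before = data_moved_mb(750, unit, 16) / 1_000_000   # TB, faulty drive
    print(f"{unit:>6}: after={after:.2f} TB, before={before:.2f} TB")
```

Read as MB/min, 1600 for seven hours comes to about 0.67TB and 750 for sixteen hours to about 0.72TB, which is in the same ballpark as the backup size quoted. Read as Mb/s, the same rate would imply roughly 5TB in seven hours.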

The interesting thing is that the faulty drive did not show up as faulty in the RAID console, nor when the RAID was built and verified. The only clue was the churning hard drive light on port3 and the slow activity.
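
Since the RAID console reported nothing wrong, one way to look for a quietly failing member is to compare the drives against each other instead of against a fixed threshold. The Python sketch below assumes per-drive response times can be collected somehow (for example by testing each disk individually). The latency numbers and the factor of 3 are hypothetical.

```python
from statistics import median


def slow_members(latency_ms, factor=3.0):
    """Return the ports whose latency is far above the array's median.

    `latency_ms` maps a port name to its average response time in ms.
    A healthy array should have members that behave alike, so a drive
    several times slower than its peers is suspect even if it reports OK.
    """
    typical = median(latency_ms.values())
    return [port for port, ms in latency_ms.items() if ms > factor * typical]


# Hypothetical measurements for a four-drive array.
measured = {"port0": 12.0, "port1": 13.0, "port2": 11.5, "port3": 95.0}
print(slow_members(measured))
```

Using the median keeps one bad drive from dragging the baseline up, which a plain average would do.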

Fixed. Thanks for everybody's input. All useful stuff, as DCMBS says.
