Scott Nowacki

Excessive High Disk errors in Windows 2008 Small Business Server

Hi,

I inherited an SBS 2008 server recently. It was running really slow and bogged way down during reboots and running backups - to the point where it was unusable.

The previous manager had installed Symantec Endpoint Protection and the System Manager on this server and not made exceptions for SQL server files and other things. I removed all of this software because the license expired. The server is running a lot better now, but there are regular spikes in disk activity that set off the monitoring software.

The server is being used by one guy with two email addresses. Most of the time, he's not even in the office and using the network. Just accessing his email.

I loaded up Resource Monitor and found that the processes using the most disk activity are:

C:\Windows\SoftwareDistribution\DataStore\DataStore.edb -- Read 140,354,555 B/Min
C:\WSUS\SUSDB\UpdateServicesDBFiles\SUSDB_log.ldf -- Read 54,234,553 B/Min
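
For scale, the per-minute byte counts above can be converted to per-second throughput. This is only a minimal sketch: it assumes decimal megabytes (1 MB = 1,000,000 bytes), and the function name is made up for illustration.

```python
def bytes_per_min_to_mb_per_sec(bytes_per_min: int) -> float:
    """Convert a B/min reading to MB/s, using decimal megabytes."""
    return bytes_per_min / 60 / 1_000_000

# Readings copied from the Resource Monitor figures above.
readings = {
    "DataStore.edb": 140_354_555,
    "SUSDB_log.ldf": 54_234_553,
}

for name, rate in readings.items():
    print(f"{name}: {bytes_per_min_to_mb_per_sec(rate):.2f} MB/s")
```

That works out to roughly 2.34 MB/s and 0.90 MB/s respectively.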

There are big sustained reads and writes. The server's memory is pegged at 93% usage (8GB RAM), and the two biggest consumers are both SQL Server processes, at 1.5GB each.

I'm not sure what's going on here, but I need to get the usage down. I don't see any disk errors in the Event Log.

Thanks for the help.

Scott
Scott Nowacki

ASKER

Here are samples of the errors coming from the GFI agent:
-----
The test Performance Monitoring Check - Memory Usage failed.

Additional information : 27 Pages/Second
-----
The test Performance Monitoring Check - Physical Disk: Total failed.

Additional information : Disk Time: 52.557%
-----
The test Performance Monitoring Check - Physical Disk: Total failed.

Additional information : Write Queue: 2.138

Thanks,

Scott
Nagendra Pratap Singh
Please add SQL server as a category to this question.

Also buy an antivirus.
I wouldn't worry too much about memory utilization. SQL Server and Exchange are both designed to claim as much of the system's physical memory as possible, but they monitor the demand on memory and will release memory as necessary to accommodate other processes.

An average disk queue length over 2 seems unusually high to me, and could indicate poor disk performance (e.g. a failing hard drive, or a drive with many bad blocks), or possibly a lack of physical memory (if the server is low on physical memory, the disk queue could spike as a result of increased paging file use); however, if this server really only services one user then 8GB should be more than sufficient.
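
One rough way to put a queue-length figure in context is Little's law (average number of requests in the system = arrival rate × average time per request). The sketch below rearranges it to estimate per-request latency. The 100 IOPS load is an assumed, illustrative number, since no IOPS figure was measured in this thread; the function name is hypothetical as well.

```python
def avg_latency_ms(avg_queue_length: float, iops: float) -> float:
    """Little's law rearranged (W = L / lambda), returned in milliseconds."""
    if iops <= 0:
        raise ValueError("iops must be positive")
    return avg_queue_length / iops * 1000.0

# 2.138 is the write queue reported by the monitoring agent earlier in
# the thread; 100 IOPS is an assumption, not a measurement.
print(f"{avg_latency_ms(2.138, 100):.1f} ms per request")
```

Under that assumption each request would spend a little over 21 ms in the disk subsystem; plugging in a measured IOPS value from Performance Monitor would give a more meaningful number.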

As an aside, an entire SBS server for one guy with two mailboxes kinda sounds like rather severe over-kill to me. ;)
...if you Google "store.exe memory utilization" or "sqlservr.exe memory utilization" you should find more than a few articles explaining Exchange & SQL Server's memory management systems, why it's normal for those two to show high memory use, and why such memory use doesn't necessarily indicate a problem (I'm stuck using my phone at the moment, otherwise I'd hunt 'em down for you).
If you can double the RAM to 16GB (cost under $100), your pagefile reads will go down and disk I/O will decrease.
Ehh... My buddy's company has an SBS 2011 server that I manage for him. They've got a dozen users, each with a mailbox; they have four instances of SQL Server running, have WSUS servicing Windows Updates for 30 workstations, have around 1.5TB of shared folders, and provide Web/FTP services for 100+ Internet-only clients. Their server also has 8GB of RAM, and the average disk queue there is consistently .5 or less (I'm not sure it's even using the page file at all).

If you can do all that with 8GB of memory, then handling one user & two Exchange mailboxes with the same amount of memory should be a doddle.

So, even though memory is so cheap it may as well be free, and doubling it to 16GB won't do any harm, I sincerely doubt it will do any good. I suspect the much more likely problem is a bad disk (or a number of other possibilities, that's just the first one to come to mind).

If your server's memory utilization is at 93%, that tells me there's still 500MB+ of unused physical memory available; and with that much physical memory still free to be used the paging file shouldn't be under too much pressure. Therefore, the high disk queue is not the result of too little memory.

At least that's my interpretation. ;)
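
The "500MB+" figure above follows directly from the numbers in the question (8GB of RAM at 93% utilization). A quick sketch of the arithmetic, assuming 1 GB = 1024 MB; the function name is just for illustration:

```python
def free_memory_mb(total_gb: float, used_percent: float) -> float:
    """Physical memory left unused, in MB (1 GB = 1024 MB)."""
    return total_gb * 1024 * (1 - used_percent / 100)

# 8GB total at 93% utilization, as reported by the asker.
print(f"{free_memory_mb(8, 93):.0f} MB free")
```

That comes to about 573 MB of physical memory still unused.
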
SBS 2008 server doesn't support SQL on the same machine (unless it is SQL Express, which is the default and is used internally for some of its features). So if you have a full SQL Server installed, remove it and install it on a standard 2008 server. Exchange, as was already mentioned, always uses as much RAM as it can get, so if the RAM is being utilized don't worry; that is by design.
Run the SBS best practices analyzer and post results

http://www.microsoft.com/en-us/download/details.aspx?id=6231

You are running an Exchange server and you have no AV or anti spam running?
Recovered errors don't show up in the O/S logs, because the HDD doesn't report them up the food chain.  Decent diagnostic software can see them.  You can spend $500+ and get them, but why not just do the right thing and go software RAID1.

In software RAID1, you'll get a nice performance bump and most likely the problem will go away.  While disk "A" is going through maybe a 3-10 second recovery process, disk "B" is supplying the data you need, and then the kernel will just fix disk A once the HDD returns from the deep recovery.

Look up dynamic disks RAID1.  

If you are using a redundant RAID configuration, then check the RAID controller logs.  If not, then just stop wasting your time looking elsewhere, and get your storage hardware in order.  

To be brutally honest, it is nuts to invest all the money and time into the server and even debugging such a problem if you aren't willing to spend a few hundred dollars on going to a RAID1 config for any server, let alone a SBS machine.
Some responses:

- I removed the AV software because it's expired and we are installing a new product.
- We are using a hosted spam/AV filter for email and I have locked down the firewall to only allow mail from that source.
- The server has an Adaptec SCSI RAID card with mirrored (RAID 1) configured.
- It's an HP ProLiant Server chassis.
- I know it's overkill for one guy, but he wants it so I maintain it - his last IT guy vanished and now I'm trying to figure out why this thing is working poorly when it's lightly used.

Attached BPA results. Can't figure out where the DNS error is, I dug ALL around in the DNS manager looking for a rogue A record.
SBSBPA-All-Issues.xml
You are much better off using native win2k8 software RAID1, but doing so is going to require a bare metal backup/restore, and reconfiguring the controller for JBOD instead of RAID1.

But nothing in the log points to a RAID-specific issue.  The only thing I noticed is messages complaining about not enough free space.   (Which was tight enough to cause problems, so consider moving things around or getting more disk space.  As a rule, if the O/S complains about something, you should fix it)
Many many issues:
Computer Name Changed
Expired Certificates
Services played with, changed logon, startup type


it just goes on and on.

The server is being used by one guy with two email addresses. Most of the time, he's not even in the office and using the network. Just accessing his email.

So he's using Exchange on this machine?
ASKER CERTIFIED SOLUTION
David
This solution is only available to members.
He's running a BES server, the login account for Backup Exec has issues, SP3 for Exchange needs to be installed, you need to renew the SSL certificate, the SharePoint log file is very big, and you're running out of disk space.

So it requires somebody to actually sit down and patch and configure it correctly.
The disk space that it's complaining about is the external backup drive. The C drive has plenty of space.

Yeah,  I just started patching and installing updates trying to get it up to date. I've told him that we need to talk tomorrow about a time/dollar limit on fixing things vs going to the cloud for Exchange. This guy is hugely paranoid about the security of his email and computers so it's a tough sell. I know - him leaving his server unattended and unmanaged for so long really doesn't sound like someone worried about security.

So, is all of this causing the disk issue? I thought it might be a broken SQL or WSUS or something, but this is a load of things to do. Might be cheaper for him to get a new server for files and printers and then host his email.
Well, to wrap this up: We are replacing the server completely with a new one. Customer request. We also migrated to hosted Exchange for all 2 of his mailboxes and got that out of the way.
I've requested that this question be closed as follows:

Accepted answer: 0 points for srnowacki's comment #a38457066

for the following reason:

Because it wasn't the best idea, but that's what the customer wanted.
A few people here suggested that the disks/hardware were faulty. A few people also suggested adding RAM, which the new server will probably provide.

Also, even the idea of using outsourced mail was mentioned. These people should be given points.
I agree with #38335234 (but am biased). Still, this is a classic case of knowing when to pick your battles. It will take more time and effort to identify and fix the problem than it would ever take to outsource a solution.

The 'best' solution is to solve the problem as quickly and inexpensively as possible.  That means outsourcing.
Putting in a new server is not the cheapest or fastest way to fix this. Reinstall from scratch, and outsource what you cannot comfortably manage yourself. ProLiant servers run from medium desktop pricing on upwards.

#38335234  is a fair solution.