itcroydon

SBS 2011 runs very slowly on a high spec server, which also causes the backup to fail.

A one-year-old SBS 2011 server with a Xeon processor and 32GB RAM has started running very slowly again. The last time this occurred was two months ago; after several reboots, a change of AV, a new swap file location and an increase in C: drive space, it started working normally again.

After a further reboot for more Windows updates (which were successful and different from those of two months ago), it has reverted to running very slowly, which also seems to cause the SBS backup to time out with 'Creation of the shared protection point timed out. Unknown error (0x81000101)'.

There is no obvious cause in the Event Viewer logs. There is currently 42GB free out of 120GB on the C: partition, whereas only around 10GB was free when the problem first started, so disk space may be unrelated.

However, about half of the VSS writers are in a 'Retryable error' state (as shown by running vssadmin list writers from a command prompt), which will require a further reboot to resolve at least temporarily. It may be that the failed backup then causes the writers to fail again.
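To keep track of which writers fall over after each backup attempt, the output of vssadmin list writers can be filtered down to the unhealthy ones. This is only a rough sketch: it assumes the usual "Writer name / State / Last error" layout of the command's output, and the function names are my own, not part of any tool.

```python
import re
import subprocess


def unstable_writers(vssadmin_output):
    """Return (name, state, last_error) for each writer that is not healthy."""
    results = []
    name = state = None
    for raw in vssadmin_output.splitlines():
        line = raw.strip()
        if line.startswith("Writer name:"):
            name = line.split(":", 1)[1].strip().strip("'")
            state = None
        elif line.startswith("State:"):
            # "State: [8] Failed" -> "Failed"
            state = re.sub(r"^\[\d+\]\s*", "", line.split(":", 1)[1].strip())
        elif line.startswith("Last error:") and name is not None:
            last_error = line.split(":", 1)[1].strip()
            if state != "Stable" or last_error != "No error":
                results.append((name, state, last_error))
            name = None
    return results


def run_live():
    """Run on the server itself, from an elevated prompt (not called here)."""
    out = subprocess.run(
        ["vssadmin", "list", "writers"], capture_output=True, text=True
    ).stdout
    for name, state, err in unstable_writers(out):
        print(f"{name}: {state} ({err})")
```

Calling run_live() before and after a backup attempt would show whether the backup itself is what pushes the writers into the error state.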

On the Processes tab of Task Manager, CPU usage is between 90% and 100% most of the time and physical memory usage is around 84%. The top processes by memory are the Microsoft Exchange DB Store (11.5GB), sqlservr.exe (2 instances totalling 3.6GB) and IIS Worker Processes (4 totalling 1.8GB).
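The per-process totals above can also be collected from the command line, which makes it easier to compare readings before and after a reboot. A minimal sketch, assuming the CSV layout produced by tasklist /fo csv (Image Name, PID, Session Name, Session#, Mem Usage); the helper name is hypothetical, and the figure tasklist reports may not match the memory column shown in Task Manager.

```python
import csv
import io
from collections import defaultdict


def memory_by_image(tasklist_csv):
    """Sum the 'Mem Usage' column (in KB) per image name, largest first."""
    totals = defaultdict(int)
    for row in csv.reader(io.StringIO(tasklist_csv)):
        if len(row) < 5 or row[0] == "Image Name":
            continue  # skip the header and any malformed rows
        # "11,500,000 K" -> 11500000; rows without digits (e.g. "N/A") are skipped
        digits = "".join(ch for ch in row[4] if ch.isdigit())
        if digits:
            totals[row[0]] += int(digits)
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
```

On the server, the input would come from something like subprocess.run(["tasklist", "/fo", "csv"], capture_output=True, text=True).stdout.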

It is my understanding that Exchange and SQL can utilise any resources that are free and then release them when another process requires them, which may account for the high usage, but it is unclear what is causing the VSS writer errors. Also, there are only 5 users, with no additional resource-demanding programs installed.

What steps can be taken to troubleshoot this further, with a view to improving the performance again, to stabilise the VSS writers and to stop the backup from timing out?
ASKER CERTIFIED SOLUTION
Larry Struckmeyer MVP
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
itcroydon

ASKER

I couldn't match the instructions in the link to the advanced AD options in SBS 2011, so I followed this article instead:
http://www.bursky.net/index.php/2012/05/limit-exchange-2010-memory-use/

Unfortunately this doesn't seem to have taken effect, but for the moment the store.exe size is less than 1GB. Also, although CPU and memory usage are currently below 50% after the reboot (which also cleared the writers' 'Retryable error' state), the server still seems to be running incredibly slowly.

Also, I have seen SQL cause this type of issue on SBS in the past, but unfortunately I cannot log in to SQL Server Management Studio. It allows me to browse to 'Network' and select the server name, but when Windows Authentication is selected with the new administrator details, it displays the following error message:

'A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible.'

I am not sure what logon details are required or whether these can be amended, as we didn't set up the server originally.
Further to my previous post, I have managed to right-click the sqlservr.exe process and set it to low priority, which has provided some performance improvement. Time will tell whether this is enough for the backup to run successfully.

Cliff, the top processes within the CPU view of Resource Monitor are changing rapidly, but the main ones are:

Microsoft Block Level Backup Engine Service
Windows SBS 2011 Standard Console
Resource and Performance monitor
IIS Worker Process ( X 2 )
Rapportmgmtservice
SQL server Windows NT - 64 bit
Host Process for Windows services
Microsoft Office SharePoint Portal & Timer ( sometimes terminated )
Rapportmgmtservice is a client application. Why is that on your server?
I suspect that it is SQL related. At the moment both CPU and memory usage are below 50%, but the writer status is showing as not known. This may change during the backup, which would then cause it to fail.

I have managed to log in to np:\\.\pipe\mssql$microsoft##SSEE\sql\query to limit the memory used, but not specifically for the monitoring SQL database, so I would be interested in having the script on standby.
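For anyone repeating the memory limit step, it can be scripted rather than done by hand in Management Studio. The sketch below only builds the command; it assumes the limit was applied with sp_configure 'max server memory (MB)', which is the usual approach but is not confirmed anywhere in this thread. The 1024 MB figure and the function names are purely illustrative.

```python
def max_memory_tsql(limit_mb):
    """Build the T-SQL that caps an instance's memory at limit_mb megabytes."""
    limit_mb = int(limit_mb)
    if limit_mb <= 0:
        raise ValueError("limit_mb must be a positive number of megabytes")
    return (
        "EXEC sp_configure 'show advanced options', 1; RECONFIGURE; "
        f"EXEC sp_configure 'max server memory (MB)', {limit_mb}; RECONFIGURE;"
    )


def sqlcmd_args(instance, limit_mb):
    """Argument list for sqlcmd: -S server, -E Windows auth, -Q run query and exit."""
    return ["sqlcmd", "-S", instance, "-E", "-Q", max_memory_tsql(limit_mb)]


# The pipe path is the one from this thread; 1024 MB is a made-up example value.
SSEE_PIPE = r"np:\\.\pipe\mssql$microsoft##SSEE\sql\query"
print(" ".join(sqlcmd_args(SSEE_PIPE, 1024)))
```

The printed line is for review only. To run it, pass the list from sqlcmd_args to subprocess.run on the server from an elevated prompt, and repeat with a different instance name for each SQL instance that needs a cap.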

Rapport and another surplus utility have now been removed.

The server will reboot just before the next backup tonight and so we will see if anything changes or not.
Incremental backups are great for saving time -- but please run a full backup job periodically.   The reason is that to recover from a full+incremental backup, you need to load the full, and *every* incremental up to your restore point.   If you go on for months without running a full, then you'll have a hundred, or hundreds, of incrementals to load.  This can not only take a long time, but -- more importantly -- if *any* of your incrementals can't be read, that's the end of your restore.  You can't get to a more recent restore point.
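To make the dependency concrete, here is a toy model of a restore chain. It is not how any particular backup product stores its data, and the labels are made up; it only shows that the newest reachable restore point is the last one before the first unreadable backup.

```python
def latest_restorable(chain):
    """chain: list of (label, readable) pairs, full backup first, then
    incrementals in order. Returns the newest reachable restore point, or None."""
    latest = None
    for label, readable in chain:
        if not readable:
            break  # everything after an unreadable backup is unreachable
        latest = label
    return latest


chain = [("full", True), ("inc1", True), ("inc2", False), ("inc3", True)]
print(latest_restorable(chain))  # inc1: inc3 is intact but cannot be reached
```

A fresh full backup resets the chain, which is why running one periodically limits how much a single bad incremental can cost you.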

(Yes, it's possible that you'd been running incrementals before, and that you'd never noticed the performance problem... but after rebooting the NAS, the backup application could have decided it was time for a new full backup, and now you're finally noticing how bad performance always has been)

Now, what about the performance?  Use PerfMon in Windows to check your network and disk performance on the server.  
Do run a copy from the server with slow backups to the NAS and see what speed you get on a straight copy -- if it's also in the 10MB/sec range, perhaps your server NIC is connecting at a slower speed than it's capable of.  Or maybe it's the switch (how long since the switch has been powered down?  What happens if you move the server to a different port on the switch?).
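If you want a number for that straight copy rather than watching the Explorer dialog, something like the sketch below would do. The paths are whatever you choose, the function name is mine, and the test file should be large (several GB) so that caching does not flatter the result.

```python
import os
import shutil
import time


def copy_throughput_mb_s(src, dst):
    """Copy src to dst and return the average rate in MB per second."""
    size_bytes = os.path.getsize(src)
    start = time.perf_counter()
    shutil.copyfile(src, dst)
    elapsed = time.perf_counter() - start
    if elapsed <= 0:
        return float("inf")
    return size_bytes / (1024 * 1024) / elapsed
```

Run it once against the backup target and once against a local disk; if both land in the same low range, the bottleneck is more likely the server than the link.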
Somewhere along the troubleshooting process, performance returned to more normal levels and the backup no longer times out (the backup is via USB, so the NAS / network issues did not apply).