SQL Server 2008 Performance Problems

Nigel_Taylor
Nigel_Taylor used Ask the Experts™
on
Hi Guys,

I am running SQL Server on Windows 2008 R2 Standard 64x With 8GB Ram and 2 Xeon Quad Core processors Running at 2.8Ghz. I have SQL Server Version is 10.0.2531. I ahve two boxes of this spec and have databases configured for mirroring.

This runs the backend database for our website. Recently we ahve seen a massive increase in the amount of pages that are timing our on IIS whenever a call to the database is made.

I have no experience of performance monitoring SQL Server and need to understand and address what is causing these time outs.

It may be that I just need to buy more RAM for the server but I want to know if that is going to fix the problem before I just buy it.

Information:
The sqlserver.exe process is running at 6.2Gb and the configured and running values are Min 0Mb and Max Server Memory 6656MB.

I have attached a screenshot of the Activity monitor. Not sure if that helps.

Thanks in advance.

Regards,

Nigel
SQL-Server-Activity.png
Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®
try this expensive CPU query - which I picked up from this site actually

SELECT TOP 10
qs.total_worker_time/qs.execution_count as [Avg CPU Time],
SUBSTRING(qt.text,qs.statement_start_offset/2,
(case when qs.statement_end_offset = -1
then len(convert(nvarchar(max), qt.text)) * 2
else qs.statement_end_offset end -qs.statement_start_offset)/2)
as query_text,
qt.dbid, dbname=db_name(qt.dbid),
qt.objectid
FROM sys.dm_exec_query_stats qs
cross apply sys.dm_exec_sql_text(qs.sql_handle) as qt
ORDER BY
[Avg CPU Time] DESC

Open in new window



A missing index improver query

SELECT 
            migs.avg_total_user_cost * ( migs.avg_user_impact / 100.0 ) * (  migs.user_seeks + migs.user_scans )   AS  improvement_measure, 
            mid.index_handle,
            mid.[statement],
            mid.equality_columns,
            mid.inequality_columns,
            mid.included_columns,
    migs.*, 
    mid.database_id, 
    mid.[object_id] 
FROM   sys.dm_db_missing_index_groups mig 
       INNER JOIN sys.dm_db_missing_index_group_stats migs 
         ON migs.group_handle = mig.index_group_handle 
       INNER JOIN sys.dm_db_missing_index_details mid 
         ON mig.index_handle = mid.index_handle 
WHERE  migs.avg_total_user_cost * ( migs.avg_user_impact / 100.0 ) * ( 
              migs.user_seeks + migs.user_scans ) > 10 
ORDER  BY migs.avg_total_user_cost * migs.avg_user_impact * ( 
                    migs.user_seeks + migs.user_scans ) DESC

Open in new window

Author

Commented:
Thanks for the post Deighton.

I have run those queries. Please see attached the results. I assume the CPU query it is a case of identifying the query and seeing if it can be improved and same with the Index one.

Nigel
Query-Results-All.xls
Sql Server 2008 also has great new reports for installations that use stored procedures extensively.  IF you pull up this report, you can scan down the column for total CPU time and find your high hitters. then expand and look at the fairly easily.  (I usually look at total before average, because if you have a process running thousands of times it pays to pay attention to it over one that runs just once a day at night)

Right Click on the Server name in Sql Server Management Studio

Right Click on Reports

Right Click on Standard Reports

Select Performance - Object execution Statistics



I'd also be interested in know whether you rebuild your indexes when they fragmented...
How to Generate Services Revenue the Easiest Way

This Tuesday! Learn key insights about modern cyber protection services & gain practical strategies to skyrocket business:

- What it takes to build a cloud service portfolio
- How to determine which services will help your unique business grow
- Various use-cases and examples

Author

Commented:
Hi SJCFL-Admin,

I have run that report and there is one clear object that is taking way more CPU time that all the other processes and that is at a value of 69.30. There is one other object at 11 and the rest are 3 or less.

Here is a screen shot of this particular object http://screencast.com/t/TrNQrIGW.

Regards,

Nigel
I cannot access that site.

But Sql Server 2008 has another great feature.  Once you know the object ,application and database name.  Which I am hoping you have from the screen shot you can get the data profiler involved.  (That is, if in looking at the procedures involved noting jumps to mind regarding missing indexes,etc...)

If you invoke the profile from the start menu, sql server 2008, performance tools, profiler - you use the data to specify filters to prevent massive amounts of data from being collected (and from slowing down your system tremendously)

Or if you bring up the activity monitor, there is an even better nifty feature.  Expand the processes and look for the application in question being executed for a specific user and that database.  Then right click on the line and you will get an option to trace the process in Sql Server Profiler!!    

(The first time you run, you may not get all the stats you want.  If not , stop the trace. Click on properties , then the events selection tab. I usually click on show all columns and show all events.  then I go through everything adding the additional data I want to see.  If you want additional data on something already preselected just deselect it and reselect it.

When you are done, run that!  THIS IS VERY IMPORTANT !!!  After you stop it and before you close out, you should save it as a Trace Template with a name.  AND after you save it re-edit it and click on USE AS A DEFAULT TEMPLATE FOR SELECTED SERVER TYPE.  That way the next time you right click and initiate a profiler session this will be the template thats gets used and it makes it much more likely that you will be able to quickly trap the info you need.

Another nice feature is that you can often copy the sql directly in the lower window. if the statement causing the problem is not an update, I paste it directly into a query window and run the analzyer to see the plan.  But many times once I look at the trace, I have apretty good idea of the changes I think need to be made....

Author

Commented:
Hi Guys,

Sorry for the delay in updating this I ahve been caught up with another production issue with a completly different system. Been a bad week....

Okay where I am at is I have run a query that has identified that the Indexes were 100% fragmented. This has been resolved now and we are back down to 0% fragmentation. But we are still seeing timeouts.

I have multiple websites hitting different databases all running on the same database engine and they are all timing out. Could it be that I need to be looking more at the DB engine than the individual databases to check there isn't more of an underlying issue.

Regards,

Nigel
A few things:

1. Do NOT leave the profile running!   Just wanted to make sure that you know this eats more CPU.  Only run it to identfy your problem, then turn it off.

This is a 'just in case...', you thought that leaving it on might be a good idea....

2. And if your index fragmentation was that bad, what is your physical drive fragmentation?

3. Are the statistics on all your indexes current?

4. Do you have auto statistics update set on?  I like this set off when I have high update activity on large tables.  I use jobs to check fragmentation and when needed rebuild and redo stats after the rebuild.  That prevents the stats collection from kicking off during prime time.  Plus I had one third party package that seemd to go into a full re-compile cycle of all its procedures when it detected a stats change. THAT really impacted one of my servers...  Caught the activity when I was monitoring the activity monitor (Resource Waits) while the system was having a CPU spike and saw the compilation wait category go through the roof...
Sorry,  I went back to your original post and saw that you had checked the resource waits and it was on 'Other' which is generally a non-sql related activity.

I have had one odd occurance that exhibited this symptom.  We have a third party package that handles some of our wireless traffic.  In order to give the user a better experience, it 'keeps' the connection live even if the user goes into a dead spot with the expection that they will reconnect transparently.  The downside was that this was causing my data replication process to initiate sucessfully thinking there was a connection and then it waited, and waited and ....   Normally you would think, so what?  It will time out and resolve itself...  But while this condition occurred, I noticed that my CPU activity always spiked.  And the replication cycles were frequent.  

I figured out a work around for my replication cycles to detect a real connection before the real cycle initiated.  

One consultant thought that adding more physical NICs would have also resolved the problem.  But I dislike having to add hardware to support non-existent connections.  It just seems wrong!
Another situation involved an application that was not cleaning up after itself and was initiating new connections for every database call instead of detecting an existing connection and re-using it.  The other connections remained active consuming memory and resources until the transaction completed.

The resolution was to work with the application and have it changed to re-use established connections whenever possible.

Author

Commented:
Hi SJCFL-Admin.

We ahve had a major break through. The nightly import that runs basically has been appending data to a data and not clearing it down when expected. So a table that should have ~300 records had ~6.7Million. I have cleared the un-needed data and hay presto the performance is as I would expect.

One problem that I have though is the Memory Usage  on SQL Server is at 6.4Gb and there is 6.5Gb allocated for SQL Server to use and there is only 8GB physically in the box. I can see that the Ram usage increases during the nightly data import to 95% but then dosnt decrease. I believe this is linked to the TempDB which is 9131MB and space available 9126Mb. Is there anything I can do to shrink the saize of this database without taking the server down?

Nigel
RAM usage should be dynamic. I'd have to go back and read the exact details, but I thought that memory pressure is what causes it to be allocated or released.  Since you had 6.7 million rows in a table, could your system have been loading them all into a huge buffer since if found all that excessive memory just waiting to be used?  And if it had no one else demanding it, would have kept it?  

TempDB is another animal.  It is in constant use.  I would not recommend even trying to shrink it during normal up times.  You found your main problem. It is time to take a deep breath and not cause a different problem by trying to be perfect ;-)
One item I would check however, is what is happening with your transaction logs.  I have had problems with applications adding significant data with recovery set to 'FULL' without taking transaction log backups intermittently.  So the actual size of tempdb may not really be your problem. It may be what is causing temptdb to be used to such a degree.  If there is no data used, it won't be loaded in the buffers and hence should not put any pressure on memory...  (my opinion only...)

I should have explained further about the need for intermittent log backups.  When you hold off and get a huge log backup at one time that large lump of data causes a spike in usage of dasd, memory, etc.  So whenever possible,  prevent it by taking intermediate log backups to handle smaller chuncks at a time and free log space. or chose and alternate recovery model.

Commented:
<< I can see that the Ram usage increases during the nightly data import to 95% but then dosnt decrease.>>
The maximum memory allocated for SQL will not be released immediate it just keeps pages longer in it's buffer cache. 6,5GB of 8GB is already 80%.

You have cleared all those records. Thought about index rebuild, update statistics
Senior DBA
Most Valuable Expert 2018
Top Expert 2014
Commented:
>> I can see that the Ram usage increases during the nightly data import to 95% but then dosnt decrease. <<

In this situation that's actually a GOOD thing.

As noted above, SQL will acquire memory if/when needed, but not automatically release it back to the O/S.  That's because mem acquisition and release is a relatively expensive process.  SQL doesn't want to get mem, release mem, have to re-get mem, etc..

However, if mem pressure elsewhere on the box occurs, SQL may be forced to release mem.

That hasn't happened here.  That is GOOD, because it means you don't have mem pressures outside SQL on that box.



>> Is there anything I can do to shrink the size of [tempdb] without taking the server down? <<

That's tough.  Tempdb almost never shrinks successfully on a live system.

You could try some very limited shrinks, particularly if you have multiple data files (which you should have).  Understand that there is STILL risk with this, and you may cause transactions to fail, and MIGHT even be forced to restart SQL.

For example, say you have 8 tempdb data files 1.2G each.  You could try shrinking all to 1175M.  Then to 1150M.  Etc.  

IOW, if you do try to shrink live, do it *very incrementally*, which gives you the best chance for success.
lol,  I sense a... 'It seemed a good idea at the time" .... coming
Scott PletcherSenior DBA
Most Valuable Expert 2018
Top Expert 2014

Commented:
I've done it before.  It depends on the workload mainly.  Do it at lowest activity for tempdb.

A shrink doesn't inherently destroy data.

Author

Commented:
Ok great. Thanks for the advice.

SQL Server is sitting just under 6.5Gb and there are no other applications or processes running on the server and I have configured SQL Server to run at 6.5GB RAM usage allowing 1.5GB for the rest of the server to opperate. So would it be right in saying then that if I were to increase the physical amount of RAM in the server I can increase the allowance for SQL Server and then ultimately that will speed query process up?

Nigel
It will only spped query processes up if you are seeing memory waits in the activity monitor.  If it is not under memory pressure than adding memory may not help all that much.  But the more memory you have, the more indexes, data pages, etc can stay in buffer preventing I/O which is slower.  So it usually is the best bang for the buck.

But generally speackiing, if you want to improve perforamcne, look at your waits and remove whjatever is the impediment.

Author

Commented:
Excellent. Thanks for that I will go out and order some more RAM now. Just looked up less than £100 to take our servers from 8Gb to 16Gb.

Nigel
Scott PletcherSenior DBA
Most Valuable Expert 2018
Top Expert 2014

Commented:
Yes, usually RAM helps most (up to a certain point, of course).

With only 8G, adding more should definitely help by adding data buffers.  It may also help retain data buffers by having additional RAM available for locks info.

Then, as SJCFL noted, you will need to look at waits / wait stats to determine the best step to take next.

Also, I suggest setting a reduced MAXDOP in the SQL configuration; by default, SQL can theoretically use every CPU for one task.  Otherwise, you often get a lot of CXPACKET waits, as SQL often isn't great at using multiple CPUs on one task.  (After the change, check for CPU wait issues to make sure you haven't reduced MAXDOP too low.)

But, for the heavy query you noted above, explicitly allow it to use more parallelism by explicitly specifing "MAXDOP=<higher_number>" on that query.

Author

Commented:
Thanks Scott,

Here is a view of what is happening in the activity monitor at the moment.Would you say that  those figures are what you would expect to see??

http://screencast.com/t/StMe3Wgj

Nigel
Scott PletcherSenior DBA
Most Valuable Expert 2018
Top Expert 2014

Commented:
Doesn't look the system is busy much.

You might want to look at the pie chart on the "Other" waits to get a better idea there.

Buffer i/o does seems to be a factor, and more RAM should definitely help with that.

Author

Commented:
Hi Scott,

Sorry i'm still new to all this. How do I get the pie chart for this specific Resource wait?

Nigel

Author

Commented:
Thanks Guys,

Adding the additional RAM and doing what you suggested seems to have resolved our issues.

Regards,

Nigel

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial