Performance problem in SQL Server 2005, what to do?


I am investigating a messed-up system here, where the SQL Server 2005 64-bit database machine has 14 GB of memory, 4 dual-core CPUs and a SAN disk setup configured with RAID 5. The database is around 52 GB in size. CPU usage seems to be considerably low on average. The database has one data file and one log file.

I am not the most knowledgeable person when it comes to SQL Server 2005 so any advice/recommendation is highly appreciated.

I suspect that the slowness in the system could be caused mainly by internal and external fragmentation of tables/indexes. Executing the following query, I see that lots of tables/indexes have between 60% and 98% average fragmentation.

SELECT object_name(i.object_id) AS tablename,
       i.name AS tableindexname,
       phystat.avg_fragmentation_in_percent
FROM   sys.dm_db_index_physical_stats(db_id(),NULL,NULL,NULL,'DETAILED') phystat
       INNER JOIN sys.indexes i
         ON i.object_id = phystat.object_id
            AND i.index_id = phystat.index_id
WHERE  phystat.avg_fragmentation_in_percent > 30

I ran SQL Profiler for a few hours while the database was performing at its worst, with a large number of users concurrently connected. Then we analyzed the trace file with the Database Engine Tuning Advisor and there were hardly any recommendations. Taking some of the individual queries from the trace file that showed bad response times and running them through the Database Engine Tuning Advisor, I got recommendations promising about 90% improvement in response time, but after applying those indexes and statistics the response time only improved by about 10% or less.

The disk also shows as 98% fragmented.

I would appreciate if someone would help me come up with an action plan or better way of investigating:

-      Should we rebuild or reorganize indexes? If both, which first?
-      Since the server has 4 CPUs, should we recreate the database with 4 data files, or does that only apply when you have 4 disks?
-      If the database were recreated with 4 data files and 4 log files, what is the best way to move data from the old database to the new one?
-      Given that statistics have never been gathered for the database, is there a rule of thumb for how, and how often, statistics should be gathered?
-      Is there a better way to tackle/analyze a performance problem in SQL Server 2005, assuming that the application should perform fine with all its indexes in place and this number of concurrent users?
-      Does anyone have a good script that could prove that fragmentation is causing a lot of the slowness in the database, if that is indeed the case?

Best regards and thanks in advance!
Mark Wills (Topic Advisor) commented:
OK, a quick helicopter view of fragmentation: disk and SQL have similar parallels, starting from a similar foundation of the page (in SQL) and the sector (on disk). Looking at your stats, there is a fragmentation issue that may be causing problems.

There are two basic types of fragmentation: physical / disk fragmentation, and internal SQL database fragmentation.

The two "biggies" in SQL are index fragmentation and heap fragmentation (ie tables without a clustered index or primary key). Those two, in SQL-speak, are really all about page management - everything in SQL revolves around the humble 8K block known as a page, and groups of 8 of them known as extents. We will probably not get down to that level of detail, other than to be aware of the basic building blocks.

Physical disk fragmentation is more of a problem with classic IO sub-system architectures, and much less of a problem on more sophisticated IO solutions such as a SAN. If you do have high disk fragmentation within your SAN system (it would really be in the RAID array or LUN allocations), then you need to start looking at configuration and hardware. Disk systems are made up of building blocks known as sectors, and collections of sectors known as clusters - just like SQL with pages and extents. There is a rule of thumb that as you approach 80% usage of capacity, IO performance will start to degrade significantly - even on a SAN - so be aware of your capacity and available space at all times.

Fragmentation, in all cases, is caused by one of two things. The first is the inability to provide sufficient contiguous space at the point in time when it is needed, forcing the system to consume non-contiguous (or fragmented) space due to nothing more than availability. This inability has the biggest impact on performance and in turn causes two types of fragmentation, "external" and "data" fragmentation (more later). The other cause occurs when the (disk or SQL) system is unable to store data with fine enough granularity for all space to be utilized at maximum efficiency, leaving small pockets of space. The only real impact in that situation is wasted space rather than anything degenerative to performance - so long as that is all it is, small pockets of unused space - but it can mean an increase in the number of pages (or sectors) needing to be cached. An example (and possible exception) is the use of fill factor in SQL, where you take control over page "density" to accommodate future inclusions, effectively reserving free space; but that is very dependent on your database / table design and the structure of indexes.

"External" fragmentation applies to the free or available space. We want free space to be kept contiguously so that data can be written contiguously and without fragmentation. However, if segments (such as a cluster on disk or pages in SQL) of data are removed at arbitrary places (as in a delete), then the free space is automatically fragmented as soon as new data occupies that space. In that case, defragmentation means shuffling all that free space (or moving occupied space) into contiguous segments.

"Data" fragmentation occurs as a result of external fragmentation. If there is external fragmentation, the data may well become fragmented: new data elements are written into whatever spaces are available, and contiguous blocks of information end up containing pointers to the next physically available space.

Fragmentation is further exacerbated by the type of access to data segments (regardless of disk, database, indexes etc). Random access is less dependent on unfragmented space, and actually tends to cause fragmentation, whereas sequential data is dependent on unfragmented space and is the victim of fragmentation.

Within SQL, there are a few tools we can use to help with fragmentation. A lot of it will depend on getting access to a few statistics, and some of it will depend on knowledge of the business and the data. We can help with the statistics and highlight some of the options, but you will need to learn about the dynamics of the business and the databases in use. For example, a periodically updated, largely "read only" database like a data warehouse or reporting database will have different requirements on how you manage its allocation (and probably its recovery model, consequently affecting log files) versus a considerably more random database such as an OLTP-style database (with a need for greater recoverability and consequently more dependence on log files). Then within any one database, you need to understand the inherent behaviour of transaction logs (sequential in nature) versus data files (random in nature), the placement of those files onto a disk sub-system, and even the type of RAID used for the different aspects of data being written to disk. Kind of a catch-22 in some regards, but careful plotting and planning makes it all pretty mechanical at the disk level, with a few very popular rules (such as separate disk systems / storage arrays for system, tempdb, logs and data).

For SQL, it is vitally important to manage and resolve physical fragmentation before attempting any resolution of internal SQL fragmentation. Fortunately, SQL gives us a configuration tool to help with disk fragmentation: the ability to allocate space in advance of its use. Unfortunately, there are lots of situations whereby "shrink database" and "autogrow" operations will unravel this and promote disk fragmentation, by growing the database reactively rather than reserving free space by design to accommodate predicted growth. It is extremely important to do some homework, look at typical growth, and then allocate the space required to accommodate normal / predicted growth - for both the data and the log files that make up the database. Then, when you do defragment the disk, there will be good continuity of space available, and it should be possible to resolve one hugely manageable aspect (and cause) of fragmentation.
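As a sketch, pre-sizing is just an ALTER DATABASE per file. The database name, logical file names and sizes below are hypothetical - check your own logical names with sp_helpfile first:

```sql
-- Reserve space up front so autogrow becomes the exception, not the rule
ALTER DATABASE PRODDB
MODIFY FILE (NAME = PRODDB_Data, SIZE = 80GB, FILEGROWTH = 4GB)

ALTER DATABASE PRODDB
MODIFY FILE (NAME = PRODDB_Log, SIZE = 10GB, FILEGROWTH = 1GB)
```

Growing a file is quick; it is shrinking and tiny reactive autogrows that scatter the file across the disk.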

OK, we are now (finally) getting into the nitty gritty of SQL itself.

First of all, you need to understand that fragmentation might not be as much of a problem as you may think. If data access is largely random, with optimised queries that achieve a good hit ratio, then there is no real problem. Conversely, consider an index based on dates, with say 100 pages covering 1st Jan 2000 through to today, where the last page happens to hold entries covering dates back in 2000: the stats will show barely any fragmentation (only a percentage point, ie 1 page in 100), yet in reality an index rebuild would effectively shuffle nearly all 100 pages - a logical fragmentation of 90-plus pages.

So, while we can readily get statistics on fragmentation, it will be your knowledge of how the business works, and the performance of various queries, that will be the most likely indicators that fragmentation (within SQL) is causing a major problem. A good example is the use of natural keys: look at a customer / order relationship where customer codes are evenly distributed and equally active. With a natural key of customer + order there will be significant inserts for each customer, which will result in index page fragmentation due to data design alone. Fortunately, in this case we can set a fill factor to help with that non-sequential growth (across the range of values); and since the index pages for one customer's orders probably do not relate to a different customer's, the index is pretty much random in nature, so the fragmentation, albeit present, need not be a problem.
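A hedged illustration of that fill-factor control - the index and table names here are made up:

```sql
-- Rebuild leaving 20% free space per leaf page, so per-customer inserts
-- land in reserved space instead of forcing page splits
ALTER INDEX IX_Orders_CustomerOrder ON dbo.Orders
REBUILD WITH (FILLFACTOR = 80)
```

The trade-off is more pages (and so more cache) for the same data, which is why fill factor should follow from the observed insert pattern rather than be set globally.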

Taking a step back for a moment, fragmentation in SQL is not just about index fragmentation. In SQL it is important to understand the difference between a table with a clustered index or primary key and one without. The former can be called a clustered table, and the latter is called a heap. A clustered table provides a few benefits over a heap, such as physically storing the data based on the clustered index, the ability to use the index to find rows quickly, and the ability to reorganize the data by rebuilding the clustered index. A heap has no particular order to its data and must always refer back to the IAM (even for sequential access); whilst it can have non-clustered / non-primary-key indexes, random access is painfully slow without some kind of indexing, and even then it is nowhere near as efficient as a clustered table. Now, a clustered index has an inherent pointer to the row data, and any subsequent indexes can use this pointer or key value in their own indexes to point to the underlying data. Guess where all this index activity is kept - that's right, on those basic building blocks called pages. Obviously the more compact and "dense" the index information is on a page, the more data can be pointed to; similarly, the more contiguous the pages, the more can be cached in one extent, improving caching and minimising the probability of swapping out one page for another.

So, what makes fragmentation? Data being laid down, or index trees being updated, in a non-contiguous manner. Depending on the INSERT, UPDATE and DELETE activity against your tables, your physical data can become very fragmented. If you only ever add new, chronologically ordered data, then fragmentation is less likely to occur at the heap or data level. The real impact comes from non-sequential inserts, updates (particularly if updating members of an index) and deletes (leaving "holes" in otherwise contiguous data). As a side note, that is one huge attraction of using surrogate keys such as identities or sequential GUIDs as primary keys and clustered indexes - they will be sequential and so reduce one big aspect of fragmentation (though the pages can still become fragmented).
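A sketch of that surrogate-key pattern - the table and column names are illustrative only:

```sql
-- Ever-increasing identity as the clustered key: inserts always append,
-- so data pages stay largely sequential instead of splitting mid-range
CREATE TABLE dbo.CustomerOrders (
    OrderID    INT IDENTITY(1,1) NOT NULL,
    CustomerID INT      NOT NULL,
    OrderDate  DATETIME NOT NULL,
    CONSTRAINT PK_CustomerOrders PRIMARY KEY CLUSTERED (OrderID)
)
```

For GUID keys, NEWSEQUENTIALID() as a column default gives the same append-only behaviour that a plain NEWID() does not.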

A query I often use to help identify fragmentation is :

SELECT object_name(s.object_id) AS tablename, s.index_id, i.name, s.index_type_desc, s.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL, NULL, NULL) AS s
INNER JOIN sys.indexes AS i ON s.object_id = i.object_id AND s.index_id = i.index_id

Again, this will show and reflect statistics which you must then interpret as being "significant", and it really does require supporting statistics from perfmon (such as Page Life Expectancy, cache hit ratios, and queue lengths) to determine the real impact. Generally, without those extra insights, it is best to err on the conservative side and treat large numbers as a probable bad thing. The real art is to ascertain not whether there is fragmentation, but whether fragmentation is causing problems.

Now, fixing fragmentation for indexes is pretty straightforward: simply rebuilding or reorganizing the index will fix the problem. A heap is a lot more difficult to fix, but fixable if it has to be... The best possible solution is to simply create a clustered index on either an existing identity, or an added identity column (you can remove it afterwards if you really need to). Otherwise, you will need to get into manipulating or managing the data as a bulk process, by either exporting, truncating and re-importing, or copying to a new table and renaming etc. Like I said, fixing fragmented heap space can be more difficult.
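The two index fixes, sketched against a hypothetical table name:

```sql
-- Heavier fragmentation: REBUILD recreates the index from scratch
-- (offline by default; WITH (ONLINE = ON) needs Enterprise Edition)
ALTER INDEX ALL ON dbo.Orders REBUILD

-- Lighter fragmentation: REORGANIZE compacts and reorders the leaf
-- pages in place - always online, and safe to interrupt
ALTER INDEX ALL ON dbo.Orders REORGANIZE
```

REORGANIZE also honours the index's existing fill factor, while REBUILD lets you specify a new one.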

Fixing fragmentation is an orchestrated set of events, and can largely be automated...

1) Set an optimal database size allocation; do not shrink unless really needed. If you have the space, be generous rather than conservative.
2) Defrag the disk once the database size is set (or changed). Look at realistic autogrowth settings.
3) Run SQL backups - run transaction log backups frequently to help manage growth and release "dirty" transactions.
4) Keep "external" fragmentation down. Reorganise your database, moving used pages to the front and releasing unused pages.
5) Use statistics to help identify internal fragmentation and use REBUILD or REORGANIZE. Start with clustered indexes first, then move on to non-clustered.
6) Ascertain the thresholds where a rebuild is required over a reorganise; a rebuild can take some time. A lot of people use a 60-40 rule.
7) Set up maintenance plans to do all the above and have them run frequently enough to manage any possible fragmentation:
    a) nightly backups
    b) daily (say 15 minutes, half hour, hourly) transaction log backups (in line with your DR plan)
    c)  weekly review / set database sizes, fix physical fragmentation, rebuild/defrag indexes, update statistics
    d) periodic archiving of Old data, keep it clean, keep it relevant
    e) wash, rinse, repeat...
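Steps 5 and 6 can be largely automated. A sketch driven by the measured fragmentation - the 5% / 30% cut-offs here are commonly cited rules of thumb, so substitute your own thresholds:

```sql
-- Build and run REBUILD / REORGANIZE statements per index,
-- chosen by avg_fragmentation_in_percent
DECLARE @sql NVARCHAR(2000)

DECLARE frag CURSOR FOR
    SELECT N'ALTER INDEX ' + QUOTENAME(i.name)
         + N' ON ' + QUOTENAME(sch.name) + N'.' + QUOTENAME(o.name)
         + CASE WHEN s.avg_fragmentation_in_percent >= 30
                THEN N' REBUILD'
                ELSE N' REORGANIZE' END
    FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') s
    JOIN sys.indexes i ON i.object_id = s.object_id AND i.index_id = s.index_id
    JOIN sys.objects o ON o.object_id = s.object_id
    JOIN sys.schemas sch ON sch.schema_id = o.schema_id
    WHERE s.avg_fragmentation_in_percent > 5
      AND i.name IS NOT NULL          -- skip heaps (index_id 0 has no name)

OPEN frag
FETCH NEXT FROM frag INTO @sql
WHILE @@FETCH_STATUS = 0
BEGIN
    EXEC sp_executesql @sql
    FETCH NEXT FROM frag INTO @sql
END
CLOSE frag
DEALLOCATE frag
```

Run it inside the maintenance window; 'LIMITED' mode keeps the scan itself cheap.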

Really, what you want to do is set up a maintenance window where you can look at restructures, if needed running weekly or monthly in accordance with business processes (such as end-of-month purges / archiving of historical data), then use those maintenance plans to fix disk allocation, fix disk fragmentation, reorganise pages, rebuild indexes (if time permits; otherwise it might need to be a reorganise) and update statistics - all performed in that sequence.

momi_sabag commented:
1) Yes, you can rebuild indexes online, so I don't see a reason why not (if it's true that your CPU is not overloaded)
2) I don't think that creating 4 files on the same disk will benefit you
4) That is not true - if auto statistics is set to on, then statistics are gathered automatically. Otherwise you can turn it on and SQL Server will collect statistics for you
5) There is a DMV I like called sys.dm_exec_query_stats which shows, for every SQL statement, how many resources it consumes. I would use that DMV (read about it in Books Online) to find the top 10 consuming queries and start with them
6) Check out sys.dm_db_index_operational_stats
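A sketch of such a top-consumers query against that DMV (total_worker_time is reported in microseconds):

```sql
-- Top 10 cached statements by cumulative CPU time
SELECT TOP 10
       qs.total_worker_time / 1000 AS total_cpu_ms,
       qs.execution_count,
       qs.total_logical_reads,
       SUBSTRING(st.text, qs.statement_start_offset / 2 + 1,
                 (CASE qs.statement_end_offset
                       WHEN -1 THEN DATALENGTH(st.text)
                       ELSE qs.statement_end_offset
                  END - qs.statement_start_offset) / 2 + 1) AS statement_text
FROM   sys.dm_exec_query_stats qs
       CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
ORDER BY qs.total_worker_time DESC
```

Swapping the ORDER BY to total_logical_reads gives the top IO consumers instead, which may matter more on this RAID 5 setup.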
Mark Wills (Topic Advisor) commented:
Well, rebuilding indexes online does need the Enterprise edition, I think...

If there is that much fragmentation, then yes, it is going to cause a problem. I am surprised to see a SAN reporting so much fragmentation; part of the benefit of having a SAN is significantly more advanced disk management.

RAID 5 is not the best for write performance, and if it is configured as the only "logical" array of 4 disks for that server, even on a SAN, there is a bit more competing for IO than should be necessary. Creating separate tempdb files per processor really means per physical processor, and given your disk scenario, I would not be doing it.

What you can do is use the Maintenance Plans... Set up at least three:

1) Full (nightly) backup
2) Transaction logs (quarter or half or hourly) backups
3) Maintenance - index rebuilds, statistics, reorg pages, check database sizing - probably weekly, when there is an opportunity to rebuild indexes - it can take a while.
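In T-SQL terms, the first two plans reduce to the following (the database name and backup paths are examples only, and log backups require the FULL or BULK_LOGGED recovery model):

```sql
-- 1) Nightly full backup
BACKUP DATABASE PRODDB
TO DISK = N'F:\Backups\PRODDB_Full.bak'
WITH INIT

-- 2) Frequent transaction log backups - these also allow the
--    log space to be reused, keeping log growth in check
BACKUP LOG PRODDB
TO DISK = N'F:\Backups\PRODDB_Log.trn'
```

The Maintenance Plan wizard just wraps and schedules statements like these.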

With a database your size, I would suggest relying on the maintenance plans more so than auto-stats; try to beat the beast into your control, running the way you want it to...

Before you tackle internal SQL fragmentation, you need to address that disk fragmentation, and also make sure that you have caching set on for the SAN / RAID array. Then you can tackle index fragmentation.

As a general tool, you will need to gather some statistics with perfmon. Probably the biggies will be Cache Hit Ratio: SQL Plans should be better than 70%, Buffer Cache Hit Ratio should be better than 95%, and Page Life Expectancy should be better than 300; then there are the disk queue lengths and processor queue lengths. So there are a few stats worth gathering...
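The buffer-manager counters can also be spot-checked from inside SQL Server via sys.dm_os_performance_counters (the object_name prefix varies with the instance name, hence the LIKE):

```sql
-- Headline memory counters; the actual hit ratio is ratio / base * 100
SELECT counter_name, cntr_value
FROM   sys.dm_os_performance_counters
WHERE  object_name LIKE '%Buffer Manager%'
  AND  counter_name IN ('Page life expectancy',
                        'Buffer cache hit ratio',
                        'Buffer cache hit ratio base')
```

This is a point-in-time snapshot, so for trends you still want perfmon logging over the busy period.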

Have a read of :

After reading the above, you might see that this is potentially a big, big topic, so is there something more specific you would like to know about ?

You might want to look at contig.exe (a free download); Contig.exe makes a file contiguous, so you can defragment a .mdf, etc.
peturgretars (Author) commented:
Thanks to all comments so far.

I will have a thorough read through all those answers. Hopefully I can come up with a good action plan based on them. The system is just so slow that some applications are crashing from time to time because of bad database performance, and I am trying to figure out where and how best to start tuning. I need some improved performance soon. How can I find the tables/indexes that should be rebuilt first, for example, or is there something else you would start with? I have already taken one of our biggest tables and rebuilt its indexes; it took some time and hasn't shown significant change.

I would appreciate hearing how and where you would start tuning poor database performance in a 24/7 system, knowing that you can't change the code/queries.

Mark Wills (Topic Advisor) commented:
Well, I would start with indexes and statistics; it will be hard to physically defrag if you cannot take the database offline - the files will be locked.

I would look into some of the more popular queries if at all possible, and try to identify any slow-running queries, or heavily locking queries... Then, using those as a basis, look at : and walk through the first topic area (the checklist)...

But you will need those statistics - they will help distinguish IO bottlenecks from CPU and memory bottlenecks.

Don't forget, it could also be hardware / network configuration issues as well (even things like byte-level virus checking can kill a machine).

Consider the following chart, and also have a read of the more detailed Word documents from Microsoft :

Performance Waits and Queues :
Trouble Shooting Performance :

peturgretars (Author) commented:
Few things I would like to get clarified

1. As this is a multi-processor server (4 x Intel(R) Xeon(R) X5355 @ 2.66GHz), from what I have read on the internet I understand that it would be best to create a new database with as many data files (all the same size) as you have processors and transfer the data from the old database into the new one. See for example

Momi Sabag, you say that this won't benefit me. Why is that? And what about tempdb - does the same thing apply there as well?

2. I should have mentioned that one of the first things we noticed was that database autogrowth was configured as "By 1 MB, unrestricted growth". Isn't this a sure indication that the database could be very heavily fragmented, as we have already seen, and could it be that rebuilding indexes isn't really enough to fix that situation?

3. If rebuilding indexes isn't enough, what do you recommend doing? Is there a way to rebuild the data inside the data file?
Mark Wills (Topic Advisor) commented:
1. Yes, that is a school of thought: getting some parallelism happening across the processors, with a view to multi-tasking the IO. But it really does depend on so many things, and it is really getting towards a last tactical move; there are a few considerations like hyperthreading to take into account, and playing with the affinity mask. And really, are you seeing big CPU problems, or any one CPU being thrashed? Do you have the IO bandwidth to manage more computing power throwing itself at the array?
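If you do go down the multiple-file path, files can be added to the existing database rather than recreating it. A sketch - the file name, path and sizes are hypothetical:

```sql
-- Add a second data file to the default filegroup (names/paths are examples)
ALTER DATABASE PRODDB
ADD FILE (
    NAME = PRODDB_Data2,
    FILENAME = N'E:\SQLData\PRODDB_Data2.ndf',
    SIZE = 10GB,
    FILEGROWTH = 1GB
)
```

Note that SQL Server only spreads new allocations proportionally across the files; existing data stays where it is until indexes are rebuilt.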

2. The database and transaction logs should be examined and adjusted to an optimal size that takes into account "normal" organic growth. There will always be the odd occasion where some significant event throws out the size, but those are the exception and should be treated as such. A reasonable growth setting is 10 percent - assuming you do have the disk space.

3. De-fragmenting heap data is challenging. Heap tables are those without a clustered index. The easiest way to defragment one is to add an identity column as a clustered primary key; that will keep it "clean" without having to change too much of the rest of the database (ie it should be all but transparent - you might need to change some INSERT statements if you have an insert that does not specify column names).
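As a sketch - the table and column names here are hypothetical:

```sql
-- Add a surrogate identity and promote it to a clustered primary key;
-- building the clustered index rewrites (and so defragments) the heap data
ALTER TABLE dbo.OrderLog
    ADD OrderLogID INT IDENTITY(1,1) NOT NULL

ALTER TABLE dbo.OrderLog
    ADD CONSTRAINT PK_OrderLog PRIMARY KEY CLUSTERED (OrderLogID)
```

Budget maintenance-window time and free space for this: the whole table is rewritten in clustered-key order.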

You must gather some statistics to ascertain where the real problem is.

It could end up being the RAID controller queueing disk IO - are there any error messages in the SAN / RAID logs?

peturgretars (Author) commented:
Attached is a result from the following query; the 9th (I) column shows avg_fragmentation_in_percent:

---These indexes should be rebuilt, not reorganized
--- Alter Index ... Rebuild
DECLARE @dbname sysname
SET @dbname = 'PRODDB'

select *
from sys.dm_db_index_physical_stats (db_id(@dbname),null,null,null,'DETAILED')
where avg_page_space_used_in_percent > 50

Would be interesting to hear your opinion on this result.
Thanks so much!

peturgretars (Author) commented:
Thanks so much for these excellent answers you provided!