<

[Webinar] Learn how to a build a cloud-first strategyRegister Now

x

Managing Fragmentation for the Accidental DBA

Published on
23,477 Points
5,177 Views
13 Endorsements
Last Modified:
Awarded
Mark Wills
Love to Help
Give a man a fish and you've fed him for a day; Teach a man to fish and you've fed him for the rest of his life. Be the teacher
By Mark Wills

We often hear about Fragmentation, and generally have an idea that it is about broken bits, or bad for performance, or at least, is generally not a good thing. But what does it really mean ?

By way of analogy, think of the humble Telephone Directory. It is nicely laid out, sequentially in alphabetical sequence by Name. Think about receiving the next edition, not by replacing the directory, but by giving you the added or altered pages which you append to the end. After a few updates, you will soon find that the end of the directory makes no sequential sense at all and you spend an inordinate amount of time trying to find that elusive phone number. It has become fragmented and your directories performance is most likely measured in terms of how long it takes before it hits the bin.


Types of Fragmentation

With SQL Server systems there are two basic areas of fragmentation: Disk fragmentation, dealing with physical files, and SQL Database fragmentation, primarily dealing with how data is stored inside that physical file. There are a lot of parallels, and we will need to look at both.

Disk fragmentation is more of a problem with the classic IO sub-system architectures, and much less of a problem on the more sophisticated IO system solutions such as SAN. Disk systems are made up of building blocks known as sectors, and collections of sectors known as clusters. When physical files occupy non-contiguous space, or, the growth of a physical file means that it needs to hunt down the next available sector, then it becomes fragmented.

The big issue we often hear about within SQL is Index Fragmentation. But there is also Heap Fragmentation (i.e., tables without a clustered index or Primary Key) and potentially more of a problem. Those areas of fragmentation we see and discuss in SQL is really all about page management - everything SQL revolves around the humble 8K block known as a page, and groups of 8 of them known as extents - we will probably not get down to that level of detail, other than to be aware of the basic building blocks. Conceptually it is similar to Disk with pages and extents akin to sectors and clusters. Not surprisingly, the next few paragraphs, equally apply to Disk and to SQL fragmentation.


Causes and Areas of Fragmentation

Fragmentation, in all cases, is caused by one of two things. The first is the inability to provide sufficient contiguous space at that point in time when it is needed. The second is Data elements become segmented by occupying non-contiguous or fragmented space due to availability. This inability to provide contiguous space has the biggest impact on performance and in turn, causes two types of fragmentation, "external" and "data" fragmentation.

"External" fragmentation occurs to the free or available space. We want it to be kept contiguously so that data can be written without being broken up into different locations. However, if data segments (such as a sector on disk or a page in SQL) are removed at arbitrary places, then the free space is automatically fragmented. In that case, defragmentation means shuffling all that free space (or moving occupied space) into contiguous segments, and conversely, making the available space nice and contiguous as well.

"Data" fragmentation occurs as a result of external fragmentation, but can also occur within a contiguous allocation. It becomes fragmented as new data elements are written into available spaces, and contiguous blocks of information then contain pointers to the next physically available space. This will happen with volatile data, i.e., data that undergoes addition, changes, and deletions - such as a database.

There is also "Internal" fragmentation. This occurs when a data element is unable to be stored in such a way that all available space is used to maximum efficiency. The only real impact in that situation is most likely wasted space more so than anything degenerative to performance. However, needing more space to occupy less data does have some impact on caching. An example and possible exception is the use of Fill Factor in SQL where you take control over the page "density" to accommodate inclusions, effectively reserving free space.


Data Access and Other Considerations

Fragmentation can be further affected by the type of access to those data segments (regardless of Disk, Database, Indexes etc). Random access is largely independent on fragmented space, and actually leads towards fragmentation (i.e., small discrete blocks of data being independently read and/or written). Conversely, Sequential data is hugely dependant on contiguous space and is very much the victim of fragmentation.

For SQL, there are tools to help with the statistics and to manage the various options, but it is of vital importance to learn about the dynamics of the business and the databases in use. For example, a periodically updated, largely "read only" databases like a data warehouse or reporting database will have different requirements on how you manage that database versus a considerably more random database such as a "typical" transactional database. Let's consider our Telephone Directory again, but this time as a database. To use it on line, it becomes hugely random. To print a new book, it will be hugely sequential.

Let us take a moment to reflect on what makes up a database. A SQL Database is made up of three types of file, the two main files are the primary data file with the extension of MDF and the transaction log file with the extension of LDF. The third is an optional secondary data file and has the extension NDF. As users add information into the database, it is stored in the data file. In turn, those transactions and database modifications can be recorded in the Transaction Log.
 
Transaction Logs are sequential in nature (data continually added in a chronological sequence) whereas Data files are essentially random in nature (individual gets or puts from unrelated simultaneous concurrent users). So, from a physical perspective, the types of files in a database means that they are likely to be affected by fragmentation in different ways. Those differences in file types also influence the placement and sizing of those files onto a disk sub-system, even down to the type of RAID and caching methods being used. Kind of a catch-22 in some regards, but sitting, plotting and planning makes it all pretty mechanical at the disk level.

The database files are simply large physical disk files and are built with an initial allocation of disk space. That means, you can reserve a pre-determined size for your database, and then set about filling it up. When creating or maintaining a database, your allocated disk space can be easily controlled, just a matter of knowing how big. Consuming that disk space can get a little more involved and is largely reactionary to the amount and type of activity.


Size your Database

There are a couple of settings in the Database settings which are meant to help with sizing the database, but are potential catalysts for fragmentation. They are Autoshrink and Autogrow. Immediately, without thinking, right now, disable Autoshrink - enough said, thank me later. Autogrow sets the amount of disk that the database files will grow by. When a database or log file is filling up the allocated space, it will decide at some point that it needs more disk and the database engine will grab the space as prescribed in the Autogrow settings. Given time, and maturity of the database, the size and growth rates will become more predictable, ideally to the point were the growth is fully managed and Autogrow is an exception.

Unfortunately, "shrink database" or "grow database" operations are going to unravel and promote disk fragmentation by subsequently and dynamically changing our carefully planned size for the database, unless they are being very deliberately and very specifically performed. That is why we need to disable Autoshrink, and minimise Autogrow instances. If it does need to Autogrow, then you do need to make that incremental growth worthwhile to minimise the number of times the database is being extended. You can apply different growth factors to the individual database files.

Fortunately, SQL gives us the one tool that has maximum impact to help combat Disk fragmentation by being able to allocate space in advance of its use. It is extremely important to do some homework and look at typical growth and then allocate the space required to accommodate normal / predicted growth that is for both the Data and the Log files that make up the Database. When first starting, you will need to determine (erring on the generous side) the amount of disk space that will be required, and monitor that usage so you can then determine the most appropriate size.

For Databases, because they are simply big files, it is important to manage and resolve disk fragmentation before attempting any resolution of internal SQL fragmentation. Like any building, you need solid foundations before you contemplate furnishing the penthouse.  So, create your databases by carefully and deliberately sizing the physical files according to the nature and type of database and associated activity, even to the point of choosing a location (such as different disks and/or multi RAID systems) according to the underlying nature of the individual files of the database. Then once you have defragmented the disk, there will be good continuity of space available and resolves one hugely manageable aspect (and cause) of fragmentation.

There are plenty of tools and utilities around for physical disk defrag, such as contig ( see http://technet.microsoft.com/en-us/sysinternals/bb545046.aspx ). Windows own defrag tool is a cut down version of Diskeeper. On that note we will leave Disk fragmentation alone.


OK, we are now (finally) getting into the nitty gritty of SQL itself.


Internal SQL Fragmentation in Detail

First of all, you need to understand that internal SQL fragmentation might not be as much of a problem as you may think, and vice versa, i.e., a database member maybe more fragmented than statistics might first suggest. If data is largely random, with optimised queries that can achieve a good hit ratio, then there is no real problem. Let's revisit our Telephone Directory Database. For on-line, interactive use, it will be hugely random access and so, fragmentation is not necessarily a problem. If printed out as a book, it becomes sequential and fragmentation is potentially a huge problem.

While still with our Telephone Directory, let us consider having received the first 2 pages of updates (appended to the end) showing a few new entries per alphabetic letter. Now, the few thousand pages (lets say 2000), is nicely structured, and the last 2 pages contains entries from all over the place. Using our tools, it may not be shown as fragmented (i.e., 1 page per thousand) but, in reality, doing a sequential scan, or an index rebuild will mean it shuffles nearly all the pages following the first instance of the letter "A" from the update pages, so it really is hugely and "logically" fragmented.

So, while we can get all the statistics on fragmentation, it will be your business knowledge and the performance of various queries that will be the most likely indicators that fragmentation (within SQL) is causing a major problem. In our Telephone Directory, what if we were to only permit lookup and add of new phone numbers - the interaction is only random in nature, and fragmentation per se may not really be a problem, albeit present.

At this stage we really do need to qualify some of those scenarios above.

With SQL it is important to understand the difference between a table with a Clustered Index or Primary Key compared to tables that don't. The former we will call an Indexed Table, and the latter is called a Heap table. An Indexed table provides a few benefits over a heap such as physically storing the data based on the clustered index, the ability to use the index to find the rows quickly and the ability to reorganize the data by rebuilding the clustered index. A heap table has no particular order of data, and must always refer back to the IAM (even for sequential access), and might have non-clustered indexes, otherwise random access is painful. Now, a clustered index has inherent pointer to row-data. Any subsequent indexes can use this pointer or key-value in their own indexes to point to the underlying data.

Where is all this data and index structure is kept ? On those basic building blocks called pages. Obviously the more compact and "dense" the index information is on a page, the more data can be pointed to, and similarly, the more contiguous pages the greater the density in one extent, improving caching, and minimising the probability of swapping out one page for another.

So, what leads to fragmentation ? It all depends on the INSERT, UPDATE and DELETE activity on your tables with your physical data and how it is distributed across the range of index keys. If only ever adding new chronologically correct data, then fragmentation is less likely to occur at the heap or data level - such as the transaction log. The real impact is non sequential activity, especially updates (particularly if updating columns which are also members of an index) and deletes (leaving "space" or wholes in otherwise contiguous data).

Just as a side note that is one huge attraction with using surrogate keys such as identities or sequential GUID's as primary keys and clustered indexes - they will be sequential and so help with one big aspect of fragmentation.

A query often used to identify fragmentation is :
SELECT object_name(s.object_id) as table_name
     , s.index_id, name, index_type_desc, avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL, NULL, NULL) AS S
INNER JOIN sys.indexes AS i ON s.object_id = i.object_id AND s.index_id = i.index_id
ORDER BY 1,2

Open in new window


That DMV (Dynamic Management View), despite its name, is not just indexes, and really it is a function, not just a view. It will show you a fragmentation percent, obviously higher the percent the more fragmented. But they are statistics and you need to apply your knowledge about the business to help ascertain the impact.

There are a couple of other interesting DMVs for indexes that can be used in conjunction to help clarify how data is being accessed. The main one is sys.dm_db_index_operational_stats. That is beyond the scope of this article. There is however a great article : Making-Sense-of-Index-DMV's-in-SQL-Server.

It is worth noting that the DMVs and Alter Index statements are now the preferred SQL methods rather than the DBCC commands.

Now, fixing fragmentation for indexes is pretty straight-forward. Simply (?) rebuilding or reorganizing the index will fix the problem (more later). Heap is a lot more difficult to fix, but fixable if it has to be... Best possible solution is to simply create a clustered index on either an existing identity, or add an identity column as your clustered primary key, which if you really need to can be dropped afterwards. Otherwise, you will need to get into manipulating or managing the data as a bulk process by either exporting, truncating and re-importing, or copy to a new table and rename etc. Like I said, fixing fragmented heap space can be more difficult.

But you really do need to ascertain not if there is fragmentation, but, if fragmentation is causing problems. You do need statistics on page life expectancy and buffer cache hit ratios as a starting point. Because an index rebuild on a large table can be a very intense and time consuming exercise and really does need careful planning.


Rebuilding (or Reorganising) an Index

So, what is rebuilding or reorganising an index. Well, we are told that the DBCC options are now deprecated and really need to use the newer functions like ALTER INDEX. The REBUILD is essentially recreating the index. If you are running the Enterprise Edition, you can also rebuild online using  WITH (ONLINE = {ON | OFF}), off=table is locked exclusively (the default), on=table can be shared, however, it uses more CPU and it generates a considerable number of locks. Sometimes better to consider a REBUILD as being an exclusive task. When rebuilding, you do need to start with clustered indexes, or, use the ALL option rather than specify each index name. REORGANIZE does not require exclusive locks as it is more simply reorganizing the leaf nodes of the index, and so, can be run online.

ALTER INDEX {<index name> | ALL} ON <table name> {REBUILD | REORGANIZE}

Open in new window


According to Books On Line, it is recommended that we consider rebuild when fragmentation is 30% or greater. In reality that is a very conservative number, and a more general consensus is indexes should be rebuilt when fragmentation is greater than 40%. So, that would mean indexes should be reorganized when fragmentation is less than 40%. In reality, I prefer the 80/20 rule. You will need to do something if fragmentation is above 20%. If is under 20%, then reorganize is the most likely option but only if you see performance being impaired.


Summary

Anyway whatever threshold you choose, and it may be different depending on individual tables, fixing fragmentation is an orchestrated set of events, and can (ideally) be automated, at least for the SQL parts...

Set optimal database size allocations (and locations) by file, do not shrink unless really needed, defrag disk once size has been set (or changed). Look at realistic autogrowth settings.
Run SQL backups - Run transaction log backup frequently to help reduce growth factors and release internal space within the transaction log disk allocation.
Keep "external" fragmentation down. Reorganise your database moving used pages to the front, releasing unused pages at the end.
Use statistics to help identify internal fragmentation and use REBUILD or REORGANIZE. Start with Clustered indexes first, then move on to non-clustered.
It can be more efficient to drop indexes and recreate. Beware of any check constraints and foreign keys.
Ascertain the thresholds where rebuild is required over reorganise.
Set up Maintenance plans to do all the above and have them run frequently enough to manage any possible fragmentation.
If you have a convenient time window (such as "we do not normally work on the weekend"), then use that time and only that time to do any rebuilding and reorganizing.
If you have periodic processes such as an end of month purge or archive, then schedule this type of activity via maintenance plans to follow the periodic processes.

The last point above is probably the most important to consider. Really what you want to do is to set up an automated maintenance task to run in a time window where you can look at restructures in accordance with business activity and processes (such as end of month purges / archiving historical data). Take snapshots of information - that is where you can gain a lot of insight into your business activity impact on a database.

You can then use and modify those maintenance plans to fix disk allocation, alert disk fragmentation, reorganise pages, rebuild indexes, update statistics in that sequence. After all that has been said, you might not need to fix every reported case of fragmentation, but if you have the window of opportunity, then you can also pretty much eradicate fragmentation from your database to the point where fragmentation is always being proactively managed.


Hope you found this article useful...

Special thanks to mwvisa1 as Page Editor and Friend for helping to present this Article in it's best light.
13
Comment
Author:Mark Wills
4 Comments
 
LVL 37

Expert Comment

by:ValentinoV
G'day Mark,

You've done a great job explaining the fragmentation problem, here's another YES vote!

A little remark/correction: DMV stands for Dynamic Management View, not Data MV...

Best Regards,
Valentino.
0
 
LVL 10

Expert Comment

by:lof
Hi Mark,

I need to admit you put me off with this 'accidental' in the subject. but then after your recommendation I have read it all and thoroughly enjoyed it. I try to remember it as it may save some time answering the questions ;)

Thanks
Lof
0
 
LVL 47

Expert Comment

by:David
One major thing you never wrote about .. in the real world people are going to be using a RAID subsystem and/or virtualization.  The RAID configuration, stripe size, chunk size, flush settings, block size, filesystem parameters, caching methodology / buffering settings, read load balancing  are all going to have a much greater effect on overall performance then fragmentation.

But my intention is not to diminish this great article, it is to warn them that you need to factor in the hardware config.  The SQL tools are blissfully ignorant and actually wrong when it pertains to fragmentation based on where the data is on the disk drives.  If you are running SQL inside a VM on a RAID subsystem then you are getting even less useful information, borderline useless if multiple VM share the same disk drives.

In any event, a great read and reference guide. Thank you for writing it.
0
 
LVL 51

Author Comment

by:Mark Wills
Hi David,

Can only agree... But writing about he RAID configuration, stripe size, chunk size, flush settings, block size, filesystem parameters, caching methodology / buffering settings, read load balancing  are all vitally important and would make this more like a mini-book than intended in this Article.

It is a great topic though and would encourage you to write that Article :)

I seem to recall an Article by dlethe, and was going to reach out quite a few years back to suggest writing such an Article....

Cheers,
Mark Wills
0

Featured Post

Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Join & Write a Comment

This video shows how to set up a shell script to accept a positional parameter when called, pass that to a SQL script, accept the output from the statement back and then manipulate it in the Shell.
This videos aims to give the viewer a basic demonstration of how a user can query current session information by using the SYS_CONTEXT function
Suggested Courses

Keep in touch with Experts Exchange

Tech news and trends delivered to your inbox every month