SQL server Storage system, Selecting the appropriate RAID level

A good server design is one that has no, or very few, single points of failure. Among the most common server components that fail are disks. The commonly used method for providing protection against disk failure is Redundant Array of Independent/Inexpensive Disks, or RAID.

In addition to providing tolerance against disk failure, certain RAID levels increase performance by striping data across multiple disks and therefore distributing I/O load among the disks in the RAID volume. After memory, the disk subsystem is the most important factor affecting SQL Server performance.

Capacity planning
One of the most important tasks is to correctly plan capacity of your server's disk arrays. Generally, disk arrays should have at least 20% free space. Although this will not provide significant performance gains, it can save from future bottlenecks. There are a number of reasons for this: NTFS file systems need some free space to perform optimally - if free space is not available, the NTFS file system is unable to function to its limits. Free space shortage can also lead to more fragmentation, which will slow down server reads and writes.

Nevertheless, in the event of sudden growth of database activity and thus database growth, a DBA will have more time to react and plan disk capacity upgrade.

Total Number of Physical Drives in Each Array
A disk array generally refers to two or more physical disk drives working together as a single unit. With the exception of mirrored arrays (which are two physical drives working together), the more physical drives that are in an array, the faster reads and writes are for that array.

Generally speaking, the more disks that are in an array, the more disk heads there are available to read and write data. SCSI drives, for example, have the ability to read and write data simultaneously. So the more physical drives that there are in an array, the faster data is read or written to the array. Each drive in the array shares part of the workload, and the more, the better. There are some limits to this, depending on the disk controller, but generally, more is better.

Characterizing Application Workloads
Designing a disk subsystem depends on the I/O requirements of the application it will support. The type of workload can be classified in two major categories: online transaction processing (OLTP) and online analytical processing (OLAP).

Online Transaction Processing (OLTP)
OLTP is an application that operates with a high level of create, read, update, and delete activities, and with a large number of user connections. It is typical of production servers to have heavy continuous updates accessed by many users through simple queries that generate random accesses to the disk.

Online Analytical Processing (OLAP)
OLAP is an application that operates with high levels of mostly read activity, with fewer user connections that make more complex and demanding queries. This technology is commonly used in businesses that require management-critical data that is accessed by fewer users for analytical investigation. Although OLAP is mostly involved with read activities, periodically there is write activity that has to finish in a timely manner. Decision Support System (DSS) is another term used for OLAP.

Different raid levels have their pros and cons for handling different workloads. To identify the most suitable configuration, application workload should be evaluated based on these criteria:

Number of random disk I/Os to the data files.

Number of sequential disk I/Os to the data files.

Number of transaction logs generated.

Rate of execution of the operations by users.

SQL server I/O characteristics

Operation  	    |Random/Sequential	|Read/Write	|Size Range				
                      --------------------|-------------------------------------------------------------------------
                      OLTP - Log 	    |Sequential       	|Write		|512 bytes - 64KB			
                      OLTP - Data	    |Random           	|Read/Write	|8K					
                      Bulk Insert	    |Sequential       	|Write		|8KB - 128KB (in multiples of 8KB)	
                      Backup 		    |Sequential       	|Read/Write	|1MB					
                      Restore		    |Sequential       	|Read/Write	|64KB

Open in new window

RAID 0

Reliability/Availability
Despite the name, RAID 0 actually provides no redundancy at all. It involves striping data across all the disks in the RAID array, which improves performance, but if any of the disks in the array fail, then the whole array fails. In that sense, RAID 0 actually increases the chance of failure. Consider RAID 0 as the zero redundancy RAID.

Performance
RAID 0 implements a striped disk array. The data is broken down into blocks and each block is written to a separate disk drive. Best performance is achieved when data is striped across multiple controllers, with only one drive per controller (duplexing).

Random Read - Excellent
Random Write - Excellent
Sequential Read - Excellent
Sequential Write - Excellent

Usage
RAID Level 0 requires a minimum of 2 drives to implement and provides 100% storage efficiency.
Some have suggested that RAID 0 may be acceptable for the tempdb database, given that tempdb starts out empty every time SQL Server is restarted and therefore redundancy of tempdb isn't really important. Although this is true, it's also true that a failure in any of the tempdb disks will cause SQL Server to fail, and you are then faced with rebuilding the disks before SQL Server can be restarted. For most sites, this would lead to an unacceptable outage.

A very good solution for non-mission critical applications, which are supported with a good backup recovery solution and have a very low rate of data changes. Also great for testing and development environments.

RAID 1

Reliability/Availability
RAID 1 is essentially disk mirroring. Each disk in a RAID 1 array has a mirror partner, and if one of the disks in a mirrored pair fails, then the other disk is still available and operations continue without any data loss.

Performance
For highest performance, the controller must be able to perform two concurrent separate reads per mirrored pair or two duplicate writes per mirrored pair.

Random Read - Fair (Worst of the RAID levels but better than a single drive)
Random Write - Fair (Worse than a single drive but better than some RAID levels)
Sequential Read - Fair (Comparable to a single drive)
Sequential Write - Good (Better than other RAID level)

Usage
RAID 1 requires a minimum of 2 drives to implement and provides 50% storage efficiency.
Useful for a variety of SQL Server components, including backups and transaction logs, RAID 1 arrays provide good read performance, and write performance suffers little or no overhead.

The downside to RAID 1 is the lower disk utilization. For every usable disk, two disks are required, resulting in a 50 percent utilization level.

Ideally, the operating system and SQL Server executables, including the operating system's swap file, should be located on a RAID 1 array. Some people locate the swap file on its own RAID 1 array, but I doubt that this really offers much of a performance boost because paging, on a well-configured server, should not be an issue.

Ideally, each separate transaction log should be located on its own RAID 1 array. This is because transactions logs are written to and read from sequentially, and by isolating them to their own array, sequential disk I/O won't be mixed with slower random disk I/O, and performance is boosted.
If databases are very small and all databases can be hosted on single drive, RAID 1 could be used to place database files on it.

RAID 5

Reliability/Availability
RAID 5 addresses the low disk utilization inherent with RAID 1 by using parity to provide redundancy rather than storing a duplicate copy of the data on another disk. When a disk failure occurs in a RAID 5 array, the data stored on that disk is dynamically recovered using the parity information on the remaining disks.

Performance
Although this is the most popular type of RAID storage, it is also not the best option for optimum SQL Server I/O performance. If a database experiences more than 10% writes, and most OLTP databases do, write performance will suffer, hurting the overall I/O performance of your SQL Server. In the event of a disk failure, read performance is also degraded significantly.

Random Read - Excellent
Random Write - Fair (Generally better with larger stripe sizes)
Sequential Read - Very good (Generally, better with smaller stripe sizes)
Sequential Write - Fair

Usage
RAID 5 requires at least three disks. Disk utilization in RAID 5 is calculated as # of drives-1/# of drives. For three disk volumes, the utilization is 66 percent, for five disk volumes, 80 percent, and so forth.
RAID 5's main advantage is higher disk utilization than RAID 1, and therefore a lower overall storage cost; however, the downsides are significant. Each write to a RAID 5 array involves multiple disk operations for parity calculation and storage; therefore, the write performance is much lower than other RAID solutions.

RAID 5 is best used for read-only or mostly read-only databases.

RAID 10

Reliability/Availability
RAID 10 combines the best features of RAID 1 and 0, without any of the downsides of RAID 5. Also known as RAID 1+0, RAID 10 is the highest performance RAID option. RAID 10 offers the high-performance striping of RAID 0 with the fault tolerance of RAID 1's disk mirroring, but without any of the write overhead of RAID 5.

Performance
RAID 10 offers the best performance for SQL Server databases, although it is the most expensive RAID option. The more write intensive the database, the more important it is to use RAID 10.

Random Read - Excellent
Random Write - Very good
Sequential Read - Excellent
Sequential Write - Very good

Usage
The downside of RAID 10 is the cost. Requiring at least four disks, RAID 10 arrays benefit from lots of disks to stripe across, each of which requires a mirror partner. In large deployments, the cost of RAID 10 may be prohibitive for some organizations, with the money perhaps better spent on other infrastructure components.

RAID 10 offers the most advantages to SQL Server and, despite the cost, should be seriously considered for environments requiring both high performance and fault tolerance. RAID 10 arrays are also a good option for transaction logs, assuming they are dedicated to a single transaction log.

File location recommendations

Location of the Operating System

For best performance, operating system files should be on a disk array that does not include the SQL Server data files (MDFs or LDFs). In addition, they should be located on a disk array that supports either RAID 1, 5, or 10.

Location of SQL Server Executables

The location of the SQL Server executables (binaries), like the location of the operating system files, is not critical, as long as they are not located on the same array as the SQL Server data files. Generally, they would be placed on a RAID 1 mirrored array.

Location of Swap File

Assuming that your SQL Server is a dedicated SQL Server, and that SQL Server memory usage has been set to dynamic (the default), the swap file won't see a lot of activity. This is because SQL Server doesn't normally use it a lot. Generally RAID 1, RAID 5, or RAID 10 can be used. It is a better solution to have two swap files: one on C: drive and second on another drive.

Location of the tempdb Database

If the tempdb database is heavily used, consider moving it to an array of its own, either RAID 1 or RAID 10, to boost disk I/O performance. Avoid RAID 5 arrays as they can be slow when writing data, a common side-effect of using tempdb. If tempdb cannot be located on its own array, and you want to avoid locating it on the same array as your database files, consider locating it on the same drive as the operating system. This will help to reduce overall I/O contention and boost performance.

If your application uses the tempdb database a lot, and causes it to grow larger than its default size, you may want to permanently increase the default size of the tempdb file to a size closer to what is actually used by your application on a day-to-day basis. This is because every time the SQL Server service (mssqlserver) is restarted, the tempdb file is recreated to the default size. While the tempdb file can grow, it does take some resources to perform this task. By having the tempdb file at the correct size when SQL Server is restarted, you don't have to worry about the overhead of it growing during production.

Location of System Databases

The system databases (master, msdb, model) don't experience a lot of read and write activity, so locating them on the same array as your SQL Server data files is generally not a performance issue. The only exception might be for very large databases with hundreds or thousands of users. In this case, putting them on their own array can help boost overall I/O performance somewhat.

Location of User Databases

For best performance, user database files (MDFs) should be located on their own array (RAID 1, 5, or 10), separate from all other data files, including log files. If you have multiple large databases on the same SQL Server, consider locating each separate database file(s) on its own array for less I/O contention.

Location of Log Files

Ideally, each log file should reside on its own separate array (RAID 1 or 10, as RAID 5 will slow down transaction log writes). The reason for this is because most of the time, transaction logs experience sequential writes, and if the array can write the data sequentially (not having to interrupt itself to perform other reads and writes), then sequential writes are very fast. But if the array can't write sequentially because it has to perform other random reads and writes, sequential writes cannot be performed and performance suffers.

Of course, having a separate array for each log file is expensive, and often cannot be cost justified. At the very least though, locate all log files on an array (RAID 1 or RAID 10) other than the array used for database files. While sequential write performance won't be as good as if each log file had its own array, it is still much better than trying to contend for disk I/O with data files.

Additionally, there are various other less commonly used RAID levels, including: RAID 2, RAID 3, RAID 4, RAID 6, RAID 0+1, Raid 0+3 and 3+0, RAID 100 (RAID 1+0+0) and RAID 50 (RAID 5+0). More detailed information on these can be obtained from the following sources:

http://en.wikipedia.org/wiki/Non-standard_RAID_levels
http://en.wikipedia.org/wiki/Nested_RAID_levels

SQL server Storage system, Selecting the appropriate RAID level

Comments (1)