
Storage Explored, Explained, and Exampled



Storage demand is ever increasing in every organization.  Legal requirements put an even greater burden on us by requiring us to keep data for several years - in some cases for decades - and all of it has to be stored on something.  Naturally, for everybody out there, the biggest factor driving storage purchases is cost.

This article is here to discuss many of the various storage technologies which exist today, the pros and cons of each and the things to consider - or not to consider - when choosing your next storage strategy.

This article references servers in most places, however the same technology can be used on Desktops and Laptops.


All products and/or brand names mentioned in this article are used only as examples; their mention does not in any way endorse said products, nor does it imply that they are products which will meet your requirements.  You should do your own product comparisons before making a decision on which product to purchase.


This article will use terms which you may be unfamiliar with, so let's start off by discussing some of the terms we will be using inside this document.

Enclosure.  An enclosure is a computer component built specifically to house hard drives.  These are also known as Canisters by some vendors.  Enclosures can be as small as a mobile phone - housing a single laptop-size hard drive - or large rack-mounted units housing dozens of drives.
Controller.  A controller is a piece of hardware which controls the hard drives - and provides the communications platform for Servers to be able to talk to the hard drives.  You will ALWAYS have to talk to a hard drive via a controller, and the controller is either part of the server (using what's known as a Host Bus Adapter), a part of the storage subsystem, or in the case of Fiber Channel, a combination of both.
RPM.  RPM stands for "Revolutions per Minute".  It is the speed at which the platters inside a hard drive spin.  Generally speaking, the higher the RPM, the faster the performance of the hard drive.
Latency.  Latency refers to the delay between the controller sending a command to the disk and the disk actually performing that command.  The lower the latency, the faster the performance.  Slower RPM hard drives will therefore normally have a higher latency than faster RPM hard drives.
IOPS.  IOPS stands for "Input/Output Operations Per Second", and basically means the number of I/O requests made by the filesystem to the underlying storage subsystem.  Storage subsystems will vary in the number of IOPS they can deliver, depending on the interfaces used, the latency and the complexity of the storage subsystem.  A "good" subsystem should be able to cope with 25000 sequential IOPS without breaking a sweat.
USB.  USB stands for "Universal Serial Bus".  It's one of the most common technologies used when connecting low cost storage, and hopefully, if you are reading this article, you're at the point of moving past USB storage.
SATA.  SATA stands for "Serial ATA" (Serial Advanced Technology Attachment).  Generally speaking, SATA is slow to medium speed storage, and is usually not used for high demand storage platforms.  SATA drives generally come in speeds ranging from as low as 5400 RPM up to 10000 RPM.
SAS.  SAS stands for "Serial Attached SCSI".  SAS has become the de facto standard for storage subsystems and provides fast storage throughput due to its high RPM speeds of usually 10 to 15 thousand RPM.
FC.  Fiber Channel is less commonly seen as a hard drive interface, but very commonly seen as a communications platform to communicate with enclosures.
Interface.  Hard drives will have a specific interface, which defines their maximum possible throughput.  The three items above list common interfaces, but be aware that it is possible to have a drive presented via a different kind of technology.  For example, SATA hard drives can be presented via USB.  As another example, SAS hard drives can be presented via Fiber Channel.
Fabric.  The Fabric refers to the fiber channel going from the storage to the server, and the term is generally only used when you involve a Fiber Channel switch.
FCoE.  Fiber Channel over Ethernet is exactly what it says on the tin - it's Fiber Channel, encapsulated within Ethernet frames so that you don't have to use fiber cabling or a fiber channel switch - the latter being the most costly option.
iSCSI.  Internet Small Computer System Interface - or iSCSI - is not an interface in the hardware sense, but rather a method of communicating with storage subsystems over a standard network.  While generally used in low cost environments, it does provide very good throughput and performance, and should not necessarily be dismissed in environments where storage speed is crucial.
SSD.  SSD Stands for "Solid State Disk", which is basically a unit full of memory chips and no platters or other moving parts.  As a result of this, the latency is very low, and subsequently, the speed is extremely fast.
JBOD.  JBOD stands for "Just a Bunch of Disks".  You would only use this term when you buy an enclosure which has many disks, and you use each disk individually as a physical hard drive, rather than combining them into a RAID.
RAID.  RAID originally stood for "Redundant Array of Inexpensive Disks", which was a contradiction by itself since the first RAID systems used the most expensive disks on the planet.  The term has changed over time and is now known as "Redundant Array of Independent Disks", and one could argue that that term is also a contradiction.  RAID is explained further below in its own section.
Logical Disk/Logical Volume.  A logical disk is a presented subset of a RAID array.  For example, you can have a RAID array which contains 10 physical disks, split up into 4 logical disks.  Each logical disk will show up as one physical hard drive to the operating system to which it is presented.
Spindle.  A spindle is a collection of one or more disks which have to operate together to provide storage.  In a JBOD environment, each disk is a single spindle, so when writing to multiple disks at once, the performance is high as each disk works independently from the others.  In a RAID environment, each array is a single spindle, because each disk inside the array has to work together to achieve the read/write operation.
Partition.  Physical disks and logical disks have to be sectioned up, and their parameters defined, in order for an operating system to start using them.  This is necessary because the operating system has no idea how you want to use the storage.  You could, for example, have a 100 GB logical disk presented via iSCSI - which may have been part of a 500 GB array - but you want to segment it further so that the operating system uses 40% of it for the C: drive, and 60% for the D: drive.  You're probably wondering why somebody would want to do this, and a perfect example is if you wish to encrypt the D: drive, but not encrypt the C: drive.  By using partitions, you can create segments within a physical (or logical) disk and use them in different ways.
Filesystem.  A filesystem is a structure within a logical or physical disk which defines where file and folder content is stored.  It holds an index of each file and folder; requests from the operating system or applications are made to the filesystem, which then goes to the appropriate locations and retrieves the data.  Think of it like a Librarian who has sole access to the library, and fetches or deposits the book on your behalf.
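To tie the RPM, latency and IOPS terms together, here is a rough back-of-the-envelope sketch of the random IOPS a single spinning disk can deliver.  The seek times used are illustrative assumptions, not vendor figures:

```python
# Rough estimate of the maximum random IOPS of a single spinning disk,
# derived from its rotational latency and an assumed average seek time.

def estimate_iops(rpm, avg_seek_ms):
    # Average rotational latency is the time for half a revolution.
    rotational_latency_ms = (60000.0 / rpm) / 2
    service_time_ms = avg_seek_ms + rotational_latency_ms
    return 1000.0 / service_time_ms

# Assumed seek times: 8.5 ms for a 7200 RPM SATA drive,
# 3.5 ms for a 15000 RPM SAS drive.
print(round(estimate_iops(7200, 8.5)))    # 79
print(round(estimate_iops(15000, 3.5)))   # 182
```

This is why higher RPM drives deliver more IOPS: the rotational latency shrinks as the platters spin faster.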


RAID

RAID is pretty important when making your decision on storage, because it will affect the performance of your storage subsystem, and consequently, all IO requests by your servers.  While I won't discuss how RAID works in great detail (there is a link below if you want to find out more), I will cover the basics, and more importantly the different RAID levels which will drive your storage strategy.

RAID Basics

RAID works on the premise that a collection of disks can be made redundant by implementing a mathematical operation whose result can be used in the event of a disk failing, so that no data is lost.  This mathematical operation is a simple binary XOR, and the resulting value is known as a "Parity bit".  Should a disk fail, the parity bit is used to calculate what would have been on the failed disk, so your data is not lost.  There are, of course, a few exceptions to this rule, and you'll see them as we go through the different RAID levels.
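The XOR parity trick can be shown in a few lines of Python.  This is a simplified sketch - real controllers work on whole blocks across many disks, not single bytes:

```python
from functools import reduce

# Three data "disks", each holding a single byte for simplicity.
disks = [0b10110010, 0b01101100, 0b11100001]

# The parity value is the XOR of all the data disks.
parity = reduce(lambda a, b: a ^ b, disks)

# Simulate losing disk 1: XOR the surviving disks with the parity
# to recalculate exactly what was on the failed disk.
survivors = [disks[0], disks[2]]
reconstructed = reduce(lambda a, b: a ^ b, survivors, parity)

print(reconstructed == disks[1])   # True - the lost data is recovered
```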

Each RAID group you create is known as an "Array".  It is possible (indeed, very likely) to have multiple arrays inside a storage subsystem.  Each array is also one spindle.

Each member inside the array should be of the same speed and size as the others.  If one drive is smaller than the rest, then the array can only be built up to the size of the smallest disk.  For example, if you have 2 x 147 GB hard drives and 1 x 300 GB hard drive, and build them into the same array, you can only get the effect of 3 x 147 GB hard drives.  Some proprietary systems use other methods to make the most use of disks in these kinds of situations, but it's not true RAID.

RAID Levels

RAID 0.  The zero kind of indicates that there is no redundancy, and that's actually the case.  RAID 0 is a method in which data is "striped" across disks, thus providing faster performance, because the data being written is divided between all the disks in the array.  This is the fastest performing RAID level you can have, because there is no parity calculation involved - but it's also the most risky, because if one disk fails, you lose the entire array.  Losing an array means you lose all your data.

RAID 1 is the other kind of RAID technology which doesn't use Parity.  Instead, it uses a technology known as "Mirroring".  Anything that is written to one drive is written to another drive at the same time.  Therefore, if one drive fails, the data is not lost because it exists on the other drive.  The downside to this is that you always have to have double the amount of disks to get the same capacity.  RAID 1 is typically used in servers, as the operating system disk.  If one disk fails, the operating system is unaffected, and the server continues to run without any downtime.  The system administrator cheerfully replaces the failed disk, and the disk controller rebuilds the new disk with the data from the other disk in the mirror, and everybody is happy again.

RAID levels 2 to 4 exist, but are not used very much, so I won't go into detail.  See the Wikipedia article on RAID if you want to know more.

RAID 5 - The most common kind of array in most storage subsystems.  RAID 5 uses the parity technology we briefly touched upon earlier.  RAID 5 is great because it provides high performance and redundancy, all in one array.  Theoretically speaking, the more disks you have in your array, the faster the throughput you will get - but don't be fooled!  Remember that each array is a single spindle, which means limited IO streams.  In a RAID 5 array with 5 disk members, you would get the capacity of 4 disks, because one disk's worth of capacity is used for parity (caveat - the parity is distributed amongst all disks, not kept on a single disk like it is in RAID 3).  Therefore, if you have 5 x 300 GB hard drives, you will get an effective 1.2 TB array (300 GB x 4).

RAID 6 uses the parity technology we briefly touched upon earlier, but with a twist.  Instead of one disk's worth of parity, two disks' worth of parity are maintained, using a second, independent parity calculation.  Think of it like RAID 5 with extra protection, because that's exactly what it is.  In a RAID 6 array with 5 disk members, you would get the capacity of 3 disks, because two disks' worth of capacity is used for parity.  Therefore, if you have 5 x 300 GB hard drives, you will get an effective 900 GB array (300 GB x 3).
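The capacity arithmetic for both parity levels can be sketched as follows, using the 5 x 300 GB examples above:

```python
# Usable capacity of a parity-based array: RAID 5 gives up one disk's
# worth of capacity to parity, RAID 6 gives up two.

def usable_capacity_gb(num_disks, disk_size_gb, raid_level):
    parity_disks = {5: 1, 6: 2}[raid_level]
    return (num_disks - parity_disks) * disk_size_gb

# The 5 x 300 GB examples from the text:
print(usable_capacity_gb(5, 300, 5))   # 1200 GB (1.2 TB)
print(usable_capacity_gb(5, 300, 6))   # 900 GB
```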

Combining RAID Levels

It is common practice to combine RAID levels to provide additional benefits, and the common ones are shown below:

RAID 1+0 - also referred to as RAID 10 - is when you take mirrored pairs (RAID 1) and then stripe data across them (RAID 0).  The benefit of this is that you are taking a redundant array (RAID 1) and making it faster by adding another redundant array, then striping the data between the two.

RAID 0+1 is almost the same as RAID 1+0, except that you're taking a striped array and then mirroring it to another striped array.  The performance of the two is identical, and you'd probably wonder why you'd do it this way rather than the traditional 1+0.  The only time this would be applicable is if you have two enclosures - by striping the first enclosure, you can mirror it to the second - that way, if either enclosure fails, you still have access to your data.  If you used RAID 1+0, it's likely that you would lose your array in the event of an enclosure failure.

RAID 5+0 is taking two RAID 5 arrays of equal size, and then striping data across them (RAID 0).  Superfast throughput, but you are limiting the number of IOPS because you're increasing the spindle size.

RAID 6+0 is taking two RAID 6 arrays of equal size, and then striping data across them.

While it is theoretically possible to stripe a RAID 1+0 set (thus giving you RAID 1+0+0), it's very hard to do, and pretty much requires midrange or better storage subsystems.


RAID is not backup.  For the love of all that is holy, please do not use RAID as a backup strategy, because IT IS NOT.  RAID is designed to provide redundancy and resilience when a single drive (or in some cases, 2 drives) fails.  It will not protect against catastrophic failures (such as power surges blowing out all of the drives in the array), and it will also not protect against user error (users deleting their data by accident).

RAID is NOT Backup - always have a backup strategy in addition to your storage strategy.

Storage Subsystems

This section will cover the possible methods in which your servers will attach to your storage.

DAS (Direct Attached Storage)

Direct attached storage is storage which is attached directly to a server or workstation, where the computer operating system has direct control of the storage hardware.  Common DAS solutions include:

* USB Pen Drive (also known as a Memory Stick)
* USB Disk Drive (one or more hard drives inside a casing, presented via USB)
* e-Sata Disk Drive (a hard drive inside a casing, presented via e-sata interface)
* Firewire Disk Drive (a hard drive inside a casing, presented via firewire interface)
* SAS Disk Array (An array of disks presented via SAS either as JBOD or RAID)

The last one listed is interesting - it is indeed a whole bunch of disks in an enclosure, and it's connected directly to a server via SAS.  The difference between this and a proper SAN (which we'll get to shortly) is that this - and every other DAS device - does not have its own controller.  The controller used would be the one in the server itself.

NAS (Network Attached Storage)

Network Attached storage is quite simply, a self-contained box which provides storage connectivity to clients via the Network.  NAS's contain their own operating system and own filesystem, and clients retrieve data using a network share.  A NAS can also be a server which has a bucketload of storage attached to it, and makes this storage available to clients.  In fact, a good, cheap solution for huge amounts of storage is to get a server which has 24 hard drive slots, configure it to use hardware RAID, and then create network shares for other servers to connect to.

SAN (Storage Area Network)

A Storage Area Network involves any Storage subsystem which presents storage to servers as physical storage - and not as a filesystem.  SAN's are therefore either FC, FCoE, or iSCSI.  It is then up to the server to allocate the storage how it sees fit, and create its own filesystems thereupon.  SAN's offer much greater flexibility over NAS's, but do come with more networking demands - for example, a Fiber Channel SAN will require a Fabric installation, and will almost always include two or more SAN Switches (2 for redundancy).  High performing iSCSI SAN's will probably be using 10 Gigabit Ethernet ports, and you WILL have to invest in 10 Gigabit ethernet switches as a result.  SAN's also generally perform a lot better than NAS's because there is no overhead of a filesystem in the back-end, since each server maintains its own filesystem.

Differences between accessing storage via a NAS, DAS or SAN

As previously mentioned, but not necessarily understood - a NAS, DAS and SAN present storage in different ways, so we're going to have a look at this and see what the differences are.

A NAS has full control over its own disks, and implements its own filesystem.  There are many different kinds of filesystems, the most popular ones as follows:

FAT  File Allocation Table - is a very basic and limited file system which, amazingly, is still in use today.  Many USB pen drives will come preformatted with FAT.  FAT comes in varying flavours, with FAT16, FAT32 and exFAT being the most commonly used today.  FAT is also supported by Linux, Unix, Windows and Macintosh based operating systems, and is therefore the most widely supported file system there is.

EXT  Extended Filesystem - is primarily a Linux filesystem which uses a technique known as journaling, which helps prevent data loss because each write transaction is logged.  In the event of a disk failure, the filesystem can check whether the logged data was in fact written, and ensure that the file system is in a healthy state.  The Extended Filesystem has undergone various generations, and you may see ext2, ext3 and ext4 being used in some places (ext2 being the exception on journaling, as it predates the journal).  ext3 and ext4 are in fact the most popular filesystems for most NAS devices I've seen.

NTFS  New Technology File System - is a proprietary filesystem used by Microsoft operating systems which also uses journaling.  NTFS provides a lot of features which are part of the filesystem, including compression, encryption and, most importantly, Access Control Lists, which provide quite an extensive set of permissions for both files and folders.

In order to access these filesystems, servers and workstations (referred to as Clients) have to connect to the storage using a network share.  The network share is generally one of the following two:

SMB/CIFS  Server Message Block - also known as Common Internet File System - is the standard file sharing protocol used by Windows operating systems.  Many Linux distributions include an SMB capable server (usually Samba), and because the operating system is free and the protocol is widely documented, it's a popular platform for NAS devices to use.

NFS  Network File System - Developed by Sun Microsystems - is a very popular Unix based file sharing solution.  This is less commonly seen, although Windows clients do support connecting to NFS shares.

Because a NAS has its own filesystem, whatever servers connect to it are forced to comply with the filesystem being used.  This also unfortunately means that if the filesystem being used is EXT - for example - then the clients connecting to it do not get the benefits of NTFS.

Additionally, all clients must therefore work at a file level, rather than at a hardware level, which makes a NAS completely unsuitable for high workloads.

A DAS and a SAN, however, present their storage to servers at a physical layer - which means that the server has full and unconditional access to the storage, and can do with it what it likes.  Servers must therefore create their own partitions and implement their own filesystems to be able to use the storage.

Different partition types are available to servers, and this can get quite complex, but I'll just tell you about three that you would most likely encounter.

MBR stands for Master Boot Record.  It's one of the oldest and most widely used methods to segment a disk.  Traditionally, only 4 partitions can exist within a disk which has been initialized as MBR, but thanks to funky trickery, you can turn one of those partitions into what's known as an EBR (Extended Boot Record) which allows you to have more partitions.  EBR is no longer a preferred method to have many partitions - since GPT came along.

GPT is GUID Partition Table (GUID being "globally unique identifier"), where you can have a large number of partitions, and extremely large partitions too.  GPT removes a lot of the limitations that MBR had.  You should, however, be aware that some servers do not support GPT disks as their boot disks, so for safety's sake, you should keep your operating system disk as MBR.  Once the operating system has booted, any GPT disks you have can then be used.

Dynamic Disks are not a partition type per se, but a proprietary Microsoft technology which extends the usability of MBR and GPT disks.  It's used to provide software redundancy features such as mirroring, striping and parity.  It also gets around the limitations that MBR imposes, by layering a new storage structure on top of the existing one.  Use Dynamic Disks with caution!  If you run into problems and your disks become unusable, recovery will be very tricky, since Dynamic Disks are only supported within Windows operating systems.

Examples of Storage Strategies

There are many factors one needs to weigh before one can decide on a storage strategy, for example:

Cost per gigabyte
Performance (MB/sec)
Performance (IO/sec)
Total Cost of Ownership (TCO)

Those are the obvious factors, but what's not so obvious are the questions which follow:

What kind of uptime is required - in a 24/7 environment you get little or no opportunity for maintenance
How much will it cost us if the storage becomes unavailable
How long will it take to back everything up
What extra features do we need to consider - volume snapshots?  Encryption?  Spanning across Datacenters?
Can I grow my storage

Whenever you are considering a storage subsystem, always think ahead - don't get something that will suit your needs right now, but something that will suit your needs in 5 years' time, because that's how long you are going to want your storage to last.

With the above in mind, let's go ahead and discuss possible scenarios, and show the pros and cons of each:

Low Cost, High Storage Capacity

A simple low cost, high storage capacity requirement will more than likely steer you down the route of getting a NAS.  The more popular NAS solutions will feature 4 or more drive bays, and if you consider that SATA hard drives are at a whopping 4 TB right now, you can easily get 12 TB of useable space in a 4 drive NAS.  The NAS I have at home has 10 drive bays, and it's even expandable by another 5, giving me a total of 15 drives.

Of course, keeping in mind that at any time one of those drives could fail, to be on the safe side I would create 3 x RAID 5 arrays of 5 drives each, giving me a total capacity (assuming 4 TB SATA hard drives) of 48 TB.  If you didn't do this and used all 15 drives in 1 RAID 5 array, you would get 56 TB of useable space.

That's certainly a lot of data.


Pros:

* Low overall running costs
* Requires no server
* Low noise, mostly inaudible
* No cooling requirements


Cons:

* Fast throughput, but low IOPS.
* Little or No control over Filesystem
* File or Directory based permissions very limited
* Not suitable for Exchange or SQL
* Not suitable for Virtualization or Clustering

Example Product: Synology DS 1512+ with 2 optional expansion units

Low Cost, High Storage Capacity, High Usability

One of the most popular trends for high capacity and high performance storage is to implement servers which already have high capacity storage capabilities - either as part of its own server enclosure, or as expansion enclosures.  Consider if you will the following scenario:

1U server with SAS Raid Controller
4U Storage Chassis supporting up to 45x HDD

Alternatively, you could combine the two into one as follows:

4U server with 36x HDD support

As I've stated before, we wouldn't want to put all of our eggs in one basket - or more importantly, all of our disks in one spindle - so we would opt for the following configuration (assuming you go for the single unit which supports 36 HDD's):

2 x disks RAID 1 for the Server's operating system
8 x disks in RAID 5 - Array 1 (one spindle)
8 x disks in RAID 5 - Array 2 (one spindle)
8 x disks in RAID 5 - Array 3 (one spindle)
8 x disks in RAID 5 - Array 4 (one spindle)
2 x disks as Hot Spare

The Hot Spares are there on standby, so that if one of the drives in any of the arrays fails, the hot spare takes over that drive's function until the failed drive is replaced.  Depending on your RAID controller, this should happen automatically.
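The hot spare behaviour can be modelled in a few lines of Python.  This is a toy sketch - real controllers do this in firmware, and the rebuild itself takes hours, not an instant:

```python
# Toy model of hot-spare promotion: when an array member fails,
# the controller promotes a standby spare and rebuilds onto it.

class Array:
    def __init__(self, members, spares):
        self.members = members        # active disk labels
        self.spares = spares          # standby disk labels
        self.degraded = False

    def fail_disk(self, disk):
        self.members.remove(disk)
        if self.spares:
            # Promote the first available spare; the rebuild starts here.
            self.members.append(self.spares.pop(0))
        else:
            # No spare left: the array keeps running, but degraded.
            self.degraded = True

array = Array(members=["d1", "d2", "d3", "d4"], spares=["hs1"])
array.fail_disk("d2")
print(array.members)    # ['d1', 'd3', 'd4', 'hs1']
print(array.degraded)   # False
```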

Assuming you go for 4 TB hard drives in your RAID 5 arrays, you'll end up with 112 TB of storage space in 4U of rack space.


Pros:

* Single unit which fulfills server and storage requirements
* Physical Server can host virtual servers, thus reducing your TCO.
* Able to share storage with other clients
* Full control over filesystems
* Full control over file and folder permissions
* Can present storage to other servers using iSCSI
* Can provide SMB 3 Filesystem for Hyper-V 2012 shared storage
* Fast Speed for MB/sec
* Good IOPS (but not necessarily good enough to host 50 virtual machines at once)
* 4 separate spindles means you can give dedicated performance to high demand applications such as SQL Server
* Low cost to implement and maintain.
* Redundant power supplies helps improve resilience


Cons:

* Extremely noisy, not something you'd want to keep under your desk.
* Needs good environment to operate - preferably cool and dust free.
* Single point of failure - If the server crashes, all access to storage is lost.
* Storage is hostage to OS.  Whenever the OS wants to reboot due to updates or other maintenance - the storage reboots with it.

Sample Product: Supermicro SC847E26-R1K28LPB or similar model.

High Performance, High Resilience

For many companies, a NAS and a DAS just won't do it from a performance perspective, or more importantly, a reliability perspective.  This is when we start separating Operating system from Storage, and have the storage work independently from any traditional operating system.

This is when we start moving into the SAN world.

SANs typically come with lower capacity storage units than a DAS does, and in order to get the same quantity of storage as a DAS could offer, you are going to have to buy multiple enclosures.  SANs also won't allow you to put in any old hard drive you can get your hands on - all hard drives have to be vetted by the SAN manufacturer, and are usually bought through the SAN manufacturer.  There are some low-cost SAN manufacturers that don't provide hard drives, allowing you to use anything you like, but they do maintain an HCL (Hardware Compatibility List) to show which hard drives have been tested and approved.

For example, if you buy a Fujitsu SAN, you are going to have to buy the Fujitsu hard drives that are compatible with the model of SAN; otherwise you could void your warranty on the whole SAN itself.  If you buy an Infortrend SAN, you have no choice but to buy 3rd party hard drives, as Infortrend only sell SANs, and not SANs with hard drives in them.

SANs also typically come with two different sizes of hard drive - 2.5" and 3.5".  2.5" hard drives are typically low capacity, high RPM drives, and 3.5" hard drives are typically higher capacity, lower RPM drives.  Depending on your needs, you may end up with two enclosures, one featuring 2.5" hard drives and the second featuring 3.5" drives.  Most organizations "tier" their storage in this fashion, so that fast storage is given to high demand workloads (Virtual Servers, SQL Databases), and slower storage is given to low demand workloads (User Data, Exchange Data).

SANs also typically have Redundant Power Supplies, Redundant Controllers and even Redundant Paths to its storage.  The most cost effective SAN is going to be one based on 1 Gigabit iSCSI, because it uses standard Cat 5e (Preferably Cat 6) cabling.  Please be aware that many "low-end" SANs do not come with the ability to add expansion enclosures, so expanding your SAN would mean buying an additional SAN.  Also, not all SANs will automatically come with redundant controllers.

Our sample model has 24 x 600 GB hard drives running at 15000 RPM each.  Again, we wouldn't configure these all into 1 array, and of course, we want to leave one or two hard drives free as hot spares.  Therefore, we're going to go with the following configuration:

2 x HDDs as hot Spare
2 x RAID 5 arrays with 6 disks in each (6000 GB total)
2 x RAID 5 arrays with 5 disks in each (4800 GB total)

We could of course do the following to give us much faster throughput, at the risk of losing the ability to share the storage with a few hundred servers:

2 x HDDs as hot Spare
1 x RAID 5+0 Array with 12 disks (6000 GB)
1 x RAID 5+0 Array with 10 disks (4800 GB)
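Both layouts work out to the same usable capacity, as a quick check shows (600 GB drives, one disk's worth of parity per underlying RAID 5 array):

```python
# Usable capacity of a RAID 5 array: (disks - 1) x drive size.
def raid5_usable_gb(disks, drive_gb=600):
    return (disks - 1) * drive_gb

# Two 6-disk RAID 5 arrays, striped together as RAID 5+0:
print(2 * raid5_usable_gb(6))   # 6000 GB (matches the 12-disk 5+0 set)

# Two 5-disk RAID 5 arrays, striped together as RAID 5+0:
print(2 * raid5_usable_gb(5))   # 4800 GB (matches the 10-disk 5+0 set)
```

Striping changes the performance profile and spindle count, not the capacity.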

You could also, of course, opt not to have any hot spare disks, and end up with 4 arrays of equal size.  If you do this, I would strongly recommend that you keep a spare hard drive on site in case one of the drives fails.  If you encounter a failed drive, replace it with the spare on-site drive, and the array should rebuild.  Having a hot spare is much preferred, because if one drive fails, the controller will automatically assign the hot spare and use it, whereas manual intervention won't be as fast.


Pros:

* Extremely high performance SAN for both Throughput and IOPS
* Resilient and can cope with various kinds of hardware failure
* You can upgrade controller software without downtime (dual controllers required)
* iSCSI standard allows you to connect any server to it
* Can boot diskless servers directly from SAN (may need iSCSI friendly network controllers)


Cons:

* Needs a cooled room to keep within operational standards
* Expensive to expand - in some cases expansion is not possible
* Limited to network throughput.

Sample Product: Fujitsu Eternus DX60 S2 2.5" model with Dual Controller

High Performance, High Resilience, High Capacity, High Flexibility

It's not unusual to have a need for some of the highest data transfer rates possible when it comes to storage, and also need a high capacity.  In order to achieve this however, you are going to have to implement either multiple SANs, or a multi-purpose SAN.  Fortunately, many of the high end SANs are already multi-purpose, and you buy it and build it up as your demand grows.

This is where we consider going fiber - either using Fiber Channel (8 Gbps) or iSCSI (10 Gbps).  Depending on the kind of fiber used, we can also stretch the datacenter across multiple locations, so that high availability becomes a reality.

Now, don't let the fact that iSCSI can be 10 Gbps fool you into thinking that it's faster than 8 Gbps Fiber Channel - because it isn't.  Fiber Channel still well outperforms iSCSI, because it is a lossless protocol purpose-built for storage, with very little protocol overhead per frame.  iSCSI - because it's encapsulated within TCP/IP packets - carries the overhead of the TCP, IP and iSCSI headers on every packet, so a lot more work has to be done to transmit the same amount of data as Fiber Channel.

Our ultimate SAN build is therefore going to be 6 enclosures, with the first two enclosures hosting 24 x 2.5" hard drives, and the other 4 enclosures hosting 16 x 3.5" hard drives.  The controllers are going to be 8 Gbps Fiber Channel, with 4 ports in each controller, and we're going to have two SAN switches.  Each server is going to have two Fiber Channel interfaces, one for each switch.

Maximum Possible Space:

Tier 1 (Fast) Storage - 24 x 600 GB HDD's per enclosure (28.8 TB, minus parity)
Tier 2 (Slower) Storage - 16 x 3 TB HDD's per enclosure (192 TB, minus parity)


Pros:

* Expandable even further if necessary
* Superfast - can cope with 200,000-250,000 IOPS
* Will accommodate data transfers of around 800 Megabytes per second per physical server
* Will accommodate thousands of concurrent server connections (virtual or physical)
* Highly resilient, can cope with power supply failure, controller failure, disk failure, channel failure
* Highly available - downtime generally not required, even to upgrade firmware of controllers (since it will do one controller at a time)


Cons:
* Not for people with a low budget
* Additional License fees may apply for extra growth or extra features

Sample products: IBM System Storage DS5000, or IBM Storwize v7000

Technologies to help stretch your budget

There are various technologies available to stretch your budget even further, but be aware that these technologies come with a cost - not only a financial one, but a performance cost too.  Not all hardware has these features built in, but you'll be pleased to know that software solutions are also available.

Thin Provisioning

Thin provisioning is the method in which a storage device only uses the amount of storage you're actually using, rather than the amount of storage you've allocated.  As an example, you can have a 200 GB Array, and create a logical volume of 100 GB, thin provisioned.  The server will think it's got 100 GB, and be able to use it all, but the storage will only actually allocate the storage used, rather than the full amount.  Assume the server only uses 10 GB of this - the storage will therefore show 190 GB free for allocation of another logical volume.
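To see thin provisioning in miniature, the sketch below (assuming a Linux filesystem that supports sparse files) creates a file with a large logical size but writes only one block - the filesystem allocates only what is actually written, just as a thin-provisioned volume does.

```python
# Thin provisioning in miniature: a sparse file "allocates" 100 GiB but
# only consumes blocks that are actually written.  Assumes a filesystem
# with sparse file support (most Linux filesystems); behaviour elsewhere
# may differ.
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "thin.img")

with open(path, "wb") as f:
    f.truncate(100 * 1024**3)       # logical size: 100 GiB, no blocks used
    f.seek(0)
    f.write(b"\xff" * 4096)         # actually write one 4 KiB block

st = os.stat(path)
print(f"Logical size : {st.st_size / 1024**3:.0f} GiB")
print(f"Space used   : {st.st_blocks * 512 / 1024:.0f} KiB")  # st_blocks is in 512-byte units
os.remove(path)
```

The gap between the two numbers is exactly the "overselling" a thin-provisioned array relies on - and also exactly how far you can overcommit yourself.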

This is typically a useful feature, because you can give people the disk space they ask for (even though they don't need it all yet), and it will only consume the space they actually use.  I personally use this technology so that we can oversell storage space to customers, and expand as the demand on our storage subsystem grows.

Thin Provisioning is generally available at a storage level and at an OS level.  In fact, Hyper-V dynamic disks are a prime example of thin provisioning.


Pros:
* Can save money on your initial storage bill
* Only takes up the storage space actually used


Cons:
* Can overcommit storage and end up actually running out of storage
* Data written will generally become fragmented across the array

Data Deduplication

This brilliant technology is extremely useful in reducing storage requirements, because it frees up blocks of storage which are identical to each other.  Assume for a moment that you are running 200 virtual servers, all installed with the same operating system (either sysprepped, or installed using the same script).  It would then be safe to assume that those 200 virtual servers have the same data in the same blocks on their respective virtual hard drives.  Data deduplication will figure this out and remove the duplicate blocks, freeing up that space.  It creates a pointer so that whenever a read is requested against a removed block, the read is redirected to the "master copy".  If at any point a redirected block is written to, a new block is written out, leaving the master copy intact.
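The mechanism is easy to sketch in a few lines.  The code below is a minimal, hypothetical model - real deduplication engines add reference counting, hash-collision handling and copy-on-write, all omitted here - but it shows how 200 identical blocks collapse to a single master copy plus pointers.

```python
# Minimal sketch of block-level deduplication: blocks are stored once,
# keyed by a content hash, and duplicates become pointers to the master
# copy.  Not a real implementation - no refcounting or copy-on-write.
import hashlib

BLOCK = 4096
store = {}        # hash -> block data (the "master copies")
volume = []       # one hash per logical block (the pointers)

def write_block(data: bytes) -> None:
    key = hashlib.sha256(data).hexdigest()
    store.setdefault(key, data)     # stored only if not already present
    volume.append(key)

def read_block(index: int) -> bytes:
    return store[volume[index]]     # follow the pointer to the master copy

# 200 identical "OS image" blocks plus one unique block
for _ in range(200):
    write_block(b"A" * BLOCK)
write_block(b"B" * BLOCK)

print(f"Logical blocks: {len(volume)}, physical blocks stored: {len(store)}")
```

201 logical blocks end up as 2 physical blocks - which is exactly the saving the 200-virtual-server scenario above relies on.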

Data deduplication is available on many high-end storage devices.  For those who don't have this option, you'll be pleased to know that Windows Server 2012 now introduces data deduplication for any storage it manages.


Pros:
* Saves storage space by removing duplicate blocks


Cons:
* Could overcommit by accident
* The initial data deduplication process may slow storage down


Compression

Compression isn't new - it's been around for a very long time - it's just that it has never been easy to work with, having to compress and decompress everything every time you need it.  Compression has been available on Windows Servers for a long time, and it works well; consider using it if you can.  It is also becoming a more common feature on storage hardware - as an example, the IBM Storwize v7000 includes it (although you have to pay for a license).  Compression at block level on a storage device is a reality, and it does work.  Depending on the kind of data on your storage, you could save quite a lot of space.  As an example, on a 1.5 TB volume containing Hyper-V guest operating systems, I saved 600 GB by turning compression on.

Unfortunately, on-the-fly compression comes with a horrible price: performance.  If you're going to use on-the-fly compression at block level, be sure that your array is configured in such a way that compression on one logical volume does not impact the performance of other logical volumes on the same array.

Compression is not recommended for high IO demands - so don't turn on block level compression for SQL databases!
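The savings depend entirely on the data, which is easy to demonstrate with a quick experiment.  This uses Python's zlib rather than the block-level algorithm any particular array uses, so treat the ratios as illustrative only.

```python
# Why compression ratios vary with the data: zlib on highly repetitive
# text vs. random bytes.  Block-level storage compression uses different
# algorithms, but the principle is the same.
import os
import zlib

repetitive = b"the quick brown fox " * 50_000   # ~1 MB of repeating text
random_ish = os.urandom(1_000_000)              # ~1 MB of random bytes

results = {}
for name, data in (("repetitive", repetitive), ("random", random_ish)):
    results[name] = 1 - len(zlib.compress(data)) / len(data)
    print(f"{name:10s}: {results[name]:6.1%} space saved")
```

Repetitive data (like rows of near-identical OS images) compresses dramatically; already-random or already-compressed data saves essentially nothing, which is why the 20-70% range quoted below is so wide.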

Pros:
* Save space!  Compression can give you 20-70% of your space back

Cons:
* Performance hit, especially during compression
* Again, you can overcommit and suddenly find yourself out of space

A few "gotchas"

As you can see, there is a lot to consider when it comes to buying storage, and there are many options available to reduce its cost.  Here are a few "gotchas" - mistakes or oversights that are easy to make.

Hot Spares

Having a hot spare in your array is always recommended.  If you don't want to take up valuable space in your array for a hot spare, then ensure you have a disk on site that you can use to replace a failed disk.  The chance of a second disk in an array failing at the same time as the first is low - but not zero.  The sooner you get your array back to a fully operational state, the better.

Thin provisioning within thin provisioning

One of the biggest problems with using Dynamic disks in Hyper-V is that the disks become fragmented.  This is the price we pay for using Dynamic Disks, but fortunately, it's not a huge price.  Be wary, though, of using Dynamic disks on volumes which are already thin provisioned - if you do, you're just compounding the fragmentation, and performance may become an issue.

In-place expansion

It's very possible to end up with a storage platform that cannot be expanded, because it has no ability to expand to additional chassis.  Thankfully however, many storage platforms support in-place expansions, allowing you to switch out smaller hard drives for larger ones.  Effectively, what you are doing is breaking an array, and letting it rebuild on larger disks.  Once all the disks have been replaced, you can extend the array to use the new free space.

If you do this, however, be sure that you are using RAID 5 or better, replace one drive at a time, and ensure that the array is in a consistent state before moving on to the next disk!

SAS and SATA in same chassis

You may have heard that you can put SATA hard drives in a chassis which supports SAS.  This is indeed true, but what you should NOT do is interchange SAS and SATA in the same chain.  For example, if you have two enclosures linked via a SAS cable, one with 24 x 600 GB SAS hard drives and the second with 12 free bays which could take either SAS or SATA hard drives - be sure to use SAS.  Most high-end storage platforms will not allow you to mix them anyway.

Huge Arrays

Don't be tempted to make a huge array.  While technically possible to create a single array containing 24 disks, this would be utterly foolish for the following reasons:

* Rebuild of the array would take an extremely long time.  If you have a failed disk, and need to replace it, it could take weeks to rebuild the array, thus increasing the window in which another disk can fail
* Single spindle - all 24 disks have to operate simultaneously in order to do a read/write operation, so if you have many devices trying to talk to the storage at once, you will see some contention

Try to make arrays no larger than 13 drives.
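A rough rebuild-time estimate makes the point.  The rebuild rates below are assumptions for illustration only - a busy production array may rebuild far more slowly - but they show how bigger disks and I/O contention stretch the window in which a second failure can strike.

```python
# Rough rebuild-time estimate.  The per-disk rebuild rates are assumed
# figures; real rebuilds are slowed further by production I/O, so treat
# these as optimistic lower bounds.
def rebuild_hours(disk_tb: float, rebuild_mb_s: float) -> float:
    """Hours to rewrite one disk at a sustained rebuild rate."""
    return disk_tb * 1_000_000 / rebuild_mb_s / 3600

for disk_tb, rate in ((0.6, 100), (3.0, 100), (3.0, 25)):
    print(f"{disk_tb} TB disk at {rate} MB/s rebuild: "
          f"{rebuild_hours(disk_tb, rate):.0f} hours")
```

A 3 TB disk on a heavily loaded array can easily take more than a day to rebuild, and every extra disk in the array adds contention during that rebuild.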

Dynamic Disks/Software RAID

Some applications require the use of Dynamic Disks - for example, Microsoft's Data Protection Manager.  In cases like this you have no choice but to use Dynamic Disks.  As previously mentioned, Dynamic Disks also offer benefits such as mirroring, parity and striping - the same kinds of technology used in RAID systems - but be aware that this is software driven, and there is no dedicated hardware to do the calculations necessary for optimal performance.  Even though servers get faster every month and CPU speeds keep increasing, don't be tempted to use software RAID.  Always use hardware RAID wherever possible.

Never use Dynamic Disks inside Hyper-V - this is not supported and you will run into issues.


Hopefully by now you will notice that storage isn't quite as simple as apple pie, and that there is a lot to consider when choosing a strategy.  I can't tell you exactly what's good for you, because each person's needs are different, but hopefully you now know which questions to ask yourself and your storage vendor when reviewing potential products.  If at all possible, discuss your needs with a storage expert beforehand, who can give you unbiased advice - because salespeople will try to sell you whatever they can.

If you don't know what to buy - ask Experts Exchange.  We'll help!

Further Reading

The following articles may be of interest for further reading:

RAID (Wikipedia)
Basics of Storage IOPS on RAID
List of storage transfer rates (Wikipedia)
File System (Wikipedia)
Master Boot Record (Wikipedia)
GUID Partition Table (Wikipedia)

Comments (5)


Very very informative.

Thanks for writing this wonderful article.

The quality of writing / material should put your article under Editor's Choice or at least must  be EE-Approved. Let the Page Editors decide.



Thanks for the kind comment :)  I think for one of those categories it requires at least 10 likes, so thanks to you and 2 others we're 30% of the way there :D

I am in the process of writing another article right now - and your additional feedback has encouraged me even more to make it as good as I can.  I expect the new article to be available early next week.


It's strictly the quality of the article that gives these flags. I've written a couple which were EE-Approved without any yes votes.


Yep, go through them, if you have an android device.



Thanks very much - my second E-E article :)

Distinguished Expert 2022

Your "Huge Arrays" section is partly wrong; in a SAN you often have arrays with dozens of disks in them using RAID levels 10, 50 or 60. On some you can choose how many disks there are in the RAID 5/6 set before striping them; others set the parity group size automatically.

All the disks in the array are not used for a read or write operation. With RAID 5 two disks are used for a write - the data and parity disks are read, this is XORed with the data and then the data and parity disks are written again, so it's 4 physical IOPS for a RAID 5 write. The controller's battery backed cache will improve this, it will cache the writes and may be able to avoid a lot of the reads, especially with sequential data.

The advantage of very large arrays is you add the IOPS of all the disks together and since there may be multiple LUNs on the disk set when one server isn't using any IOPS there's more available to the others.

SAS and SATA in the same enclosure is also very common, you just can't mix them in the same array or disk group. There's nothing wrong with it since although the SAS expander in the enclosure talks SATA to the disk it talks SAS upstream to the controller.
