Zac123 asked:

hot swap SATA - designing a cheap storage system

hi all,

i'm designing (on paper) a cheap iscsi storage array.

i'm wondering if it's possible to hot swap a SATA HD from a RAID 5 array without causing major problems. there is no OS on the disks; it's only mass storage.

zac
ASKER CERTIFIED SOLUTION
CompProbSolv
This solution is only available to members.
Zac123 (ASKER):

>provided that the other drives experience NO errors while the array is rebuilding.

thanks. what kind of measures could i put in place to avoid this?
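
As a rough illustration of why errors during a rebuild matter: rebuilding a degraded RAID 5 array means reading every surviving disk end to end, so the chance of hitting at least one unrecoverable read error grows with the size of the array. The Python sketch below estimates that chance. The function name and the default rate of one error per 10^14 bits are assumptions for illustration (check the drive's datasheet for its actual rating), and the model treats bit errors as independent, which real drives do not strictly obey.

```python
import math


def p_read_error_during_rebuild(surviving_disks, disk_bytes, ure_per_bit=1e-14):
    """Estimate the chance of at least one unrecoverable read error (URE)
    while reading every surviving disk in full during a rebuild.

    ure_per_bit is an ASSUMED error rate, not a measured one.
    """
    bits_read = surviving_disks * disk_bytes * 8
    # 1 - (1 - p)^n, computed via log1p/expm1 to stay accurate for tiny p
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))


# Hypothetical example: a 3 x 1 TB RAID 5 array with one failed disk,
# so two surviving disks must be read in full.
risk = p_read_error_during_rebuild(surviving_disks=2, disk_bytes=10**12)
print(f"Estimated chance of a read error during rebuild: {risk:.1%}")
```

Under these assumptions the risk rises quickly with more or larger disks, which is the usual argument for a second parity disk on big arrays.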
SOLUTION
This solution is only available to members.
Zac123 (ASKER):

yes good point. RAID 6 has higher fault tolerance and requires a minimum of 4 disks. I just need to check if my software will support RAID 6!
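
The trade-off between the two levels can be sketched with a little arithmetic. This is a minimal Python sketch, not tied to any particular controller or software; the function name `raid_summary` and the example disk sizes are hypothetical. It relies only on the standard definitions: RAID 5 spends one disk's worth of capacity on parity and needs at least 3 disks, RAID 6 spends two and needs at least 4.

```python
# Parity overhead and minimum member count per RAID level
PARITY_DISKS = {"raid5": 1, "raid6": 2}
MIN_DISKS = {"raid5": 3, "raid6": 4}


def raid_summary(level, n_disks, disk_tb):
    """Return usable capacity and tolerated disk failures for an array
    of n_disks equally sized disks of disk_tb terabytes each."""
    if level not in PARITY_DISKS:
        raise ValueError(f"unsupported level: {level}")
    if n_disks < MIN_DISKS[level]:
        raise ValueError(f"{level} needs at least {MIN_DISKS[level]} disks")
    parity = PARITY_DISKS[level]
    return {
        "usable_tb": (n_disks - parity) * disk_tb,
        "failures_tolerated": parity,
    }


# Hypothetical example: four 1 TB disks under each level
for level in ("raid5", "raid6"):
    print(level, raid_summary(level, n_disks=4, disk_tb=1))
```

With four disks, RAID 6 gives up one more disk of capacity than RAID 5 in exchange for surviving a second failure.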

Check whether your controller lets you add a "hot spare" to your array. So let's say you have RAID 5 with 3 disks. You add a 4th disk and configure it to be a hot spare.
If one of the 3 array disks fails, the controller will immediately start rebuilding to the spare disk.
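
What "rebuilding to the spare" amounts to can be sketched in a few lines of Python. This is only an illustration of the XOR parity idea behind RAID 5, not how any particular controller implements it: real arrays rotate parity across the disks and work on much larger stripes, and the helper name `xor_blocks` and the sample data are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a 3-disk RAID 5 array: two data blocks plus one parity block
d0 = b"iscsi-da"
d1 = b"ta-block"
parity = xor_blocks([d0, d1])

# The disk holding d1 fails. The missing block is recomputed from the
# survivors and written to the hot spare.
rebuilt = xor_blocks([d0, parity])
print(rebuilt == d1)
```

The rebuild has to read every surviving disk in full, which is why an error on another disk during that window is the main risk.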
SOLUTION
This solution is only available to members.
You're going to make a software raid5/6 for iscsi ?
Bye performance ...
Zac123 (ASKER):

no I shouldn’t have said software, i should have been more ambiguous and said 'System'
Back when CPU speed was reported in MHz, software-based RAID6 would have been an issue. With modern CPUs the overhead is a non-issue. I have customers using Sun Solaris SOFTWARE-based RAID and iSCSI that average 200TB worth of disk drives using RAIDZ2, which is the ZFS equivalent of RAID6 on Solaris. All "software-based" RAID.

Performance issue?  No way.  You have cloud companies doing this all day long providing petabytes of storage all on software-based RAID.   Google?  Software-based RAID and solid-state disks.  Ebay?  Software-based RAID.

Now you DO need to get enterprise-class disks, not those cheap $69 consumer desktop drives. (Google actually does use those drives, but they have a heck of a lot of redundancy.)

Personally, all you need to do is run Solaris and the native ZFS, which supports iSCSI targets. You can do online hot expansion, change RAID levels, add disks of ANY size, and throw an SSD into a HOT iSCSI drive at any time if you want extra performance. Hot backup, snapshots, etc. The Solaris O/S with its integrated ZFS file system is architected for such things. That is why Oracle bought Sun.
Zac123 (ASKER):

thanks dlethe,

i'm googling right now...
<Soapbox alert>
P.S. Many of those NAS appliances that scale to 5 or more external enclosures are based on nothing more than a hardened version of Linux and its standard md RAID driver. I know, my company writes the management software. This is inherently more efficient and faster than an internal RAID card because O/S-based RAID knows much more about what is in the I/O queue and can use more RAM to cache I/Os. You actually go to the disk drive much less to read/write data when your host operating system handles the RAID.

The days of custom RAID controller ASICs that do the XOR calculations are gone.  Controller-based RAID is now an embedded O/S running "software" RAID anyway.  Only these external systems don't have intimate knowledge of what is going on in the host O/S so they have to go to the platters much more often.

You're talking milliseconds using PCI-based controllers versus microseconds or even nanoseconds to get data with ZFS, md-raid, or Windows dynamic disks. Plus these mechanisms typically do a better job of load balancing, provided you set up multipathing.

The fastest I/O is the one you DON'T have to do. PCI-based RAID controllers have MB of cache; operating systems have GB of cache, know much more about I/O usage and patterns, and can have a greater queue depth than controller-based RAID. It is impossible for a controller to compete on modern CPUs. I have also written RAID controller firmware, so I could show you the math, but you can probably find material online that supports it. (OK, off soapbox)
Plus ZFS has additional data integrity features.  It always flushes writes to disk so when you write data, it is instantly on the HDDs. It creates additional checksum information, and has variable block size.  The more RAM you throw into the O/S, the more I/O cache you get.  The system RAM is an extension of the controller, so to speak.  Built-in data compression also, and depending on the distribution, built-in dedup (de-duplication).    You can even buy off-the-shelf appliance products based on this, like nexenta, if you want some handholding and/or a simple bootable disk that sets it all up for you.  
@dlethe
Ok, got the point.
But: I've experienced lots of problems using Windows software RAID 1: after a BSOD, the disks weren't synced anymore and the array was constantly resyncing, and that's a performance issue.
Added a PCI-X sata raid controller, 3x1TB raid5, no problems at all.
Had a RHEL database & file backup server, software raid1, performance problems. With a controller there weren't any problems.
I don't have experience with Solaris nor zfs.
Maybe we can say it depends on the case which is best?
Zac123 (ASKER):

ok thanks everyone for their input. i have to refer back to original question when awarding points.

thanks again
zac
First, Windows software RAID is not even in the same class as Solaris or Linux software RAID. Furthermore, I submit you did not have a problem with software RAID so much as a problem with the wrong kind of disks. Consumer-class disk drives are simply not designed to be part of a RAID set. The firmware is not tuned for this type of use. If you did not have enterprise/server-class disks with the TLER feature, then you were destined to have problems.