Avatar of msidnam

asked on

RAID 5 vs RAID 6 using 8 600GB 15K disks

I am going to be installing Sun Solaris 11 and then an application that has a proprietary database. I currently have 8 600GB 15K disks in the server. I used to do a RAID 1 for the OS and then RAID 5 for the data. I would like to be able to make one big virtual disk using RAID 5 or 6. However, I don't know if using all 8 disks will hinder performance.

For the database that will be installed, I think we do more reads than writes, but we also do a ton of writes all day long.
Avatar of David

YUK - no, you do RAIDZ2 with Solaris/ZFS, not RAID6. Then, if you want, you can do a RAID1 on two of the drives. So with 8 disks, a 2-disk RAID1 (mirror) and a 6-disk RAIDZ2 would be ideal.
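For what it's worth, the data side of that layout is a one-liner once the disks are visible to Solaris. A minimal sketch, assuming the six data disks enumerate as c0t2d0 through c0t7d0 and using "datapool" as a placeholder pool name (check format for the real device names):

    # create a double-parity RAIDZ2 pool from the six data disks
    zpool create datapool raidz2 c0t2d0 c0t3d0 c0t4d0 c0t5d0 c0t6d0 c0t7d0
    # confirm the layout and health
    zpool status datapool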
Avatar of msidnam

ASKER

I don't get an option for RAIDZ2 in the controller options.
Never, never, never use hardware RAID controllers with Solaris. You lose all the speed, data integrity, de-duplication, hot snapshots, and flexibility inherent to ZFS. The entire operating system is designed to outperform with software RAID. Why do you think Oracle, of all people, wanted Solaris? That should tell you something about how well the O/S does with databases if you give it the chance.

Read up on ZFS.
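Just to show what you would be giving up, here are a few of those features as commands; the pool/filesystem name datapool/db is only an example:

    # take an instant snapshot of a live filesystem
    zfs snapshot datapool/db@before-change
    # list the snapshots you have
    zfs list -t snapshot
    # enable deduplication on a filesystem (wants plenty of RAM)
    zfs set dedup=on datapool/db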
Avatar of Member_2_231077

What server/controller do you have?
Avatar of msidnam

ASKER

I have an X3-2L server but I am not sure what controller. I think it says MegaRAID, but that could just be a generic name when it boots. Tomorrow when I am in the office I will check.

I will read up on it, but without creating it in hardware, what do I do? Just delete all the DGs and volumes, boot Solaris and install on one disk, and then once it's installed run the RAIDZ configuration and it will create the RAID itself?

Sorry, I don't deal with Solaris much at all. I got thrown into this one.
SOLUTION
Avatar of Member_2_231077
This solution is only available to members.
Avatar of msidnam

ASKER

To be honest, I haven't dealt much with software RAID in over 15 years. Once RAID controller cards came out with their own setup where I could create my RAID levels, I never went back. I think the last time I did a software RAID was when I had the old Promise cards that would connect two or more disks and then you used their software to configure the RAID.

One thing I will need to do is check with the software vendor for the app we are installing to make sure they are compatible with it. They should be, but this vendor is very picky.

So, andyadler, you are saying just do a mirror for the OS and then use RAIDZ2 for the rest of the disks?

I guess now I need to go read a bunch of documents. Oh, what fun on my birthday.
The problem with a HW RAID controller under ZFS is the flush-on-write. Write performance will be horrible, especially on a RAID6: every single write, even a 512-byte I/O, will require every block to be flushed to disk. ZFS does this to ensure data integrity, but the difference is that ZFS doesn't rewrite data in place. Instead of updating a block that changes, it writes the new block somewhere else, and then once that write has been flushed, it flags the old data as free.

Now, using the RAID1 on the MegaRAID for boot only isn't all that bad, because you aren't really supposed to use the boot drive for storage pools. Most people get two low-end, low-capacity disks for the boot volume in Solaris.

I think that system lets you turn the RAID off on selected disks. If so, that is what you want to do. The reason it even has a MegaRAID is that they position this system as more of a Linux/VMware server, not a Solaris ZFS server.
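If the controller does let you pass the disks straight through, the boot mirror can also be done in ZFS itself after installing to a single disk. A rough sketch, assuming the default root pool name rpool, the install disk at c0t0d0 and the spare boot disk at c0t1d0 (placeholder names; older releases may want the s0 slice names instead, and may need the boot blocks installed on the new disk):

    # attach a second disk to the root pool, turning it into a ZFS mirror
    zpool attach rpool c0t0d0 c0t1d0
    # wait for the resilver to finish before trusting the mirror
    zpool status rpool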
Avatar of msidnam

ASKER

In the GUI for the controller card I am able to select disks and then add them to a DG. Should I just add two to a RAID 1 and then use the other six for a RAIDZ2 storage pool?
ASKER CERTIFIED SOLUTION
This solution is only available to members.
>The problem with a HW RAID controller on ZFS is that it does a flush-on-write.

The controller doesn't know what filesystem is in use; it will flush eventually, of course, but only when it feels like it, since it has a battery-backed write cache. Mind you, for the rip-off price that LSI charge for batteries, you're probably better off selling the cache module and buying a couple more disks instead. £250 ($400) for a freaking RAID controller battery; if it were a big battery the price would make sense, but it's less than 1000 mAh.
ZFS does things differently. Read up on the internals. This is a nice little summary:
http://everycity.co.uk/blog/2013/11/raid-hardware-software-zfs/

One of the other things about ZFS is that it does not do or want pre-caching. If it needs 37 blocks, it asks for 37 blocks; if it needs to write 7 blocks, that is what it writes. But a HARDWARE RAID controller won't write 7 blocks: it has to do writes per the stripe size, so it might be forced to write 64 blocks at a time. That also means it has to read 64 blocks at a time, from all the disks, to calculate the XOR. A hardware RAID controller ends up doing much more I/O than ZFS.

As ZFS never rewrites data in place, the controller's cache is pretty useless.
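If you want to see what ZFS is actually issuing to the disks rather than guessing, zpool iostat shows per-vdev operations and bandwidth (pool name is a placeholder):

    # per-vdev read/write ops and bandwidth, sampled every 5 seconds
    zpool iostat -v datapool 5

Dividing bandwidth by operations gives a rough average I/O size, which is one way to check this sort of claim on your own hardware.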
>But a HARDWARE RAID controller won't write 7 blocks, it has to do writes per the stripe size,

Proof please. I've heard that claim multiple times, but the only verification has been hearsay. Performance graphs don't show hardware RAID controllers suddenly slowing down for writes that are smaller than the stripe element size, which would be the case if the statement were true.
Read this - go to page 7. It talks about full-stripe writes, which is what most controllers will do; certainly enough to prove that controllers do this, and why controllers that do a read/modify/write are better. [But not if you are running ZFS, because ZFS doesn't do read-modify-write, by design.]

But Solaris does not do read/modify/writes. It writes new data elsewhere, so all that extra pre-fetch is wasted and hurts performance.
 
http://www.xyratex.com/sites/default/files/Xyratex_White_Paper_RAID_Chunk_Size_1-0.pdf
P.S. ZFS uses variable-sized blocks of up to 1MB. As you can imagine, hardware RAID controllers don't handle that well. I should also emphasize that the I/O size isn't necessarily a power of two either; you can read/write a prime number's worth of blocks, whatever is needed at the time.
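The upper limit on the block size is the per-filesystem recordsize property (128K has long been the default; whether 1M is available depends on the Solaris release), and it only applies to newly written data. For a database it is often tuned down to match the engine's I/O size; the filesystem name below is just an example:

    # show the current maximum block size for the filesystem
    zfs get recordsize datapool/db
    # match it to the database block size, e.g. 8K for many engines
    zfs set recordsize=8k datapool/db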


Also, if a block can be compressed (and compression is enabled at the file system level), ZFS will write a smaller I/O, giving better performance. When you mirror disks in ZFS, the two disks don't necessarily have the same data at the same place. This is also done for better performance (load balancing) and for better data integrity and reliability.
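Compression is also just a per-filesystem property and is cheap to try; the dataset name again is only an example:

    # turn on the default compression algorithm
    zfs set compression=on datapool/db
    # see how much space it is actually saving
    zfs get compressratio datapool/db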

Such things are counter-intuitive to skilled sysadmins, but they make sense. These are additional reasons why RAID controllers need to be put in non-RAID mode and made as stupid as possible to work well with ZFS. (The ZIL and the cache are other subjects for explanation, but I hope I proved my point.)
That's certainly not proof; in fact it states that at least their controller does not write a full stripe or even a full stripe element (chunk). They spelt 'areal' wrong on the first page, and the description of read-modify-write is incorrect, although the drawing is correct.

I know ZFS is good (it's basically the same idea as WAFL), but that doesn't mean RAID controllers can't write less than a chunk/strip/stripe element. If they couldn't, then RAID 10 performance would suddenly drop when writing less than a chunk, as they would have to read the chunk before writing it to keep the rest of the chunk consistent with what was there before. With RAID 5 and 6 that doesn't really matter, since they have to read the data chunk first anyway, but with RAID 10 it would be a big performance hit that just isn't there in any graph I've ever seen.