What is an optimal RAID stripe size and ext3 stride size for a RAID 1 disk array?

We operate a web photo server (Dell 1950 with an MD1000 and a PERC 5/E card). The OS is CentOS 5 x64.

The photo server runs Apache. Photos are stored on a RAID 1 disk array that does 90% reads, 10% writes.
Each photo has 3 sizes: thumbnail 6-8K, medium 16K-25K, large 40K-100K. Thumbnails are accessed far more frequently than the other sizes.

There are really 3 questions. What is an ideal stripe size for the RAID array? What ext3 stride size should be used to format the array so that it is optimized for the stripe size?

Finally, is there a Linux disk benchmark tool that can disable all disk, controller and OS level caching and then randomly read files from the disk and output the timings?

I would turn off the write-back cache. It's risky, since data isn't written directly to disk. That may not matter if your pics are static, though.

If you can split the thumbs, medium and large into different arrays it would be a good idea to use a stripe element size that's just bigger than a file, so 8K for the thumbs, 32K for the medium and 128K for the large, then a single I/O can retrieve a whole file. I think your controller allows multiple arrays on a single container of disks. You can set it all to 128K though, and set the stride size different on 3 different ext3 partitions.
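As a rough illustration of the "just bigger than a file" rule, here is a minimal Python sketch. The list of candidate element sizes is an assumption (check what your controller actually offers), and the largest file size per class is taken from the upper end of the ranges given in the question.

```python
# Assumed candidate stripe element sizes in KB; a real controller may
# offer a different set, so treat this tuple as hypothetical.
CANDIDATES_KB = (8, 16, 32, 64, 128)


def smallest_element_size_kb(largest_file_kb, candidates_kb=CANDIDATES_KB):
    """Return the smallest candidate element size (KB) that can hold the file."""
    for size in sorted(candidates_kb):
        if size >= largest_file_kb:
            return size
    # Nothing is big enough: fall back to the largest size available.
    return max(candidates_kb)


# Upper end of each size range from the question, in KB.
photo_classes = {"thumbnail": 8, "medium": 25, "large": 100}

suggested = {name: smallest_element_size_kb(kb) for name, kb in photo_classes.items()}
print(suggested)
```

This only picks a size. Whether a file really sits inside one stripe element still depends on how the filesystem lays it out on disk.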

"The drive calculation works like this: You take the number of disks and multiply it by the chunk size of the raid array. This gives you your stripe size. Then you take the stripe size, and divide it by the number of blocks in the filesystem. This gives you the stride value to use when formating the volume. This can be a little complex, so some examples are listed below. "

It's not only complex, it's nonsense. The number of blocks in the file system is in the millions; from the examples you can see they are dividing by 4K, which is the block size, not the number of blocks.

"If it was 4 disk RAID0 array, than it would be 64(4x64k/4k=64). If it was 4 disk RAID10 array, than it would be 32 ((4/2)*64k/4k=32)" makes more sense,

2 disk RAID 1[0] array, 16K chunks gives ((2/2)*16K/4K)=4.

So the stride size should be the number of 4K blocks in the stripe element size.

Why can't they just say that, instead of multiplying the stripe element size by the number of disks to get what they refer to as stripe size and then dividing it by the number of disks again? It just complicates things. You simply divide the stripe element size by 4KB, so for a 16K element size you get a stride of 4.
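The rule above is simple enough to sketch in a few lines of Python. The 4K block size comes from the discussion. The partition name /dev/sdb1 is only an assumption for illustration (the controller reports the virtual disk as /dev/sdb), and the printed mke2fs line is a suggestion to check against your own man page rather than a tested command.

```python
def ext3_stride(element_size_kb, block_size_kb=4):
    """Stride = number of filesystem blocks in one stripe element."""
    if element_size_kb % block_size_kb != 0:
        raise ValueError("element size must be a multiple of the block size")
    return element_size_kb // block_size_kb


stride = ext3_stride(16)  # 16K stripe element, 4K blocks

# Hypothetical partition name; adjust to the real device before use.
command = f"mke2fs -j -b 4096 -E stride={stride} /dev/sdb1"
print(command)
```

Running the function with a 64K element size gives 16, the same division applied to a larger element.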

With NTFS you set the cluster (allocation unit) size to match the stripe size, or a binary fraction of it, since at least one cluster is retrieved in its entirety for a file read, so there's only a single I/O. It's the same for ext3 (and every other filesystem), except that NTFS refers to a cluster by its size and ext3 refers to a stride as a number of 4K blocks.

http://storageadvisors.adaptec.com/2006/06/05/picking-the-right-stripe-size is worth a read.
Use a smaller stripe size if you want to save some space.
Use a larger stripe size if you want to increase performance, but you will waste some space.

Your files are kind of small with only 10% of writes. I would go with 16k or 32k stripe size to get a good balance of power and economy.
burnsj2Author Commented:
Thanks, that helps some. Can you expand on the ext3 block size and ext3 stride size and how they relate to the RAID stripe size?

According to the CentOS site http://wiki.centos.org/HowTos/Disk_Optimization, that relationship is pretty important for performance, and complex.

For a 16K stripe, what would the optimal ext3 block and ext3 stride sizes be?

Beware: what the link above refers to as stripe size is what most manufacturers refer to as stripe width, and what it refers to as chunk size is what manufacturers refer to as stripe size.
Ya, Andy... I was a little confused after that link.

I would use smaller chunks as stripe sizes but that "format thing" is new to me :-)

burnsj2Author Commented:
I'm glad I'm not the only one confused by this. The PERC card reports this for my virtual disk:
Controller PERC 5/E Adapter (Not Available)
ID                  : 0
Status              : Ok
Name                : ext00
State               : Ready
Progress            : Not Applicable
Layout              : RAID-1
Size                : 136.13 GB (146163105792 bytes)
Device Name         : /dev/sdb
Type                : SAS
Read Policy         : Adaptive Read Ahead
Write Policy        : Write Back
Cache Policy        : Not Applicable
Stripe Element Size : 16 KB

With a stripe element size of 16 KB, what block and stride should I set during the ext3 format?

How much memory does the server have?  How many unique file reads do you have?  How much memory is typically used for file cache (amount reported as "cached" in top)?

Linux caches as many files in memory as possible. Although read performance is important, if you have enough memory and you do not have a lot of unique reads, it's not that important.

I have seen systems with 2GB of RAM that were using 1.5GB for file cache. Based on your file sizes, that is a lot of files that you would read once from disk and never have to read from disk again. The more RAM you have, the more files you can cache and the less real I/O you need to do.
burnsj2Author Commented:
It has 2GB of RAM, and the OS is only using about 256MB, so the rest is available for file cache. We can add RAM as needed, but right now I'm trying to make sure I configure my RAID arrays optimally. The array of photos is 200GB and growing daily. RAM caching alone will never be a sufficient solution.
burnsj2Author Commented:
Additionally, there are 4 million photos, each with 3 versions (thumbnail, medium and large), so the total number of files is around 12 million.
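To put rough numbers on why caching alone won't cover this, here is a back-of-the-envelope Python sketch using the figures from this thread. It assumes every byte of RAM not used by the OS goes to the file cache and that all files are equally likely to be requested. Neither holds exactly (thumbnails are requested far more often), so treat the result as an order-of-magnitude estimate only.

```python
GB = 1024 ** 3
MB = 1024 ** 2

ram_bytes = 2 * GB           # installed RAM
os_bytes = 256 * MB          # roughly what the OS itself uses
data_bytes = 200 * GB        # current size of the photo array
file_count = 12_000_000      # 4 million photos x 3 versions

cache_bytes = ram_bytes - os_bytes          # assumed upper bound on file cache
cached_fraction = cache_bytes / data_bytes  # share of the data that fits in RAM

# Integer arithmetic avoids float rounding in the file count.
cacheable_files = cache_bytes * file_count // data_bytes

# Even the thumbnails alone exceed RAM: 4 million files at 6-8K each.
thumb_low_gb = 4_000_000 * 6 * 1024 / GB
thumb_high_gb = 4_000_000 * 8 * 1024 / GB

print(f"cacheable share of data: {cached_fraction:.3%}")
print(f"files that fit at the average size: {cacheable_files}")
print(f"thumbnails alone: {thumb_low_gb:.1f} to {thumb_high_gb:.1f} GB")
```

Under these assumptions, under 1% of the data fits in the cache at any one time, so most reads still have to be served from disk.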
Question has a verified solution.