centos - large data LVM questions

Posted on 2014-10-08
Last Modified: 2014-11-17
I have to put together a storage system for 20TB - 50TB data with a CentOS head server.

Can someone explain the best way to provision and configure the storage?

The data will be almost exclusively read-only, with growth of 3-5TB/yr.

I will have a SAS-connected SAN with either 1TB or 2TB disks.  The SAN has a max of 16 disks in a RAID5 (or RAID6) array, so I assume I'll need more than 1 array at the storage level.

This means several LUNs presented to the OS.

Assuming I use the 1TB disks, and create 3 different RAID 5 arrays, I'll have 45TB of usable space.

Do I create smaller LUNs to present to the OS, and then use LVM to combine them into a single large storage pool?

Does the entire Volume Group max out at 16TB, or is each LV limited to 16TB?
Question by:snowdog_2112
LVL 34

Expert Comment

by:Seth Simmons
ID: 40369965
...I'll have 45TB of usable space

How do you figure?
If that SAN supports up to 16 disks, you will only have some 14TB usable with 1TB drives, about 28TB with 2TB drives, and a bit less of both with RAID 6.

Does the entire Volume Group max out at 16TB?

Depends on which version and which file system.
RHEL/CentOS 6 has a 16TB file system limit for ext3/4 (due to the version of the e2fsprogs package).
You can format XFS beyond 16TB, which should be very good for mostly static data.
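
For anyone wondering where a 16TB figure can come from, here is a rough sketch of the arithmetic. It assumes 32-bit block numbers and a 4 KiB block size, which is the usual explanation for this limit; both values are assumptions on my part and are not stated in this thread. Note that the limit applies to the file system, not to the LUNs underneath it.

```python
# Assumed values (not from the thread): 32-bit block numbers, 4 KiB blocks.
BLOCK_NUMBER_BITS = 32
BLOCK_SIZE_BYTES = 4096

# Largest file system addressable under those assumptions.
max_fs_bytes = (2 ** BLOCK_NUMBER_BITS) * BLOCK_SIZE_BYTES
max_fs_tib = max_fs_bytes / 2 ** 40

# Sizes from the question, in decimal terabytes as disk vendors count them.
one_lun_bytes = 15 * 10 ** 12    # a single 16-disk RAID5 LUN of 1TB disks
pooled_bytes = 45 * 10 ** 12     # three such LUNs pooled together

print(f"limit under these assumptions: {max_fs_tib:.0f} TiB")
print("one 15TB LUN fits:", one_lun_bytes <= max_fs_bytes)
print("45TB pooled volume fits:", pooled_bytes <= max_fs_bytes)
```

So a single 15TB LUN would stay under such a limit, but one file system across all three would not.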

the more you break up your raid groups, the less usable space you will have since you will accumulate more parity drives
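
The parity trade-off can be checked with a few lines of arithmetic. This is a minimal sketch that only counts parity drives (one per RAID5 array, two per RAID6 array) and ignores hot spares and file system overhead; the function names are made up for illustration.

```python
def raid_usable_tb(disks: int, disk_tb: float, level: int) -> float:
    """Usable capacity of one RAID5/RAID6 array, counting parity drives only."""
    parity = {5: 1, 6: 2}[level]
    if disks <= parity:
        raise ValueError("array needs more disks than parity drives")
    return (disks - parity) * disk_tb


def tb_to_tib(tb: float) -> float:
    """Convert decimal terabytes (10**12 bytes) to binary tebibytes (2**40 bytes)."""
    return tb * 10 ** 12 / 2 ** 40


# 48 x 1TB disks, arranged in different ways.
three_raid5 = 3 * raid_usable_tb(16, 1, 5)   # three 16-disk RAID5 arrays
three_raid6 = 3 * raid_usable_tb(16, 1, 6)   # three 16-disk RAID6 arrays
six_raid5 = 6 * raid_usable_tb(8, 1, 5)      # six 8-disk RAID5 arrays

print("3 x 16-disk RAID5:", three_raid5, "TB")
print("3 x 16-disk RAID6:", three_raid6, "TB")
print("6 x  8-disk RAID5:", six_raid5, "TB")
print("one 16-disk RAID5 array in TiB:", round(tb_to_tib(raid_usable_tb(16, 1, 5)), 1))
```

The same 48 disks give 45TB as three RAID5 arrays but only 42TB as three RAID6 arrays or as six smaller RAID5 arrays. The TiB conversion shows that the OS will report noticeably less than the decimal figure on the disk label.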

Author Comment

ID: 40370533
Thanks for the response - let me clarify my questions:

The san supports 16 disks in a *single* raid5 array, not 16 disks total.  I will need expansion shelves to accommodate  48 (or more with hot spares), but can configure 3 arrays of 16 disks.

I'm more concerned with how to address the disk space from the OS.

Assuming I have 45TB of usable disk space - how do I best configure that in the OS?

Will I need several logical volumes smaller than 16TB, and have to split the data across them?
LVL 61

Expert Comment

ID: 40372305
It is best not to allocate everything at once. You can add a LUN sized at, say, your data +50%, then extend it as/if needed.
Why don't you get something like FreeNAS, with a de-duplicating filesystem and NFS support, so you get the maximum kick from your storage?

Author Comment

ID: 40375920
Same question: it doesn't matter *when* I break the 16TB barrier.  I *will* break it.

The initial data seed will be >16TB (didn't mention that in the OP, sorry), so I need more than 16TB right away.

The question is *HOW* do I extend it past 16TB, not as/if I will need to.


Assisted Solution

pitoren earned 500 total points
ID: 40376073
Your question is not entirely clear.

As long as you are on 64 bit CPU and recent OS/software versions you are not going to hit lvm limits.

If you had 3x (16x1TB disks in RAID5), you could create 3x 15TB PVs, create a single VG, and a single LV. Then you need a file system, and XFS would be OK.

I would personally prefer 8 disks per RAID5 LUN, and multiple dual-port HBAs for both multipathing and load balancing. For this to work best, you need to check how your SAN storage supports LUN failover.

And none of this will work that great if you have zillions of tiny files.  You need files of a decent size.
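
The PV, VG, LV and XFS layout described above could look roughly like the following. This is a dry-run sketch only: it builds the command lines and prints them without executing anything. The multipath device names, `vg_data` and `lv_data` are hypothetical placeholders, so check the real device names on your system before running anything like this.

```python
from typing import List


def lvm_xfs_plan(pvs: List[str], vg: str, lv: str) -> List[List[str]]:
    """Build (but do not run) commands for one LV spanning all PVs, formatted as XFS."""
    cmds = [["pvcreate", pv] for pv in pvs]                    # initialise each LUN as a PV
    cmds.append(["vgcreate", vg, *pvs])                        # one VG over all PVs
    cmds.append(["lvcreate", "-l", "100%FREE", "-n", lv, vg])  # one LV using all free extents
    cmds.append(["mkfs.xfs", f"/dev/{vg}/{lv}"])               # XFS on the new LV
    return cmds


# Hypothetical multipath device names for the three 15TB LUNs.
luns = ["/dev/mapper/mpatha", "/dev/mapper/mpathb", "/dev/mapper/mpathc"]
plan = lvm_xfs_plan(luns, "vg_data", "lv_data")

for cmd in plan:
    print(" ".join(cmd))
```

Printing the plan first makes it easy to review the device names before any of it is run as root.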

LVL 61

Expert Comment

ID: 40376335
CentOS 7 defaults to XFS, which solves all your concerns.


Author Comment

ID: 40406876
Thanks for the replies - you may have answered my question without knowing.

In short - I need *more than 16TB* usable space in a single "directory" - which may contain a single DB larger than 16TB or a "zillion" smaller files.

pitoren: I can create 3 PVs of 15TB (i.e. the storage presents the LUNs to the OS), then a single volume group from the PVs, and a single Logical Volume of 45TB, but I need to format the logical volume with XFS.
(The storage is a dual-controller SAN with multipathing to the OS, and the SAN supports up to 16 disks per RAID-5 LUN.)

gheist: you're saying if my OS is CentOS 7,  it will default to XFS for such a LV.

Please confirm - I'll break out points
LVL 61

Expert Comment

ID: 40407894
CentOS7 defaults to XFS for any install. You can still choose ext2 if you think filesystem log is a virtue, or ext4 to stay in stone age.

Accepted Solution

pitoren earned 500 total points
ID: 40408065

I think gheist got his ext2 and ext4 mixed up there.

In RHEL7 the 16TB limitation has been removed for ext4, so you can use xfs or ext4 on RHEL7.

On RHEL6 your only option is xfs.

I'd strongly suggest going with xfs. Red Hat supports it up to 500 TB or something. In RHEL7 it's their default filesystem.

Be real careful if you have a single directory with zillions of small files: every time a file is created or deleted, the directory inode has to be updated. There's a limit to how fast things can go in that scenario.
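
One common way around the huge-single-directory problem (not something proposed in this thread, just a widely used technique) is to spread files over hashed subdirectories so that no single directory grows too large. A minimal sketch, with the function name, the `/data` root and the two-level layout all chosen purely for illustration:

```python
import hashlib
from pathlib import PurePosixPath


def sharded_path(root: str, filename: str, levels: int = 2, width: int = 2) -> PurePosixPath:
    """Map a file name to root/<xx>/<yy>/filename using a hash of the name."""
    digest = hashlib.sha1(filename.encode("utf-8")).hexdigest()
    # Take `levels` chunks of `width` hex characters as nested directory names.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(root, *parts, filename)


# Hypothetical file name; the same name always maps to the same location.
example = sharded_path("/data", "scan_000001.tif")
print(example)
```

With two levels of two hex characters there are 65,536 leaf directories, so even a very large file count is spread thinly. Whether the application can cope with such a layout is a separate question.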

good luck
LVL 61

Expert Comment

ID: 40408069
He wants to exceed 50TB, so RHEL7 EXT4 will not suffice.

Expert Comment

ID: 40408100
gheist: That's not written in the post, which starts with

"I have to put together a storage system for 20TB - 50TB data with a CentOS head server."

and in another post the OP suggests he'll use 3x15TB LUNs.

But as I said, I'd suggest using xfs, with the caveats written. You can obviously grow both xfs and ext4, to their respective limits, if more storage becomes available or is needed.
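
Growing later, when a new LUN is presented, would follow roughly these steps. Again this is a dry-run sketch that only prints commands; the device name, VG/LV names and mount point are hypothetical, and it assumes the file system sits on an LVM logical volume as discussed earlier in the thread.

```python
from typing import List


def grow_plan(new_pv: str, vg: str, lv: str, mountpoint: str, fs: str = "xfs") -> List[List[str]]:
    """Build (but do not run) commands to add a PV to the VG and grow the LV and file system."""
    lv_path = f"/dev/{vg}/{lv}"
    steps = [
        ["pvcreate", new_pv],                       # initialise the new LUN as a PV
        ["vgextend", vg, new_pv],                   # add it to the existing VG
        ["lvextend", "-l", "+100%FREE", lv_path],   # grow the LV into the new space
    ]
    if fs == "xfs":
        steps.append(["xfs_growfs", mountpoint])    # XFS is grown via its mount point
    elif fs == "ext4":
        steps.append(["resize2fs", lv_path])        # ext4 is grown via the device
    else:
        raise ValueError(f"unsupported file system: {fs}")
    return steps


# Hypothetical names for illustration only.
steps = grow_plan("/dev/mapper/mpathd", "vg_data", "lv_data", "/srv/data")
for cmd in steps:
    print(" ".join(cmd))
```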

And think about how you are going to back the data up.  (If you think "I don't need backups" I don't want to be in your shoes if something goes wrong).

LVL 61

Expert Comment

ID: 40408168
Anyway, they need to partition the data. Think about data locality, map-reduce and all that stuff.

Author Closing Comment

ID: 40448068
Thanks for the replies - sorry for my delays.  This has been a back-burner/need-it-now/back-burner case.

