centos - large data LVM questions

Posted on 2014-10-08
Last Modified: 2014-11-17
I have to put together a storage system for 20TB - 50TB data with a CentOS head server.

Can someone explain the best way to provision and configure the storage?

The data will be almost exclusively read-only, with growth of 3-5TB/yr.

I will have a SAS-connected SAN with either 1TB or 2TB disks.  The SAN has a max of 16 disks in a RAID5 (or RAID6) array, so I assume I'll need more than 1 array at the storage level.

This means several LUNs presented to the OS.

Assuming I use the 1TB disks, and create 3 different RAID 5 arrays, I'll have 45TB of usable space.

Do I create smaller LUNs to present to the OS, and use LVM to create a single large storage pool?

Does the entire Volume Group max out at 16TB, or is each LV limited to 16TB?
Question by:snowdog_2112
LVL 35

Expert Comment

by:Seth Simmons
ID: 40369965
...I'll have 45TB of usable space

how do you figure?
if that san supports up to 16 disks, you will only have some 14tb usable with 1tb drives; about 28tb with 2tb drives and a bit less of both if raid 6

Does the entire Volume Group max out at 16TB?

depends on what version and what file system
RHEL/CentOS 6 has a 16tb file system limit for ext3/4 (due to version of e2fsprogs package)
you can format xfs beyond 16tb which should be very good for mostly static data
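
As a rough illustration of where a 16TB figure like this comes from: it matches what you get from 32-bit block numbers combined with a 4 KiB block size. The block size below is an assumed common default, not something stated in this thread.

```python
# Sketch: size ceiling of a filesystem that addresses blocks with
# 32-bit block numbers. The 4 KiB block size is an assumption.
BLOCK_SIZE_BYTES = 4096
MAX_BLOCKS = 2 ** 32
TIB = 2 ** 40


def max_fs_bytes(block_size=BLOCK_SIZE_BYTES, max_blocks=MAX_BLOCKS):
    """Largest addressable filesystem size in bytes."""
    return block_size * max_blocks


if __name__ == "__main__":
    print(max_fs_bytes() / TIB, "TiB")
```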

the more you break up your raid groups, the less usable space you will have since you will accumulate more parity drives
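
A minimal sketch of that parity arithmetic, assuming 48 disks of 1TB for illustration. It ignores hot spares, formatting overhead and the TB/TiB difference, so real numbers will come out somewhat lower.

```python
# Sketch: usable capacity when a fixed number of disks is split into
# equally sized RAID5 or RAID6 arrays. Disk counts are illustrative.
PARITY_DISKS = {"raid5": 1, "raid6": 2}


def usable_tb(total_disks, disks_per_array, disk_tb, level="raid5"):
    """Usable capacity in TB across all arrays."""
    if total_disks % disks_per_array != 0:
        raise ValueError("total_disks must be a multiple of disks_per_array")
    arrays = total_disks // disks_per_array
    data_disks = disks_per_array - PARITY_DISKS[level]
    return arrays * data_disks * disk_tb


if __name__ == "__main__":
    for per_array in (16, 8):
        for level in ("raid5", "raid6"):
            print(per_array, level, usable_tb(48, per_array, 1, level))
```

With these assumptions, three 16-disk RAID5 arrays give 45TB, while six 8-disk RAID5 arrays give 42TB from the same 48 disks.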

Author Comment

ID: 40370533
Thanks for the response - let me clarify my questions:

The san supports 16 disks in a *single* raid5 array, not 16 disks total.  I will need expansion shelves to accommodate  48 (or more with hot spares), but can configure 3 arrays of 16 disks.

I'm more concerned with how to address the disk space from the OS.

Assuming I have 45TB of usable disk space - how do I best configure that in the OS?

Will I need several LVs smaller than 16TB and break up the data?
LVL 62

Expert Comment

ID: 40372305
Best not to allocate everything at once. You can add a LUN sized at, say, your data +50%, then extend it as and if needed.
Why don't you get something like FreeNAS, with a de-duplicating filesystem and NFS support, so you get the maximum out of your storage?

Author Comment

ID: 40375920
Same question: it doesn't matter *when* I break the 16TB barrier.  I *will* break it.

The initial data seed will be >16TB (didn't mention in OP, sorry), so I need more than 16TB right away.

The question is - *HOW* do I extend it past 16TB, not AS/IF I will need to.


Assisted Solution

pitoren earned 500 total points
ID: 40376073
Your question is not entirely clear.

As long as you are on 64 bit CPU and recent OS/software versions you are not going to hit lvm limits.

If you had

3x  (16x1TB disks in RAID5), you could create 3x 15TB PVs, create a single VG, and a single LV. Then you need a file system and XFS  would be OK.
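
The steps above can be sketched roughly as follows. This only builds and prints the command lines, it does not run them. The device paths and the names `vg_data` and `lv_data` are hypothetical placeholders, so substitute whatever your multipath setup actually presents.

```python
# Sketch: build (not execute) the LVM commands for 3 PVs -> 1 VG -> 1 LV -> XFS.
# Device paths and VG/LV names are made-up placeholders.
import shlex


def provision_commands(pv_devices, vg_name="vg_data", lv_name="lv_data"):
    """Return the command lines as lists of arguments, in order."""
    if not pv_devices:
        raise ValueError("need at least one device")
    cmds = [["pvcreate", dev] for dev in pv_devices]
    cmds.append(["vgcreate", vg_name, *pv_devices])
    # One LV spanning all free extents in the volume group.
    cmds.append(["lvcreate", "-l", "100%FREE", "-n", lv_name, vg_name])
    cmds.append(["mkfs.xfs", f"/dev/{vg_name}/{lv_name}"])
    return cmds


if __name__ == "__main__":
    luns = ["/dev/mapper/mpatha", "/dev/mapper/mpathb", "/dev/mapper/mpathc"]
    for cmd in provision_commands(luns):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Printing first and pasting by hand is deliberate here: these commands are destructive, so it is worth reviewing them against the real device list before running anything.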

I would personally prefer 8 disks per RAID5 LUN, and multiple dual-port HBAs for both multipathing and load balancing. For this to work best, you need to check how your SAN storage supports LUN failover.

And none of this will work that great if you have zillions of tiny files.  You need files of a decent size.

LVL 62

Expert Comment

ID: 40376335
CentOS 7 will default to XFS, that solves all your concerns.

Author Comment

ID: 40406876
Thanks for the replies - you may have answered my question without knowing.

In short - I need *more than 16TB* usable space in a single "directory" - which may contain a single DB larger than 16TB or a "zillion" smaller files.

pitoren: I can create 3 PV's (i.e. the storage presents the LUN's to the OS) of 15TB, then a single volume group from the PV's, and a single Logical Volume of 45TB, but I need to format the partition with XFS.
(the storage is a dual-controller SAN with multi-pathing to the OS, and the SAN supports up to 16 disks per RAID-5 LUN).

gheist: you're saying if my OS is CentOS 7,  it will default to XFS for such a LV.

Please confirm - I'll break out points
LVL 62

Expert Comment

ID: 40407894
CentOS7 defaults to XFS for any install. You can still choose ext2 if you think filesystem log is a virtue, or ext4 to stay in stone age.

Accepted Solution

pitoren earned 500 total points
ID: 40408065

I think gheist got his ext2 and ext4 mixed up there.

In RHEL7 the 16TB limitation has been removed for ext4, so you can use xfs or ext4 on RHEL7.

On RHEL6 your only option is xfs.

I'd strongly suggest going with xfs - Red Hat supports it to 500 TB or something. In RHEL7 it's their default filesystem.

Be real careful if you have a single directory with zillions of small files - every time a file is created or deleted the directory inode has to be updated. There's a limit how fast things can go in that scenario.
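
One common way to avoid a single huge directory is to spread files over hashed subdirectories. This is only a sketch of that idea, not something prescribed in this thread. The function name, the `/data` root and the two-level, two-character layout are all assumptions.

```python
# Sketch: map a filename to a sharded path so that no single directory
# ends up holding every file. Layout parameters are illustrative.
import hashlib
from pathlib import PurePosixPath


def sharded_path(root, filename, levels=2, width=2):
    """Return root/<aa>/<bb>/filename, where aa and bb come from a hash."""
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(root, *parts, filename)


if __name__ == "__main__":
    print(sharded_path("/data", "report-000001.dat"))
```

With two levels of two hex characters, files are spread over 256 x 256 directories, and the same filename always maps to the same path.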

good luck
LVL 62

Expert Comment

ID: 40408069
He wants to exceed 50TB, so RHEL7 EXT4 will not suffice.

Expert Comment

ID: 40408100
gheist: That's not written in the post, which starts with

"I have to put together a storage system for 20TB - 50TB data with a CentOS head server."

and in another post the OP suggests he'll use 3x15TB LUNs.

But as I said, I'd suggest using xfs, with the caveats written. You can obviously grow both xfs and ext4, up to their respective limits, if more storage becomes available or is needed.
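
A rough sketch of what growing would look like when a new LUN is added. Again this only builds the command lines. The device path, VG/LV names and mount point are hypothetical.

```python
# Sketch: build (not execute) the commands to add a new LUN to an existing
# volume group and grow the filesystem. All names are placeholders.
def grow_commands(new_pv, vg_name, lv_name, mount_point, fs="xfs"):
    """Return the command lines as lists of arguments, in order."""
    lv_path = f"/dev/{vg_name}/{lv_name}"
    cmds = [
        ["pvcreate", new_pv],
        ["vgextend", vg_name, new_pv],
        ["lvextend", "-l", "+100%FREE", lv_path],
    ]
    if fs == "xfs":
        # xfs_growfs works on the mounted filesystem.
        cmds.append(["xfs_growfs", mount_point])
    elif fs == "ext4":
        cmds.append(["resize2fs", lv_path])
    else:
        raise ValueError(f"unsupported filesystem: {fs}")
    return cmds


if __name__ == "__main__":
    for cmd in grow_commands("/dev/mapper/mpathd", "vg_data", "lv_data", "/srv/data"):
        print(" ".join(cmd))
```

Keep in mind that xfs can be grown but not shrunk, so it is safer to extend in steps than to over-allocate.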

And think about how you are going to back the data up.  (If you think "I don't need backups" I don't want to be in your shoes if something goes wrong).

LVL 62

Expert Comment

ID: 40408168
Anyway they need to partition data.. About data locality, map-reduce and all that stuff.

Author Closing Comment

ID: 40448068
Thanks for the replies - sorry for my delays.  This has been a back-burner/need-it-now/back-burner case.



Question has a verified solution.
