Solved

How to best partition a 250GB drive for CentOS 5.3 x86_64?

Posted on 2009-07-08
Last Modified: 2013-12-15
Hello,

I'm relatively new to Linux, and I need help setting up a workstation that will be used to run VMware Workstation for lab scenarios and testing. The relevant hardware config is as follows: (1) 250GB and (4) 500GB drives and 16GB RAM. I've been reading about LVM and it sounds like something that I'd want to use on the 250GB drive, so here's how I was thinking of partitioning this drive:

/boot     512MB
/tmp      2048MB
swap     32000MB
/var       15000MB
/usr       20000MB
/home   80000MB
** approx. 100GB avail free space **
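
As a quick sanity check of the figures above, here is a small Python sketch that totals the proposed partitions. The drive size of 250,000 MB is an assumption (the real usable capacity will differ slightly), and the names `PLAN_MB`, `allocated_mb` and `free_mb` are made up for illustration.

```python
# Sizes in MB, copied from the proposed layout above.
PLAN_MB = {
    "/boot": 512,
    "/tmp": 2048,
    "swap": 32000,
    "/var": 15000,
    "/usr": 20000,
    "/home": 80000,
}

DRIVE_MB = 250_000  # assumption: a "250GB" drive counted as 250,000 MB


def allocated_mb(plan):
    """Total space claimed by all partitions in the plan."""
    return sum(plan.values())


def free_mb(plan, drive_mb=DRIVE_MB):
    """Space left unallocated on the drive."""
    return drive_mb - allocated_mb(plan)


if __name__ == "__main__":
    print(f"allocated: {allocated_mb(PLAN_MB)} MB")
    print(f"free:      {free_mb(PLAN_MB)} MB")
```

Under that assumption the plan uses about 150GB, which agrees with the roughly 100GB left free.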

Again, I'm very new to Linux so any feedback/suggestions on the sizing of the above partitions would be appreciated. As for the (4) 500GB drives, I'm debating between RAID 0 or RAID 5. This will be my storage for all of my virtual machines. I'm leaning more toward RAID 5 since I don't have alternate storage to back up my VMs if I went with RAID 0. What filesystem should I use for this RAID?

Thank you
Question by:bndit
 
LVL 3

Accepted Solution

by:
samdart earned 88 total points
ID: 24807425
RAID 5 is better performing and is immune against single drive failure. It will give you 500GB x 3 = 1.5TB of space for the VM's.

With RAID 0 you might get a little better performance, but a single disk failure will wipe out all of your data.

When you come to partitioning your drive for the OS, the general rule of thumb for the swap partition is 2xRAM (so if you have 1GB RAM, set the swap partition size to 2GB). Do you really need a 32GB swap partition? The rest looks okay.

From this link to FileSystem benchmarks http://www.debian-administration.org/articles/388

XFS appears to be the most appropriate filesystem to install on a file server for home or small-business needs :

    * It uses the maximum capacity of your server hard disk(s)
    * It is the quickest FS to create, mount and unmount
    * It is the quickest FS for operations on large files (>500MB)
    * This FS gets a good second place for operations on a large number of small to moderate-size files and directories
    * It constitutes a good CPU vs time compromise for large directory listing or file search
    * It is not the least CPU demanding FS but its use of system resources is quite acceptable for older generation hardware
 
LVL 2

Author Comment

by:bndit
ID: 24807534
Thanks for the quick reply. As for the 32GB swap partition...I don't know...following that general rule...I'm assuming that the swap partition should be 32GB since I have 16GB RAM...now, do I need it? Probably not, so what size should I set it to? 16GB? less? more? Again, this is not a production server of any kind or even a server for that matter...it's merely a beefy workstation for lab scenarios and test environments.

As for the XFS I'm assuming you're suggesting that for my 1.5TB partition, correct? If so, here's a newbie question...do I format it during setup or can I format it later after CentOS has been installed?

thanks again
 
LVL 20

Assisted Solution

by:Daniel McAllister
Daniel McAllister earned 262 total points
ID: 24808186
In my experience, setting a separate "boot" partition makes good sense, but all those "old" *NIX books were designed back in the day when drives were small and much more prone to failure....

My systems (and I administer more than 30 Linux servers -- from RedHat 7.3 to RedHat 9, RHEL, CentOS, and more!) have a "significant" root partition, a swap partition (twice the size of my RAM), and the remainder is set on the /home partition...

In your case, with 16GB of RAM, it is HIGHLY unlikely that you'll need all that swap -- so a swap space = to your RAM size should suffice...

Next, while you've mentioned your hard drives (1x250 & 4x500), you haven't mentioned anything about RAID (which would be used to protect your data from corruption or loss in the event of a drive failure)... IMHO, RAID5 is a waste -- drive space is CHEAP these days, and the performance hit you take on RAID5 just isn't worth it. (If you don't know ANYTHING about RAID levels, look here:
   http://www.intel.com/support/motherboards/server/sb/cs-010763.htm )

My BEST preference is RAID1 (or RAID10 for very large storage!)... in your case, with the 4x500 drives, RAID10 would give you 1TB of "usable" and "protected" storage. If you need to know how to configure the system for Linux software RAID, look here:
   http://tldp.org/HOWTO/Software-RAID-HOWTO.html
BUT -- I generally prefer HARDWARE RAID to software RAID -- assuming SATA drives, I'm fond of 3Ware (aka: AMCC) RAID cards (personal preference).

So, if I were to assume that you'll BOOT from the 250GB drive, and store data onto a 1TB RAID10 array that consists of all 4 500GB drives, (and changing my measurements to GB) here's what I would do...

/          16GB (on the 250GB drive)
swap   16GB (on the 250GB drive)
/home  1000GB (on the RAID array of 500GB drives)

as for the remaining space on the 250GB, you might want to make it a /backup or other partition -- perhaps to backup the truly "can't live without" files of your business.

Now, should you choose to format your 4 drives as RAID0 (and I HIGHLY RECOMMEND **NOT** TO DO THAT) you could place a 2TB filesystem on it... however, you should note that I am of the belief that 2TB is the filesystem size limit for ext3... but that's another discussion! If it was me, I'd format the / partition as EXT3, but the /home partition as XFS... but that's just me!

Should be enough to generate some discussion... how to layout your filesystems is every bit as "controversial" as Apple vs. Microsoft, or the choice of Linux distribution to start with! Everyone has their own opinion...

This is mine!

Dan
IT4SOHO
 
LVL 20

Assisted Solution

by:Daniel McAllister
Daniel McAllister earned 262 total points
ID: 24808587
I took my time writing the above, and so missed the other posts -- but there is an important question in the followup -- WHEN to format your "large" partition.

Before I do that, let me say that my most "common" Linux servers use RAID1 (not RAID-10) on 250GB drives... I install / onto partition 1 at 10GB, an 8GB swap (I only use 4GB of RAM), and the remainder is /home. I use a 3ware SATA RAID card and set the "twin" 250GB drives into a RAID1 configuration BEFORE I install CentOS.

Your situation is different in 2 ways:
1) your "boot" disk won't be in a RAID configuration, so you don't need to worry about RAID setup within the CentOS installer, and
2) you're apparently NOT going to be using hardware RAID, so you can choose to setup the 4x500GB array any way you like.

PLEASE READ THE PERFORMANCE DATA ON RAID LEVELS before installing a RAID5 array! As I noted above, RAID5 was "popular" when storage was expensive, and RAID5 has significant write issues.

--- Discussion of RAID5 vs RAID1 --- (Skip if you desire)

RAID1 is a "mirror" -- when data is written to a "sector" of the "RAID drive" (meaning the "logical" drive that has 2 physical parts), the data has to be written twice (once on each device). Clearly, data can be read from either device, so theoretically you can get some read performance gains -- but in truth, those gains are ONLY made with hardware RAID -- the Linux software RAID will generally read from one or the other drive and write to both (although performance gaining algorithms have been added).

RAID5 is a "stripe with parity" -- each "sector" of data is "spread out" across all-but-1 of the drives in the array, with the 1 not used being a "parity" drive. Parity in this instance works nearly exactly like parity we used to use in dial-up modems...
  using BINARY arithmetic:  0 + 1 + 0 + 1 = 0
So, RAID5 arrays must ALSO write to 2 different drives for each write: the DATA drive & the PARITY drive. (NOTE: different "sectors" will have different physical hard drives as the parity drive -- if all sectors have the same physical parity drive, you're really using RAID3 or RAID4 -- and there is a significant "bottleneck" on the parity drive.)
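
To make the parity idea concrete, here is a minimal Python sketch. The "binary arithmetic" above is a bitwise XOR, so the parity block is the XOR of the data blocks. The two-byte "sectors" and the names `parity`, `d1`, `d2`, `d3` are invented for illustration; a real array works on much larger chunks and does this inside the controller or the kernel.

```python
from functools import reduce
from operator import xor


def parity(blocks):
    """XOR equal-length blocks together, byte by byte, to get the parity block."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


# The one-bit example from the post: 0 + 1 + 0 + 1 = 0 in binary arithmetic.
assert reduce(xor, [0, 1, 0, 1]) == 0

# Hypothetical two-byte "sectors" on the three data drives of a 4-drive array.
d1 = bytes([0b10101010, 0x0F])
d2 = bytes([0b11001100, 0xF0])
d3 = bytes([0b11110000, 0xFF])

p = parity([d1, d2, d3])
print(p.hex())
```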

There is a potential read performance gain in RAID5, as you can read from all of the drives simultaneously -- but this only really materializes on reading large files. Smaller file I/O loses this gain and drops to effectively JBOD (non-optimized disk) performance levels.

But wait... let's also look at RAID5 in the write mode -- When I write a sector of a drive, I have to update the data drive AND the parity drive... I already pointed that out... BUT what do you write to the parity drive??? It turns out, you have to know what was on the data drive AND the parity drive FIRST, before you can CALCULATE the data to put on the parity drive....
  the equation turns out to be:  Pnew = Pold - Dold + Dnew (where P is the data on the parity drive's sector & D is the data on the data drive's sector).
Fortunately, it turns out that hard drive hardware does a read of the sector anyway, so in hardware RAID5, there isn't much of a delay in the computation -- but in software RAID5, there can be a significant delay when writing "new" data. (Updating old data is faster, because you already knew the Dold value).
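
The update rule can be tried out in a few lines of Python. This is only a sketch under the assumption that "+" and "-" in the equation are both bitwise XOR (which is how parity arithmetic behaves); the block contents and the helper names `xor_blocks` and `updated_parity` are hypothetical.

```python
def xor_blocks(*blocks):
    """XOR any number of equal-length byte blocks together."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def updated_parity(p_old, d_old, d_new):
    """Pnew = Pold - Dold + Dnew, with + and - both implemented as XOR."""
    return xor_blocks(p_old, d_old, d_new)


# Hypothetical stripe on a 4-drive array: three data blocks plus parity.
d1, d2, d3 = b"\x12\x34", b"\xab\xcd", b"\x0f\xf0"
p_old = xor_blocks(d1, d2, d3)

# Overwrite the block on drive 2 and patch the parity without reading d1 or d3.
d2_new = b"\x00\xff"
p_new = updated_parity(p_old, d2, d2_new)

# The shortcut gives the same parity as recomputing it from the whole stripe.
assert p_new == xor_blocks(d1, d2_new, d3)
```

The point of the shortcut is that only the old data block and the old parity block need to be read, no matter how many drives are in the array.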

OK... so RAID1 is generally as fast or faster than RAID5 for all of the reasons above. The complaint against RAID1 is that it "eats" space -- whereas your 4-drive RAID5 array could store 1.5TB of space, the same drives in RAID1 (mirrored pairs, as in RAID10) will hold only 1TB of space. (It gets even worse with larger arrays... an array of 6 500GB drives yields 2.5TB of data storage in RAID5, and only 1.5TB in RAID1.)
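
The capacity figures quoted in this thread follow from simple formulas, sketched below in Python. The function name `usable_gb` and the level labels are made up for illustration, and "raid10" here means mirrored pairs, which is how the 4-drive and 6-drive mirror figures above are obtained.

```python
def usable_gb(level, drives, size_gb):
    """Usable capacity of an array of identical drives, ignoring overhead."""
    if level == "raid0":
        return drives * size_gb            # pure striping, no redundancy
    if level == "raid5":
        if drives < 3:
            raise ValueError("RAID5 needs at least 3 drives")
        return (drives - 1) * size_gb      # one drive's worth of parity
    if level == "raid10":
        if drives < 2 or drives % 2:
            raise ValueError("mirrored pairs need an even number of drives")
        return (drives // 2) * size_gb     # every block is stored twice
    raise ValueError(f"unknown RAID level: {level}")


if __name__ == "__main__":
    for n in (4, 6):
        for level in ("raid0", "raid5", "raid10"):
            print(f"{n} x 500GB {level}: {usable_gb(level, n, 500)} GB")
```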

However, before you dismiss RAID1, let's look at RAID performance when (not if, when!) there is a single drive failure:
 In RAID1, I just read from the GOOD drive -- no harm done, but I lose my ability to spread out my reads across the 2 drives. I effectively fall back to JBOD performance levels.
 In RAID5, I have to calculate the value for any missing data using all of the data from that sector on all of the other drives, including the parity drive. So, if drive 3 is the failed drive, anything that WAS stored on D3 now has to be computed. Assuming a 4-drive array:
  D3 = P - D1 - D2
That's 3 reads (P, D1, and D2) AND a computation to reconstruct the data on D3, and that is a MAJOR performance hit! (And it's even messier in a write to data that should be stored on the bad drive.)
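
Here is a small Python sketch of that degraded-mode read, again assuming the parity arithmetic is bitwise XOR. The block values and the names `xor_blocks` and `rebuild_missing` are hypothetical; the sketch only shows that the missing block falls out of the surviving blocks P, D1 and D2.

```python
def xor_blocks(*blocks):
    """XOR any number of equal-length byte blocks together."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def rebuild_missing(surviving_blocks):
    """Recompute the block of the failed drive from every surviving block."""
    return xor_blocks(*surviving_blocks)


# Hypothetical healthy stripe on a 4-drive array.
d1, d2, d3 = b"\x12\x34", b"\xab\xcd", b"\x0f\xf0"
p = xor_blocks(d1, d2, d3)

# Drive 3 fails: D3 = P - D1 - D2, where "-" is XOR.
recovered = rebuild_missing([p, d1, d2])
assert recovered == d3
```

Every read that lands on the failed drive costs a read on each surviving drive plus the XOR, which is where the performance hit comes from.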

For a decent (if detail filled) report on hardware vs. software RAID in Linux, read here:
  http://linux.com/news/hardware/servers/8222-benchmarking-hardware-raid-vs-linux-kernel-software-raid
For a decent report on RAID1 vs RAID5 performance, read here:
 http://blogs.sun.com/mrbenchmark/entry/raid_1_vs_raid_56
NOTE: This is comparing HARDWARE RAID1 & HARDWARE RAID5, and you're jumping to part 6 in a 7-part report.

Finally, since you're storing "user data" on /home, and there won't be any users installed during the CentOS installation, you can choose to configure your RAID array (whatever type you choose) after the installation is done.

I hope this answers your questions!

Dan
IT4SOHO
 
LVL 2

Author Comment

by:bndit
ID: 24808593
Thanks Dan. Your reply contains a lot of useful information (info overload! - jk). I definitely have a hardware card for the 4 x 500GB drives, and as I stated in my original question I was debating between RAID 0 and RAID 5. I'm familiar with RAID setups (Windows), so you definitely bring a good point with RAID 10; especially if I won't have anything backing up the virtual machines that will be stored on this partition. I understand what you mean about the relevance of *old* *NIX literature...so 16GB for the swap partition it is. I have some questions about your suggested partitions and their sizes:

1) In my original question I forgot to include the / , which I would've given it 4GB...what's the reason behind giving it 16GB instead? I was under the impression that this partition didn't need to be large (>10GB).
2) You're suggesting to take the entire partition (1000GB) for /home (RAID 10), which is fine with me, but what are your thoughts around LVM for this and the partitions on the 250GB drive? Not worth the effort? Necessary? Overkill? I've read a little on LVM and people seem to have mixed feelings about it...on the one hand it seems *convenient* if you need to expand a partition without having to blow up your entire disk to repartition...while on the other hand some people claim that it's harder to recover data should the drive fail.

Finally, do you have any good links that explain the different file systems available? EXT3 for / and XFS for /home seem good choices, but I'd like to dive a bit more and understand the pros/cons of each.

Thanks again
 
LVL 20

Assisted Solution

by:Daniel McAllister
Daniel McAllister earned 262 total points
ID: 24809375
In your original config, you were allocating separate filesystems for /, /boot, /var, /tmp, and /usr... remember that in *NIX, a "mounted" filesystem just "overlays" a directory (or folder)... as a result, all the data you were going to put in /boot, /var, /tmp, and /usr will now (since those directories are no longer "mount points" for separate filesystems) be stored in the / filesystem. Thus, the need for extra space.
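
The "overlay" behaviour can be pictured as a longest-prefix match: a path belongs to the filesystem mounted at the deepest mount point above it. The Python sketch below is only a model of that rule (it does not query the running system), and the function name `owning_mount` is made up for illustration.

```python
import posixpath


def owning_mount(path, mount_points):
    """Return the mount point whose filesystem would hold `path`.

    Assumes "/" is always mounted, so it is the fallback answer.
    """
    path = posixpath.normpath(path)
    best = "/"
    for mp in mount_points:
        mp = posixpath.normpath(mp)
        inside = path == mp or path.startswith(mp.rstrip("/") + "/")
        if inside and len(mp) > len(best):
            best = mp
    return best


# Layout from the original question: /var is its own filesystem.
original = ["/", "/boot", "/tmp", "/var", "/usr", "/home"]
# Simplified layout: only / and /home are separate filesystems.
simplified = ["/", "/home"]

print(owning_mount("/var/log/messages", original))
print(owning_mount("/var/log/messages", simplified))
```

With the simplified layout the log file lands on /, which is why / needs the extra room.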

There are some proponents of putting EVERYTHING on / (thus, partitioning a single-drive system into just 2 partitions: / and swap) -- and that could work for you (partition swap to 16GB, and allocate ALL of the rest to /)... I would not have a problem with that, given that you're going to be using such a large RAID array for your "user" data (/home).

Will you use anywhere close to even 16GB on the / filesystem?? Hopefully not -- and in fact, your biggest "risk" will be that the /var directory will get overgrown -- which is the reason for the "overgrown" / partition to begin with.... if you write 16GB of log files and you never catch it in your "routine" maintenance, then you DESERVE to have a system error! (Seriously, 16GB of log files would be ASTRONOMICALLY huge -- even if you check your system logs only MONTHLY (I check mine daily), you'll catch it long before it becomes a problem.)

Let's understand the difference between LVM and RAID:
 LVM (Logical Volume Management) is designed to allow you to "span" (or "stripe") data on different physical media -- REGARDLESS of the performance or physical nature of those media. Say you have a collection of IDE, SATA, & SCSI devices -- with LVM, you can treat them all as a single "logical" drive. The other "common" use of LVM is to create "on-the-fly" RAID1 arrays -- primarily for backup purposes (mirror the drive to another drive using LVM, then "break" the mirror & remove the drive for use as a backup). Another (more common) use of LVM is to create a RAID array using different-sized disks... because LVM combines multiple "partitions" into "logical volumes", not multiple DISKS.
 RAID, as a technology, works on the DISK level, not the partition level. Thus, with RAID, you generally work with multiple disks of the same size & geometry (usually accomplished by using drives that are the same model & manufacturer!).

So, with LVM, you could partition your 4 500GB drives into 5 equal partitions of 100GB, then make several RAID5 arrays with multiple partitions from the different drives (NOTE: It would be FOOLISH to use more than 1 partition from the same hard drive in one of these arrays!)
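
The partition-based layout described above can be sketched in Python to check the "one partition per drive" rule. The device names (`sdb` to `sde`) are hypothetical, and so are the helper names; the sketch only builds the grouping on paper and does not create anything on disk.

```python
DRIVES = ["sdb", "sdc", "sdd", "sde"]  # hypothetical names for the 4 x 500GB drives
PARTS_PER_DRIVE = 5
PART_GB = 100


def build_arrays(drives, parts_per_drive):
    """Array k is made of partition k from every drive."""
    return [[(drive, n) for drive in drives]
            for n in range(1, parts_per_drive + 1)]


def one_partition_per_drive(array):
    """True if no physical drive contributes more than one partition."""
    return len({drive for drive, _ in array}) == len(array)


def raid5_usable_gb(array, part_gb):
    """RAID5 gives up one member's worth of space to parity."""
    return (len(array) - 1) * part_gb


arrays = build_arrays(DRIVES, PARTS_PER_DRIVE)
assert all(one_partition_per_drive(a) for a in arrays)

total = sum(raid5_usable_gb(a, PART_GB) for a in arrays)
print(f"{len(arrays)} arrays, {total} GB usable in total")
```

An array built from two partitions of the same drive would fail the check, and in practice it would lose two members at once when that drive dies.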

With RAID, you select the RAID level you want for each DRIVE to participate in (in fact, you could add a 5th 500GB drive to most hardware arrays and configure it to be a "spare" for your RAID, regardless of RAID 1, 5, or 10).

As for the "specs" for the different filesystems, try this for data overload!:
 http://en.wikipedia.org/wiki/Comparison_of_file_systems

For an ext3 vs. XFS discussion, read here:
  https://lists.ubuntu.com/archives/ubuntu-us-ut/2008-March/000833.html

And now, I'm going to eat my dinner!

Dan
IT4SOHO

 
LVL 2

Author Comment

by:bndit
ID: 24828353
You weren't kidding about the info overload Dan...but thanks a lot, I really appreciate all of your input, which has been very helpful. However, I'm at a roadblock right now...samdart above had suggested that XFS filesystem would be best fit for file servers. I'm at the Disk Druid window, but there's no XFS option and I've been searching Google and it seems that CentOS/RHEL does not support XFS...at least CentOS 5.3 doesn't appear to do...so 1) Could you verify that this is the case? 2) is EXT3 an alternative for XFS?

Thanks.
 
LVL 2

Author Comment

by:bndit
ID: 24828366
...forgot to ask...what other Linux distros support XFS? and which ones do you recommend? I was hoping to stick to RHEL/CentOS/Fedora...but I'm open to other ones. Thx
 
LVL 2

Author Closing Comment

by:bndit
ID: 31601212
Thanks