RHEL 5.2 very slow on RAID 5 and RAID 6

Posted on 2008-10-24
Medium Priority
Last Modified: 2012-05-05

I have a production server running RHEL 5.2 (Linux 2.6.18-92.el5 #1 SMP Tue Apr 29 13:16:12 EDT 2008 i686 i686 i386 GNU/Linux) with a LOCAL disk array of 15k SAS drives in RAID 5, plus SAN storage on the same kind of 15k SAS drives in RAID 6.
The SAN is connected via an FC HBA adapter, 4 MBps speed.
The machine is an industry-standard server with 8 GB ECC RAM, 2 x quad-core Xeon processors, a real hardware RAID controller, etc., which makes it quite a beast.

Now, we are running only a single application on it, which uses Java on the Tomcat platform, and all 8 CPU cores are mostly just scratching the floor: under 1% average CPU utilisation.

Partitions are all configured with LVM, so that they can be expanded when needed.

And the PROBLEM?
Here: when I run "du", for example, on the local RAID 5 partition (600 GB), it takes 2 hours for the command to complete! While "du" is running, the CPUs are almost idle, under 1%; only disk I/O is running at full load. Running "du" on the SAN partition (RAID 6) is a bit faster, but it still takes 45 minutes to finish scanning 400 GB of files.
Also, when the single application running on this server tries to reindex all the files on the RAID 5 and RAID 6 arrays to update the info for all files, it takes 2 days or more, and all services are disabled during that time, while other users of the software report this same task finishing within a few hours, not days!

I then ran a disk benchmark, and it shows file copy speeds of almost 200 MB/s, which is great.

So I am lost between fast, mostly idle CPUs and very fast disk arrays on one side, and very poor, frankly unacceptable performance of disk-intensive operations on the other.
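
One way to narrow this down is to separate per-file metadata cost from sequential throughput: a file-copy benchmark streams large blocks, while "du" issues one stat call per entry and reads no file data. The sketch below is a rough stand-in for what "du" does, not a replacement for it. It assumes a Python 3 interpreter is available (the stock interpreter on RHEL 5 is older), and the function name `time_metadata_walk` is made up for illustration.

```python
import os
import time


def time_metadata_walk(root):
    """Walk `root` and lstat every entry, reading no file contents.

    Returns (entries, apparent_bytes, elapsed_seconds). Sizes are apparent
    sizes (st_size), not the allocated blocks that du reports by default.
    """
    entries = 0
    apparent_bytes = 0
    start = time.monotonic()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            entries += 1
            apparent_bytes += st.st_size
    elapsed = time.monotonic() - start
    return entries, apparent_bytes, elapsed
```

Dividing `entries` by `elapsed` for the RAID 5 mount point gives an entries-per-second figure. If that number is low while the copy benchmark still shows about 200 MB/s, the time is going into small random metadata reads rather than into raw transfer speed.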

I am looking for ideas on how to find the bottleneck in the system. Suggestions welcome.
Question by:Andrej Pirman
LVL 80

Expert Comment

ID: 22798835
The only time you seem to be encountering a bottleneck of any kind is when you run du, which, as you said, is very disk I/O intensive.
How frequently do you need to run du?

Do other users of this application process the same volume of data, i.e. 1 terabyte?  Are the RAID partitions healthy (no failed drives)?
Do the others have the same quantity of files?
Is the reindexing mechanism optimally configured, or even identically configured, among the various users of the software?
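
To make the "same quantity of files" comparison concrete, each administrator could collect a small profile of their data set. The following is a minimal sketch, assuming Python 3; the bucket boundaries and the names `BUCKETS`, `bucket_for` and `profile_tree` are arbitrary choices for illustration, not part of any tool mentioned in this thread.

```python
import os
from collections import Counter

# Hypothetical size buckets in bytes; adjust to suit the data set.
BUCKETS = [(4 * 1024, "<=4K"), (64 * 1024, "<=64K"), (1024 * 1024, "<=1M")]


def bucket_for(size):
    """Return the label of the first bucket whose limit is >= size."""
    for limit, label in BUCKETS:
        if size <= limit:
            return label
    return ">1M"


def profile_tree(root):
    """Return (file_count, dir_count, Counter of size buckets) under root.

    Every non-directory entry reported by os.walk counts as a file.
    """
    files = 0
    dirs = 0
    sizes = Counter()
    for dirpath, dirnames, filenames in os.walk(root):
        dirs += len(dirnames)
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # unreadable or vanished entry
            files += 1
            sizes[bucket_for(st.st_size)] += 1
    return files, dirs, sizes
```

Two installations holding the same number of gigabytes can differ by orders of magnitude in file count, and a tree dominated by the smallest bucket will take far longer to scan or reindex than the raw size suggests.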
LVL 18

Author Comment

by:Andrej Pirman
ID: 22802105

I actually do not need to run "du" at any time, because there are other methods of determining actual disk usage. "du" was just a test to determine disk speed and to illustrate the slow I/O in my system.

Regarding the application's internal database reindexing, I assume the process is kinda optimised, since other administrators, who host many more clients than I do, do NOT report a reindexing process lasting anywhere near as long; most of them report reindexing finishing in a couple of hours, not days. Even weirder, most of the others do not have such a good server, which points to the conclusion that there is a configuration mistake or some other bottleneck in my system.
LVL 80

Accepted Solution

arnold earned 2000 total points
ID: 22803470
You seem to be working from a premise rather than from a qualitative analysis.
A reindex process can take two months or longer, but as long as the recent and just-added documents are indexed first, no one will notice how long the reindexing takes, which is where the "kinda optimised" mechanism comes into play. An added mechanism could be cached queries, which could expedite the indexing by including those documents in the reindex mechanism.

The question is: how much time passes between the initiation of the reindex and the return of some of the functionality?

Regarding defining your server as better than the others: as you've noted, the bottleneck is in the disk I/O.
So a server with dual single-core Xeon 2.8GHz processors and a similar RAID configuration will likely run the same as yours.
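
Whether the CPUs are truly idle or are waiting on disk can be checked from the kernel's own counters. The sketch below assumes Python 3 and a Linux /proc/stat whose aggregate "cpu" line lists user, nice, system, idle, iowait, irq, softirq and steal in that order; the function names are made up for illustration.

```python
import time


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into integer counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in fields[1:]]


def iowait_fraction(before, after):
    """Share of CPU time spent in iowait between two samples.

    Only the first eight counters (user .. steal) are summed, because on
    newer kernels the later guest counters are already included in user/nice.
    """
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    return deltas[4] / total if total else 0.0  # index 4 is iowait


def sample_iowait(interval=5.0, path="/proc/stat"):
    """Take two samples `interval` seconds apart and return the iowait share."""
    with open(path) as f:
        before = parse_cpu_line(f.readline())
    time.sleep(interval)
    with open(path) as f:
        after = parse_cpu_line(f.readline())
    return iowait_fraction(before, after)
```

Run `sample_iowait()` while du or the reindex is in progress. Keep in mind that on an 8-core machine a single process blocked on disk can only account for roughly one eighth of the aggregate, so a modest iowait share next to near-zero user time would still be consistent with an I/O-bound job.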

Do the others perhaps have a RAID 10 setup?  Maybe they configured their setup slightly differently, i.e. instead of having two huge partitions (400GB and 600GB), they separated the content into more designated partitions: instead of a single 600 GB partition they have 5 x 120GB, with each partition defined to hold a specific set of content.

Do they also use LVM on their RAID allocated partitions?

What is the application that is being referenced here?


