Unusual Disk Read From Primary Disks

Posted on 2003-10-31
Medium Priority
Last Modified: 2013-12-15

We have a server running Red Hat Linux 9
Kernel: 2.4.20-20.9bigmem #1

It has an external RAID
and two internal drives in RAID-0 (sharing the same cable).

All our data and applications are mounted on the external RAID.

The primary drives hold the root file system only. To be more precise, here is the output of df:

Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/md2              34336980   1965536  30627192   7% /
/dev/md0               2063440     65396   1893228   4% /boot
none                   3096472         0   3096472   0% /dev/shm
tiny:/home            75854568  27651104  44350280  39% /ftp
//fax/doze            77199360  50599936  26599424  66% /doze
/dev/Volume00/lvol1  203147960  64257220 128404980  34% /appl
/dev/Volume00/tmp     10485436    113016  10372420   2% /tmp
/dev/Volume00/proedi  10485436    867228   9618208   9% /appl/proedi
                      52427200  34969832  17457368  67% /appl/backups/sys3
                      36298264   3120520  31333888  10% /appl/backups/sys4b

Our usual load average is between 2 and 4. Now and then it peaks at 10, but it usually comes back down within minutes.

A couple of weeks ago we started having a small problem. Every 7 days or so (which is another interesting factor) the disk reads from the primary disks increase significantly, causing I/O wait for all the other web applications we run off our web server (shell scripts and such). We have yet to find what exactly is causing all these reads, and why it always happens 6-7 days after a reboot. When it happens we have no choice but to reboot the server, since the read rate on dev3-0 and dev3-1 goes up to 200 to 300 blks/s and everything else hangs.

dev3-0 and dev3-1 are our primary drives. They are IDE drives on the same IDE cable.

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
dev3-0            4.50        15.21        53.90      35564     126016
dev3-1            4.51        16.27        53.90      38032     126016
dev8-0          131.01      1928.25       937.88    4507854    2192568
dev8-1          131.19      2000.76       937.88    4677356    2192568

dev8 is the external RAID.
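
To catch the spike as it starts, one option is to filter the iostat output for the primary devices only. This is a minimal sketch, assuming `iostat -d` output in the six-column layout shown above. The function name `busy_readers` and the 200 blks/s threshold are illustrative choices, not part of the original setup.

```shell
#!/bin/sh
# Print devices from `iostat -d`-style input whose name starts with a given
# prefix and whose Blk_read/s (column 3) exceeds a limit.
# Hypothetical usage:  iostat -d | busy_readers dev3- 200
busy_readers() {
    prefix="$1"
    limit="$2"
    awk -v prefix="$prefix" -v limit="$limit" '
        # Data rows have six columns; the header row fails the numeric check.
        NF == 6 && index($1, prefix) == 1 && $3 ~ /^[0-9.]+$/ && $3 + 0 > limit + 0 {
            print $1, $3
        }
    '
}
```

The prefix argument matters here because the external RAID (dev8) is always well above any threshold that would be interesting for the primaries.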

I hope I have given enough information to visualize the scenario; if not, I am more than willing to provide more. If anyone out there has encountered a problem like this, or knows something that could cause a situation like this, all comments are greatly appreciated.



Question by:DevZer0

Author Comment

ID: 9659299
Here is some more info. We do have a swap partition on our primary drives, but it is set to turn off when the RAID comes to life, since there is swap on the RAID. We can't know for sure whether the primary swap is actually being used, though. As far as I know, free -m will not show which swap area is in use. That was one theory I came up with, but I have no evidence to confirm it yet.

LVL 12

Expert Comment

ID: 9661754
Swap partitions normally have an entry in /etc/fstab that shows which partitions are used for swap, e.g.

/dev/hda7               swap                    swap    defaults        0 0

This means partition 7 on the primary hard disk is used for swap. Every time you start or restart your Linux distro, the system reads fstab and enables that swap for you.
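
To see which swap entries fstab would enable at boot, something like the following could be used. It is only a sketch: it assumes the standard six-column fstab layout, and the function name `fstab_swap_devices` is made up for illustration.

```shell
#!/bin/sh
# Print the device column of every swap entry in fstab-format input.
# Hypothetical usage:  fstab_swap_devices < /etc/fstab
fstab_swap_devices() {
    # Skip comment lines; a swap entry has "swap" in the type column (field 3).
    awk '$1 !~ /^#/ && $3 == "swap" { print $1 }'
}
```

Comparing that list with the contents of /proc/swaps shows whether a swap partition on the primary disks is configured but inactive, or actually in use.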
LVL 12

Assisted Solution

paullamhkg earned 500 total points
ID: 9661785
There may be lots of reasons why your hard disk I/O increases. If someone is scanning your web server looking for a way to break in, that could increase your I/O, because scanners probe the server in many ways and keep the system busy, and your system is installed on dev3-0 and dev3-1.

Your application requesting system resources and generating lots of system I/O may also be the reason. Check your web application to make sure that's not the case.

Until you find the reason, I would suggest restarting the web service each day or twice per week.

I do not mean restart/reboot the server; I mean restart the web/http service, as below:

/usr/bin/apachectl stop       <--- stops the web/http service
/usr/bin/apachectl start      <--- starts the web/http service without SSL
/usr/bin/apachectl startssl   <--- starts the web/http service with SSL

I have an application running on our web server that creates lots of dead child processes and hangs my web server every 10-20 days. My programmer is still looking for the problem, but restarting the http service kills all the dead children and keeps my web server running. I know this is not a solution, but at least I can keep my server alive.

I don't know whether restarting the service will help you or not, but at least you can try.

LVL 40

Expert Comment

ID: 9662821
If this is happening every 7 days I'd be inclined to look for a cronjob doing something at weekly intervals. In a standard RH 9 configuration only the rpm log is rotated weekly (see /etc/logrotate.d/rpm), but you could have configured something else to rotate logs (like web server logs) on a weekly basis.
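
A quick way to list candidates is to filter crontab files for entries that are pinned to a day of the week. This is a rough sketch, assuming the usual crontab layout where the fifth field is the day of week; the function name `weekly_cron_entries` is hypothetical.

```shell
#!/bin/sh
# Print crontab lines whose day-of-week field (field 5) is not "*".
# Hypothetical usage:  weekly_cron_entries < /etc/crontab
#                      crontab -l | weekly_cron_entries
weekly_cron_entries() {
    # Skip comments and VAR=value lines; blank lines fail the NF check.
    awk '$1 !~ /^#/ && $1 !~ /=/ && NF >= 6 && $5 != "*"'
}
```

It will not catch a daily job that only does real work on one weekday, so the scripts under /etc/cron.daily and /etc/cron.weekly are still worth reading by hand.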

Author Comment

ID: 9663451
Hi, thanks for the comments, much appreciated. I checked the weekly cron jobs; we do have some, but they have nothing to do with the primary drives. Apache is also on our external RAID. Pretty much all that's on the root file system is:

/var      /mnt
/etc      /opt
/usr     /root
/bin     /sbin
/boot     /snap

I know for a fact that /var has a whole bunch of logs, such as ftp logs, telnet logs, and other error logs. But the Apache logs and our application logs are all on the external RAID.

LVL 40

Expert Comment

ID: 9663913
Do the log dates in /var/log match up with when the disk load occurs? How about the Apache logs?

Note that activity on the system devices could be a result of the use of swap space or temp areas, even though Apache, DB's, whatever, are located on the external RAID. You can check to see what devices are actually being used for swap with 'swapon -s'.
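
The check can be scripted so that it only reports swap areas outside the RAID. This is a minimal sketch, assuming the five-column `swapon -s` layout (name, type, size, used, priority); the function name `unexpected_swap` is an illustrative choice, and the prefix would be whatever path the RAID volumes live under.

```shell
#!/bin/sh
# Print active swap areas that do NOT live under the expected device prefix,
# together with how much of each is in use.
# Hypothetical usage:  swapon -s | unexpected_swap /dev/Volume00/
unexpected_swap() {
    prefix="$1"
    awk -v prefix="$prefix" '
        $1 == "Filename" { next }    # skip the header row if present
        NF >= 5 && index($1, prefix) != 1 { print $1, "used=" $4 }
    '
}
```

No output means every active swap area is under the given prefix.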

Author Comment

ID: 9671715
The Apache logs are also on the external RAID, and the problem is an increase in primary disk read activity, not writes. So even a small log being written to /var/log (not the Apache logs or other application logs, because those are written to the RAID) wouldn't cause an increase in disk reads. But then the two primary disks are mirrored, so that is the only cause I can think of: something is actually trying to read from them...

LVL 40

Expert Comment

ID: 9671966
Well, /tmp and swap are on the primary disks, and anything that used those resources would cause disk activity on the primaries, even though the application and its data reside on the external RAID. The key to solving this is going to be in finding a correlation between the high disk I/O on the primaries and something that happens every 7 days. Once you know what triggers the I/O, it then becomes a matter of figuring out why it beats on the primary disks.
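
Finding that correlation is easier with a timestamped record of disk activity to compare against cron and log times. The following is only a sketch: the function name `stamp` and the log path in the usage comment are assumptions, not anything from the original setup.

```shell
#!/bin/sh
# Prefix every line read from stdin with the current date and time.
# Hypothetical usage, e.g. run from cron every few minutes:
#   iostat -d 5 2 | stamp >> /appl/iostat-history.log
stamp() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$line"
    done
}
```

Writing the log to the external RAID rather than the root file system avoids adding I/O to the disks under investigation.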

Author Comment

ID: 9672109
/tmp and swap are not on the primary disks
LVL 40

Expert Comment

ID: 9672775
Oops I missed that /tmp is on the RAID. Have you verified that all of swap isn't on a primary with 'swapon -s'? Could something be using /var/tmp?

Author Comment

ID: 9674288
----- swapon -s results -----

/dev/Volume00/swap1             partition 2097144 29292 -2
/dev/Volume00/swap2             partition 2097144 0 -3
/dev/Volume00/swap3             partition 2097144 0 -4
/dev/Volume00/swap4             partition 2097144 0 -5
/dev/Volume00/swap5             partition 2097144 0 -6
/dev/Volume00/swap6             partition 2097144 0 -7

It looks like only the RAID swap areas are active, and /var/tmp is not used by anything; in fact it's empty as well.

Any other ideas?

Author Comment

ID: 9674303
We switched the cabling for the primary disks over this weekend. Both primary drives were sharing the same cable, and they are mirrored as well, so our most reasonable theory was that two disks sharing the same IDE controller and cable could see increased I/O time while a mirror operation is in progress. We have yet to see the results; so far no problems have been noticed.
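
One way to test the mirror theory is to check whether the md driver is resyncing when the reads spike. This is a minimal sketch, assuming the internal mirror is Linux software RAID (the /dev/md devices in the df output suggest so) and that /proc/mdstat shows a progress line containing "resync" or "recovery" while that is going on. The function name `md_sync_activity` is made up for illustration.

```shell
#!/bin/sh
# Print any resync/recovery progress lines from mdstat-format input,
# labelled with the md array they belong to.
# Hypothetical usage:  md_sync_activity < /proc/mdstat
md_sync_activity() {
    awk '
        /^md[0-9]+ :/ { array = $1 }          # remember the current array
        /resync|recovery/ {
            line = $0
            sub(/^ +/, "", line)               # drop leading indentation
            print array ": " line
        }
    '
}
```

No output at the moment of a spike would suggest the extra reads are not coming from a mirror rebuild.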

LVL 40

Accepted Solution

jlevie earned 500 total points
ID: 9675482
I assume you are using RAID 1 (mirroring). Having each disk as a master on a different IDE controller is a good thing, reliability-wise. Using two controllers is also good with respect to disk I/O rates, and that should help. But it still doesn't explain what's happening or why. I think you are going to have to look carefully at what happens over a seven-day period to find the cause.


Question has a verified solution.
