Hunting I/O Bottlenecks

XK8ER (ASKER)

Hello there,
How can I check for I/O bottlenecks on my CentOS 5.6 server? I think the high server load is caused by hard-disk I/O.
wesly_chen

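Run "top" and look at the xx%wa value on the Cpu(s) line.
xx%wa means the CPUs spent xx percent of the time idle while waiting for I/O to complete.

Run lsof to list the open files together with the owning process name and PID, so you can see which processes have which files open at that moment.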
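
If you want that number without reading the whole screen, a small helper along these lines may do. This is only a sketch: the function name top_iowait is made up here, and it assumes the older procps layout shown in this thread ("2.3%wa"); newer versions of top print the line differently.

```shell
# Print the iowait percentage from a top(1) "Cpu(s):" summary line read on stdin.
# Assumes the "NN.N%wa" layout of older procps; top_iowait is a hypothetical name.
top_iowait() {
    sed -n 's/.*[ ,]\([0-9][0-9.]*\)%wa.*/\1/p' | head -n 1
}

# Usage on a live system (batch mode, one iteration):
#   top -bn1 | grep '^Cpu' | top_iowait
```

A value that stays in the low single digits, as in the samples in this thread, usually means the disks are not what is holding the CPUs up.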
XK8ER (ASKER)

on machine1 with high load this is what I get..

top - 16:36:44 up 13 days,  2:41,  1 user,  load average: 4.14, 4.84, 4.82
Tasks: 205 total,   1 running, 203 sleeping,   1 stopped,   0 zombie
Cpu(s): 76.8%us,  2.2%sy,  0.0%ni, 18.3%id,  2.3%wa,  0.1%hi,  0.2%si,  0.0%st
Mem:   8309564k total,  6824200k used,  1485364k free,   377432k buffers
Swap:  8193128k total,   214040k used,  7979088k free,  3251892k cached

on machine2 with normal load this is what I get..
top - 12:31:54 up 17 days, 22:49,  1 user,  load average: 0.00, 0.00, 0.00
Tasks: 158 total,   1 running, 156 sleeping,   1 stopped,   0 zombie
Cpu(s):  0.2%us,  0.0%sy,  0.0%ni, 99.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   3106040k total,  2864700k used,   241340k free,   292188k buffers
Swap:  5144568k total,      432k used,  5144136k free,  1783428k cached
> Cpu(s): 76.8%us,  2.2%sy,  0.0%ni, 18.3%id,  2.3%wa,  0.1%hi,  0.2%si,  0.0%st
2.3%wa looks OK.
What is the output of
vmstat  5  5

and
sar  -d  | tail -5

sar needs the "sysstat" package installed first (yum install sysstat). vmstat comes with procps, which is normally installed already.
XK8ER (ASKER)

this is what I get..

[(04:46 PM)][(root@alpha)] [(~)] $ vmstat  5  5
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 3  0 213568 1277244 387316 3455172    1    1    61    32    7    4 44  8 42  6  0
 3  0 213568 1252488 387324 3455812    0    0    51  2876 1533 2367 88  4  8  1  0
 3  0 213568 1250884 387328 3456236    0    0    61   368 1272 1745 84  2 13  1  0
 4  0 213568 1270580 387328 3456700    0    0    41  1146 1307 1664 46  2 50  2  0
 3  0 213568 1264152 387348 3457400    0    0    52  1607 1474 2244 55  3 37  5  0
[(05:02 PM)][(root@alpha)] [(~)] $ sar  -d  | tail -5
Requested activities not available in file
[(05:02 PM)][(root@alpha)] [(~)] $
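The "wa" column of the "vmstat 5 5" output shows I/O wait between 1% and 6% across the samples (the first row is the average since boot, the others are 5-second intervals).
That seems OK.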
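
To boil a vmstat run down to one number, something like the sketch below could be used. The function name vmstat_avg_wa is made up for illustration; it finds the "wa" column from the header row and skips the first data row, since that one is the since-boot average.

```shell
# Average the "wa" (I/O wait) column of vmstat output read on stdin.
# vmstat_avg_wa is a hypothetical helper name, not part of procps.
vmstat_avg_wa() {
    awk '
        # Header row: remember which column is "wa".
        $1 == "r" { for (i = 1; i <= NF; i++) if ($i == "wa") col = i; next }
        # Data rows start with a number; the first one is the since-boot average, so skip it.
        col && $1 ~ /^[0-9]+$/ { if (seen++) { sum += $col; n++ } }
        END { if (n) printf "%.2f\n", sum / n }
    '
}

# Usage on a live system:
#   vmstat 5 5 | vmstat_avg_wa
```

For the five samples posted above this works out to 2.25, which matches the low %wa that top showed.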

What does this show?
lsof  | head -10
Make sure you have lsof installed (yum install lsof).
XK8ER (ASKER)

[(05:09 PM)][(root@alpha)] [(~)] $ lsof  | head -10
COMMAND     PID      USER   FD      TYPE     DEVICE        SIZE       NODE NAME
init          1      root  cwd       DIR        9,1        4096          2 /
init          1      root  rtd       DIR        9,1        4096          2 /
init          1      root  txt       REG        9,1       38652   33456279 /sbin/init
init          1      root  mem       REG        9,1      129900   44892437 /lib/ld-2.5.so
init          1      root  mem       REG        9,1     1693812   44893011 /lib/libc-2.5.so
init          1      root  mem       REG        9,1       20668   44893047 /lib/libdl-2.5.so
init          1      root  mem       REG        9,1      245376   44893074 /lib/libsepol.so.1
init          1      root  mem       REG        9,1       93508   44893075 /lib/libselinux.so.1
init          1      root   10u     FIFO       0,17                   1303 /dev/initctl  
lsof |grep "9,1" | awk '{print $1" "$2}'| uniq -c | sort -nr
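
A note on that one-liner: uniq -c only counts adjacent lines, so it relies on lsof printing each process's files together, and grep "9,1" matches that string anywhere on the line. A slightly stricter variant is sketched below. The function name lsof_count_by_pid is made up, and it assumes DEVICE is the sixth column, as in the lsof output shown above (other lsof versions may add columns).

```shell
# Count open files per process on one device, from lsof output read on stdin.
# $1 is the DEVICE value to match, e.g. "9,1".
# Assumes the column order COMMAND PID USER FD TYPE DEVICE ...; hypothetical helper name.
lsof_count_by_pid() {
    awk -v dev="$1" '
        $6 == dev { count[$1 " " $2]++ }             # key on "command pid"
        END { for (k in count) print count[k], k }
    ' | sort -k1,1nr -k2
}

# Usage on a live system:
#   lsof | lsof_count_by_pid "9,1"
```

Keep in mind that a high open-file count is only a rough hint. It shows which processes hold many files on the device, not how much they actually read or write.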
XK8ER (ASKER)

ok here
[(05:10 PM)][(root@alpha)] [(~)] $ lsof |grep "9,1" | awk '{print $1" "$2}'| uniq -c | sort -nr
    388 mysqld 22051
    160 httpd 4455
    160 httpd 4451
    160 httpd 4435
    160 httpd 31875
    159 httpd 4434
    159 httpd 4433
    159 httpd 3339
    159 httpd 2985
    159 httpd 2566
    159 httpd 2082
    159 httpd 1328
    159 httpd 1285
    159 httpd 1035
    158 httpd 6073
    158 httpd 6064
    158 httpd 6023
    158 httpd 5450
    158 httpd 5449
    158 httpd 5161
    158 httpd 5144
    158 httpd 4456
    158 httpd 4432
    158 httpd 4428
    158 httpd 4067
    158 httpd 3847
    158 httpd 3806
    158 httpd 2976
    158 httpd 27868
    158 httpd 27856
    158 httpd 1511
    158 httpd 1327
    125 python 6896
     91 yum-updat 4137
     79 php 6203
     77 php 6187
     57 searchd 5877
     53 spamd 6423
     53 spamd 6397
     53 spamd 3586
     49 eplwebdav 2378
     49 eplwebdav 2377
     49 eplwebdav 2376
     49 eplwebdav 2375
     49 eplwebdav 2374
     49 eplwebdav 2345
     48 sshd 6452
     48 sshd 6450
     48 sshd 6448
     48 sshd 6446
     48 sshd 6444
     48 sshd 31494
     48 eplhttpd 6916
     47 eplhttpd 6914
     45 python 3915
     41 MailScann 3293
     41 MailScann 29914
     41 MailScann 22868
     41 MailScann 22818
     41 MailScann 16626
     41 cupsd 3360
     38 sendmail 22152
     37 MailScann 3626
     34 sshd 6453
     34 sshd 6451
     34 sshd 6449
     34 sshd 6447
     34 sshd 6445
     34 sshd 3346
     32 sendmail 7632
     32 sendmail 3678
     32 sendmail 22162
     29 rpc.idmap 3027
     29 postmaste 3887
     29 postmaste 3884
     29 named 6644
     28 saslauthd 4027
     28 saslauthd 4026
     28 saslauthd 4025
     28 saslauthd 4024
     28 saslauthd 4023
     28 postmaste 3889
     28 postmaste 3888
     28 postmaste 3859
     28 crond 6180
     28 crond 1543
     24 bandwidth 3929
     19 automount 3227
     18 hald 4068
     16 clamd 3390
     15 syslogd 3249
     15 dbus-daem 3050
     15 crond 3943
     15 avahi-dae 4055
     15 avahi-dae 4054
     14 xinetd 3377
     14 exclog 27862
     14 atd 4006
     13 xfs 3981
     13 perl 3769
     13 hald-runn 4069
     12 auditd 2927
     11 smartd 4143
     11 gam_serve 4141
     11 bash 31496
     10 ulogd 3702
     10 udevd 576
     10 pcscd 3145
     10 mysqld_sa 21243
     10 lsof 6454
     10 hald-addo 4079
     10 hald-addo 4076
      9 lsof 6459
      9 iscsid 2192
      9 brcm_iscs 2182
      9 awk 6456
      8 update_vi 3755
      8 sh 6202
      8 run-parts 1548
      8 iscsid 2191
      8 irqbalanc 2960
      8 init 1
      8 hcid 3063
      8 grep 6455
      7 sort 6458
      7 sh 6185
      7 hidd 3183
      7 awk 3756
      7 acpid 3159
      6 uniq 6457
      6 sdpd 3069
      6 iostat 25242
      6 gpm 3766
      6 audispd 2929
      5 mingetty 4159
      5 mingetty 4152
      5 mingetty 4149
      5 mingetty 4148
      5 mingetty 4147
      5 mingetty 4146
      5 mdadm 2986
      5 klogd 3263
      2 watchdog/ 7
      2 watchdog/ 4
      2 watchdog/ 13
      2 watchdog/ 10
      2 scsi_eh_5 475
      2 scsi_eh_4 474
      2 scsi_eh_3 473
      2 scsi_eh_2 472
      2 scsi_eh_1 471
      2 scsi_eh_0 470
      2 rpciod/3 3020
      2 rpciod/2 3019
      2 rpciod/1 3018
      2 rpciod/0 3017
      2 rdma_cm 2162
      2 pdflush 22671
      2 pdflush 20738
      2 migration 8
      2 migration 5
      2 migration 2
      2 migration 11
      2 md1_raid1 514
      2 md0_raid1 517
      2 local_sa 2139
      2 kthread 19
      2 kswapd0 241
      2 kstriped 491
      2 ksoftirqd 9
      2 ksoftirqd 6
      2 ksoftirqd 3
      2 ksoftirqd 12
      2 kseriod 163
      2 krfcommd 3098
      2 kpsmoused 409
      2 kondemand 2013
      2 kondemand 2011
      2 kondemand 2010
      2 kondemand 2009
      2 kmpath_ha 1729
      2 kmpathd/3 1728
      2 kmpathd/2 1727
      2 kmpathd/1 1726
      2 kmpathd/0 1725
      2 kjournald 518
      2 kjournald 1760
      2 khungtask 238
      2 khubd 161
      2 khelper 18
      2 kedac 1113
      2 kblockd/3 28
      2 kblockd/2 27
      2 kblockd/1 26
      2 kblockd/0 25
      2 kauditd 543
      2 kacpid 29
      2 iw_cm_wq 2145
      2 iscsi_eh 2041
      2 ib_mcast 2137
      2 ib_inform 2138
      2 ib_cm/3 2155
      2 ib_cm/2 2154
      2 ib_cm/1 2153
      2 ib_cm/0 2152
      2 ib_addr 2120
      2 events/3 17
      2 events/2 16
      2 events/1 15
      2 events/0 14
      2 cqueue/3 158
      2 cqueue/2 157
      2 cqueue/1 156
      2 cqueue/0 155
      2 ata_aux 464
      2 ata/3 463
      2 ata/2 462
      2 ata/1 461
      2 ata/0 460
      2 aio/3 245
      2 aio/2 244
      2 aio/1 243
      2 aio/0 242
[(05:17 PM)][(root@alpha)] [(~)] $


OK, chances are that
mysqld (DB server)
and
httpd (Apache)

have a lot of open files, which might imply that those two processes generate a lot of disk I/O.

MySQL is the most likely culprit.

If you run a lot of MySQL queries, adding more memory will improve the performance.

There are some tuning tricks for MySQL (but adding memory is the most effective):
1. Turn off the MySQL query log if it is enabled in /etc/my.cnf (log=....)
   Leave only error logging (log-error=... )
2. Set
innodb_buffer_pool_size=   (about 70% of your physical memory size; the more memory you have, the larger the buffer you can set and the less disk I/O you get)
3. Mount your filesystems with "noatime" in /etc/fstab (noatime already implies nodiratime)

Changes 1 and 2 require a MySQL restart. The third one does not need a reboot: after editing /etc/fstab, "mount -o remount,noatime /" (for the root filesystem, for example) applies it immediately.
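
For item 2, the arithmetic can be sketched as below. The function name suggest_buffer_pool_mb is made up, and the 70% figure is only the rule of thumb from above: it assumes a dedicated 64-bit database server. On a box that also runs Apache and mail, or on a 32-bit system where mysqld cannot address that much memory, the workable value is likely lower.

```shell
# Print a suggested innodb_buffer_pool_size in MB as a percentage of RAM.
# $1: total memory in kB (the MemTotal value from /proc/meminfo)
# $2: percentage to use (defaults to 70, the rule of thumb quoted above)
# suggest_buffer_pool_mb is a hypothetical helper name.
suggest_buffer_pool_mb() {
    mem_kb=$1
    pct=${2:-70}
    echo $(( mem_kb * pct / 100 / 1024 ))
}

# Usage on a live system:
#   suggest_buffer_pool_mb "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
```

With the 8309564k total reported by top on machine1, this prints 5680, so the setting in /etc/my.cnf would be about innodb_buffer_pool_size=5680M if the OS and the other services left that much room.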

For Apache, turn off access logging (the CustomLog directive) in httpd.conf.
Leave only the error log.
XK8ER (ASKER)

I have the innodb_buffer_pool_size set to 2GB because the system is 32bit and it can only use about 3.5GB out of the 8GB ram installed..
should I install 64bit instead?
How big is your database?
du -sk  /var/lib/mysql/* | sort -nr

If your database is bigger than 4GB, it is better to use a 64-bit OS.
Besides, a 64-bit OS allocates memory more efficiently than a 32-bit one.
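
To get one total out of the du listing instead of adding the rows up by hand, a small filter like this could be used. The name du_total_gb is made up for illustration. Note that the total also includes ibdata1 and the log files, not just the table data.

```shell
# Sum the size column (kB) of "du -sk" output read on stdin and print the total in GB.
# du_total_gb is a hypothetical helper name.
du_total_gb() {
    awk '{ kb += $1 } END { printf "%.1f\n", kb / 1024 / 1024 }'
}

# Usage on a live system:
#   du -sk /var/lib/mysql/* | du_total_gb
```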
XK8ER (ASKER)

it was about 18GB before but I switched to innodb barracuda format using COMPRESSED and its 8.8GB total now
>  innodb barracuda format using COMPRESSED
Compression mainly saves space on disk, not in memory: the InnoDB buffer pool holds the compressed pages plus uncompressed copies of the pages in use. Compression also costs extra CPU, so I would not recommend COMPRESSED for your database.

With an 18GB database (uncompressed), it is time to move to a 64-bit OS with 32GB of memory (memory is cheap nowadays).
That is an interesting article.
However, two conditions had to hold for the compression to work that well.
1. The only table the customer has on this server is one huge InnoDB table with a set of TEXT fields.
   Does your InnoDB database look like this?
2. All reads from this table were pretty random (buffer pool didn’t help).
   Does your workload look like that?

Please also read the last comment on that post.

Anyway, database tuning depends on how the DB is used and what type of data it holds. So if compression really helps on your DB, keep it that way.
However, for an 8.8GB database, a 64-bit OS with more memory is still highly recommended.
XK8ER (ASKER)

is this normal on a 32bit OS?

[(06:27 PM)][(root@alpha)] [(~)] $ free -om
             total       used       free     shared    buffers     cached
Mem:          8114       7217        896          0        395       3638
Swap:         8001        208       7792

showing as using 7GB of ram?

also I sent a ticket to the datacenter regarding the OS reinstall with centos v5.6 and 64bit instead..
they said take a look at this http://www.centos.org/modules/newbb/viewtopic.php?topic_id=8457
would that work?
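It is probably using the PAE kernel; run "uname -a" to check. PAE lets a 32-bit kernel see all 8GB, which is why free reports about 8GB in total.
However, each process is still limited to a 32-bit address space (about 3GB), and PAE adds an extra high-memory mapping step that makes memory allocation less efficient.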
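
The uname check can be wrapped up as in the sketch below. The function name kernel_flavor is made up, and the "PAE" suffix in the release string is the CentOS/RHEL 5 naming convention; other distributions may name their kernels differently.

```shell
# Classify a kernel from its release string and machine type.
# $1: output of "uname -r", $2: output of "uname -m".
# kernel_flavor is a hypothetical helper name; the PAE suffix is the EL5 convention.
kernel_flavor() {
    case "$2" in
        x86_64) echo "64-bit" ;;
        i?86)
            case "$1" in
                *PAE*) echo "32-bit PAE" ;;
                *)     echo "32-bit" ;;
            esac ;;
        *) echo "unknown" ;;
    esac
}

# Usage on a live system:
#   kernel_flavor "$(uname -r)" "$(uname -m)"
# To see whether the CPU can run a 64-bit OS at all, look for the "lm" flag:
#   grep -qw lm /proc/cpuinfo && echo "CPU is 64-bit capable"
```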

For a database server holding more than 8GB of data, a 64-bit OS with more memory is highly recommended.

Check this benchmark of 32-bit, 32-bit PAE and 64-bit OS on the same hardware.
The 64-bit OS outperforms the 32-bit ones in Apache and the other areas.
http://www.phoronix.com/scan.php?page=article&item=ubuntu_32_pae&num=1 
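
As for free showing 7GB used: Linux counts buffers and page cache as "used", so that figure by itself is normal. What the applications really hold is roughly used minus buffers minus cached. A sketch of that subtraction follows; the name free_app_used_mb is made up, and it assumes the older free column order shown in this thread (total, used, free, shared, buffers, cached), which newer versions of free have changed.

```shell
# From "free -m" style output on stdin, print the memory held by applications in MB:
# used minus buffers minus cached.
# Assumes the old procps column order; free_app_used_mb is a hypothetical helper name.
free_app_used_mb() {
    awk '$1 == "Mem:" { print $3 - $6 - $7 }'
}

# Usage on a live system with the old free layout:
#   free -om | free_app_used_mb
```

For the output posted above that is 7217 - 395 - 3638 = 3184 MB; the rest of the "used" memory is cache that the kernel gives back when programs need it.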
XK8ER (ASKER)

[(06:43 PM)][(root@alpha)] [(~)] $ uname -a
Linux alpha.site.net 2.6.18-238.19.1.el5PAE #1 SMP Fri Jul 15 08:15:44 EDT 2011 i686 i686 i386 GNU/Linux
[(06:43 PM)][(root@alpha)] [(~)] $

I dont think this server supports 32GB the max is 16GB if I remember..

but I dont want to spend time installing 16GB if im going to have the same issues?
ASKER CERTIFIED SOLUTION
wesly_chen