medent

Need help identifying bottleneck(s)

I need help identifying the bottleneck(s) on a Red Hat Linux SMP server. The server has 2 x 2.4 GHz Xeons with 12 GB RAM. The system load is over 50 most of the day, and at certain times CPU idle drops to zero and things noticeably drag. The server runs 400-500 instances of our custom medical program.
Here is a page of "vmstat 3" output to get started:
  procs               memory             swap       io         system        cpu
  r  b  w   swpd   free    buff    cache  si  so    bi    bo    in     cs  us  sy  id
 23  0  1  13456  25040  200832  9269604   0   3   235   713   633   8226  19  62  19
  1  0  0  13480  25300  200844  9270584   0   8   331   173   624  10455  24  47  30
  9  0  0  13488  28392  200860  9270612   0   3    12   240   547   8453  13  41  46
141  0  1  13504  28976  200880  9270668   0   5    24   263   496  10598  15  27  59
 23  0  0  13520  29032  200880  9270712   0   5    15     5   542   9200  21  64  15
142  0  1  13520  29692  200912  9270808   0   0    36   333   609  10152  16  40  44
 24  1  0  13536  28024  200928  9271136   0   5   109   365   611  10452  24  49  28
 35  0  0  13544  26976  200928  9271488   0   3   117     3   592   8272  20  71   9
160  0  0  13556  30392  200936  9272136   0   4   216   157   617  10410  26  72   2
 38  0  0  13556  29604  200952  9272560   0   0   152     0   577   7923  21  76   2
 49  0  0  13556  31708  200976  9272752   0   0    67   316   596  10350  25  57  19
 24  0  0  13556  31840  200988  9273540   0   0   263   688   596   8123  17  65  18
 29  1  0  13556  31672  200988  9273756   0   0    72     0   595  10512  21  43  36
 53  0  0  13564  25988  200764  9275976   0   3   661   288   660   8257  21  67  12
 23  0  0  13564  28140  200760  9275988   0   0     3     0   572  10397  20  59  21
 33  0  0  13564  23468  200788  9276100   0   0    43   239   570   8104  21  67  13
 24  0  1  13564  32396  200816  9262788   0   0    37   187   522  10350  26  53  21
 63  0  0  13564  37076  200816  9263096   0   0   103     0   519   8483  18  66  16
 19  1  0  13564  37716  200868  9264544   0   0   491   624   590  10279  22  62  16
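
For anyone who wants to reduce a page of vmstat output like this to a few numbers, here is a minimal Python sketch. It assumes the old 16-column layout shown in the header above (r b w swpd free buff cache si so bi bo in cs us sy id); the names COLUMNS, SAMPLE, parse_rows and averages are made up for illustration. On these 19 rows it puts sy at roughly 57% on average against roughly 21% us, with the b column almost always zero, which reads more like kernel-side work than disk wait. That is only a reading of one sample, not a diagnosis.

```python
# Sketch: summarize old-style (procps 2.x) vmstat rows.
# Column order is copied from the header in the post (an assumption for other versions).
COLUMNS = "r b w swpd free buff cache si so bi bo in cs us sy id".split()

# The 19 data rows from the post, whitespace-normalized.
SAMPLE = """\
23 0 1 13456 25040 200832 9269604 0 3 235 713 633 8226 19 62 19
1 0 0 13480 25300 200844 9270584 0 8 331 173 624 10455 24 47 30
9 0 0 13488 28392 200860 9270612 0 3 12 240 547 8453 13 41 46
141 0 1 13504 28976 200880 9270668 0 5 24 263 496 10598 15 27 59
23 0 0 13520 29032 200880 9270712 0 5 15 5 542 9200 21 64 15
142 0 1 13520 29692 200912 9270808 0 0 36 333 609 10152 16 40 44
24 1 0 13536 28024 200928 9271136 0 5 109 365 611 10452 24 49 28
35 0 0 13544 26976 200928 9271488 0 3 117 3 592 8272 20 71 9
160 0 0 13556 30392 200936 9272136 0 4 216 157 617 10410 26 72 2
38 0 0 13556 29604 200952 9272560 0 0 152 0 577 7923 21 76 2
49 0 0 13556 31708 200976 9272752 0 0 67 316 596 10350 25 57 19
24 0 0 13556 31840 200988 9273540 0 0 263 688 596 8123 17 65 18
29 1 0 13556 31672 200988 9273756 0 0 72 0 595 10512 21 43 36
53 0 0 13564 25988 200764 9275976 0 3 661 288 660 8257 21 67 12
23 0 0 13564 28140 200760 9275988 0 0 3 0 572 10397 20 59 21
33 0 0 13564 23468 200788 9276100 0 0 43 239 570 8104 21 67 13
24 0 1 13564 32396 200816 9262788 0 0 37 187 522 10350 26 53 21
63 0 0 13564 37076 200816 9263096 0 0 103 0 519 8483 18 66 16
19 1 0 13564 37716 200868 9264544 0 0 491 624 590 10279 22 62 16
"""


def parse_rows(text):
    """Return one dict per data row, skipping header and blank lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == len(COLUMNS) and all(f.isdigit() for f in fields):
            rows.append(dict(zip(COLUMNS, map(int, fields))))
    return rows


def averages(rows, keys=("r", "b", "bi", "bo", "cs", "us", "sy", "id")):
    """Mean of each selected column across all rows."""
    return {k: sum(row[k] for row in rows) / len(rows) for k in keys}


if __name__ == "__main__":
    for name, value in averages(parse_rows(SAMPLE)).items():
        print(f"{name:>3}: {value:10.1f}")
```

The parser drops any line that is not exactly 16 integers, so the raw output of "vmstat 3" can be pasted in with its headers intact.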
 
majorwoo

Are these applications accessing the disk a lot, and what kind of disk is it? Have you performed any optimization on the disk I/O?

If these are hyperthreading Xeons, is hyperthreading turned on or off?
medent

ASKER

1. Depending on how the app is used, the disk access will vary greatly. For example, I have another customer who actually has consistently higher I/O stats, but far lower system load numbers and good CPU stats. The difference is that the "good" site has fewer total processes running (same exact hardware).

2. The disk is hardware RAID 5 set up on 5 x 36 GB 15,000 rpm drives with an IBM 5i RAID controller. I am running data=journal on the data filesystem, which I know adds a lot more overhead, but data safety comes first.

3. Yes, hyperthreading is on. I have not tried it off, but funny you mention it: I was just looking at some Google threads about this and wondering what would happen if I turned it off...
When I had my Xeons with 1 TB attached to them, I found better performance under Linux with hyperthreading disabled. However, I was only running 4-5 of my

Is it safe to assume that you get more instances of your app when you hit 0 idle and performance suffers?
medent

ASKER

Others seem to indicate that hyperthreading generally helps under heavy process loads but actually hinders under light loads?

The peak time in the afternoon (2pm-3pm) seems to be a peak not in the number of sessions, but in the use of the sessions that already exist (the busiest time for doctors' offices).

PS: The iowait figure seems to be broken in top, so it's hard to tell whether processes are waiting on I/O. The load average is just way too high during normal use. I assume that if the CPU is showing some idle time while the load number is high, then it's not the CPU causing the high load? I am of course assuming that a load of 50 and above is in outer space (anything more than twice the CPU count?).
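
On the "twice the CPU count" rule of thumb: on Linux the load average counts processes that are runnable plus those in uninterruptible sleep (usually waiting on disk), so it is best read relative to the number of logical CPUs. A small Python sketch of that comparison follows. The function name load_per_cpu is made up here, and the figure of 4 logical CPUs is an assumption (two single-core Xeons with hyperthreading on), not something stated in the thread.

```python
import os


def load_per_cpu(load, cpus):
    """Load average divided by the number of logical CPUs."""
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    return load / cpus


if __name__ == "__main__":
    # Figures from the thread: load around 50; 4 logical CPUs is an assumption.
    print("thread example:", load_per_cpu(50, 4))

    # Live values for the machine this runs on (os.getloadavg is Unix-only).
    one_min, five_min, fifteen_min = os.getloadavg()
    cpus = os.cpu_count() or 1
    print(f"1-min load {one_min:.2f} on {cpus} CPUs -> {load_per_cpu(one_min, cpus):.2f} per CPU")
```

A ratio far above 1 means many tasks are queued for each CPU, but it does not say by itself whether they are waiting for CPU time or for disk.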
I have heard similar things about hyperthreading, but I did keep hyperthreading disabled on the machine I used as our fileserver, and overall response was quicker. (If you look at the stats for your CPUs, you will see that although each claims to have the processing power of a 2.0 Xeon, they cannot perform at that level.) I do, however, believe that given the number of processes you are running, you will do better with hyperthreading on. I think that turning it off may result in better speed for the currently running process, but at the cost of overall lag to the system.

Try pressing 1 inside top to toggle multiprocessor mode; this will show you stats for each CPU (press W to save this config).
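
If top's per-CPU view is not available or not trusted, roughly the same numbers can be derived from two snapshots of /proc/stat. The following is a rough Python sketch, not what top itself does; the function names are made up, and it assumes the usual field order on each cpuN line (user, nice, system, idle, then any extra fields newer kernels add).

```python
import time


def parse_cpu_lines(text):
    """Map 'cpu0', 'cpu1', ... to their tick counters from /proc/stat text."""
    cpus = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the aggregate "cpu" line; keep only the per-CPU lines.
        if fields and fields[0].startswith("cpu") and fields[0] != "cpu":
            cpus[fields[0]] = [int(x) for x in fields[1:]]
    return cpus


def busy_percent(before, after):
    """Percent of non-idle ticks per CPU between two snapshots.

    Idle is assumed to be the fourth counter (index 3). Everything else,
    including iowait on kernels that report it, is counted as busy.
    """
    result = {}
    for name, new in after.items():
        deltas = [n - o for n, o in zip(new, before[name])]
        total = sum(deltas)
        result[name] = 100.0 * (total - deltas[3]) / total if total else 0.0
    return result


if __name__ == "__main__":
    with open("/proc/stat") as f:
        first = parse_cpu_lines(f.read())
    time.sleep(3)  # same interval as the "vmstat 3" run
    with open("/proc/stat") as f:
        second = parse_cpu_lines(f.read())
    for name, pct in sorted(busy_percent(first, second).items()):
        print(f"{name}: {pct:5.1f}% busy")
```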
medent

ASKER

I would like to identify the bottleneck using the stats I have now (the original question), with a little more analysis before taking the risk of an experiment. For example, I am assuming that if I have lots of CPU idle but my load numbers are in orbit, then the bottleneck is elsewhere, probably disk, but I am not sure how to verify that. I think the load numbers are calculated from certain process characteristics?
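
One way to check the "probably disk" guess is to look at the process characteristics directly. On Linux the load average counts processes in state R (runnable) and state D (uninterruptible sleep, usually disk wait), so tallying process states shows which kind dominates. A minimal Python sketch, assuming a Linux /proc filesystem; the function names are illustrative only.

```python
import os
from collections import Counter


def state_from_stat(stat_line):
    """Extract the one-letter state from a /proc/<pid>/stat line.

    The command name sits in parentheses and may itself contain spaces or
    parentheses, so split on the last ')' rather than on whitespace.
    """
    return stat_line.rsplit(")", 1)[1].split()[0]


def count_states(proc_root="/proc"):
    """Tally process states (R, S, D, Z, ...) across all numeric /proc entries."""
    counts = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                counts[state_from_stat(f.read())] += 1
        except (OSError, IndexError):
            continue  # process exited between listdir and open
    return counts


if __name__ == "__main__":
    for state, n in count_states().most_common():
        print(state, n)
```

Many R and few D would suggest the load comes from processes competing for CPU; many D would point toward I/O. A single snapshot is noisy, so it is worth sampling several times during the busy period.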
I would assume the same thing. Have you run any I/O measurements on the disk, even something as simple as timing a copy or using hdparm?

Also, what version of Red Hat, and what kernel?
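
As a rough stand-in for "timing a copy", the Python sketch below times a sequential write that is forced to disk with fsync. It is a crude measurement under stated assumptions (one file, sequential access, sizes chosen arbitrarily) and not a substitute for a real benchmark; the function name write_throughput_mb_s is made up.

```python
import os
import tempfile
import time


def write_throughput_mb_s(path, total_mb=64, block_kb=1024):
    """Write total_mb of zeros to path, fsync, and return MB per second."""
    block = b"\0" * (block_kb * 1024)
    blocks = (total_mb * 1024) // block_kb
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(blocks):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # make sure the data reached the disk, not just the cache
    elapsed = time.perf_counter() - start
    return total_mb / elapsed


if __name__ == "__main__":
    # Writes to the system temp directory; point it at the data filesystem to test that.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        print(f"{write_throughput_mb_s(path):.1f} MB/s")
    finally:
        os.remove(path)
```

Running it once when the system is quiet and once during the afternoon peak gives a comparison that a single run cannot.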
medent

ASKER

Yes, I have my own I/O benchmark (copying plus zips), and the server compares OK to others in its class.
The Red Hat base version is 7.3 with kernel 2.4.20-20.7 and glibc2.32
Have you updated the fileutils package? I remember our Red Hat 7.3 servers doing poorly until we upgraded the kernel, fileutils, and a few other packages related to file handling.
medent

ASKER

Yes, fileutils was updated, plus other dependencies:
fileutils-4.1.9-11.i386.rpm  procps-2.0.13-1.i386.rpm
libacl-2.0.11-2.i386.rpm     sh-utils-2.0.12-3.i386.rpm
libattr-2.0.8-3.i386.rpm
With 12 GB of RAM, you have certainly enabled highmem in the kernel, correct?
medent

ASKER

Yes, up to 64G, and Linux is seeing and using it.
ASKER CERTIFIED SOLUTION
majorwoo
