medent
asked on
Need help identifying bottleneck(s)
I need help identifying the bottleneck(s) on a Red Hat Linux SMP server. The server has two 2.4 GHz Xeons with 12 GB of RAM. The system load is over 50 most of the day, and at certain times CPU idle drops to zero and things noticeably drag. The server runs 400-500 concurrent instances of our custom medical program.
Here is a page of "vmstat 3" output to get started:
procs memory swap io system cpu
r b w swpd free buff cache si so bi bo in cs us sy id
23 0 1 13456 25040 200832 9269604 0 3 235 713 633 8226 19 62 19
1 0 0 13480 25300 200844 9270584 0 8 331 173 624 10455 24 47 30
9 0 0 13488 28392 200860 9270612 0 3 12 240 547 8453 13 41 46
141 0 1 13504 28976 200880 9270668 0 5 24 263 496 10598 15 27 59
23 0 0 13520 29032 200880 9270712 0 5 15 5 542 9200 21 64 15
142 0 1 13520 29692 200912 9270808 0 0 36 333 609 10152 16 40 44
24 1 0 13536 28024 200928 9271136 0 5 109 365 611 10452 24 49 28
35 0 0 13544 26976 200928 9271488 0 3 117 3 592 8272 20 71 9
160 0 0 13556 30392 200936 9272136 0 4 216 157 617 10410 26 72 2
38 0 0 13556 29604 200952 9272560 0 0 152 0 577 7923 21 76 2
49 0 0 13556 31708 200976 9272752 0 0 67 316 596 10350 25 57 19
24 0 0 13556 31840 200988 9273540 0 0 263 688 596 8123 17 65 18
29 1 0 13556 31672 200988 9273756 0 0 72 0 595 10512 21 43 36
53 0 0 13564 25988 200764 9275976 0 3 661 288 660 8257 21 67 12
23 0 0 13564 28140 200760 9275988 0 0 3 0 572 10397 20 59 21
33 0 0 13564 23468 200788 9276100 0 0 43 239 570 8104 21 67 13
24 0 1 13564 32396 200816 9262788 0 0 37 187 522 10350 26 53 21
63 0 0 13564 37076 200816 9263096 0 0 103 0 519 8483 18 66 16
19 1 0 13564 37716 200868 9264544 0 0 491 624 590 10279 22 62 16
ASKER
1. Depending on how the app is used, disk access varies greatly. For example, another customer of mine actually has consistently higher I/O stats but far lower load numbers and good CPU stats; the difference is that the "good" site runs fewer total processes (same exact hardware).
2. The disk is hardware RAID 5 across 5 x 36 GB 15,000 RPM drives on an IBM ServeRAID 5i controller. I am running data=journal on the data filesystem, which I know adds a lot more overhead, but data security prevails.
3. Yes, hyperthreading is on; I have not tried it off. Funny you mention it: I was just looking at some Google threads about this and wondering what would happen if I turned it off...
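The data=journal mode mentioned in point 2 can be confirmed from the mount table, and the lighter alternatives compared. A sketch, where /dev/sda3 and /data are placeholders for the actual data filesystem:

```shell
# Show which ext3 journaling mode each mounted filesystem is using.
grep 'data=' /proc/mounts

# Inspect journal settings on the block device itself
# (/dev/sda3 is a placeholder for the real RAID volume).
tune2fs -l /dev/sda3 | grep -i journal

# data=ordered is the ext3 default: metadata is journaled and data blocks
# are forced out before the metadata commit, at much lower cost than
# data=journal. Switching (only as a deliberate experiment) would be:
#   mount -o remount,data=ordered /data
```

On RAID 5 in particular, data=journal doubles every data write (once to the journal, once in place), and RAID 5 already pays a read-modify-write penalty per small write, so the combination is expensive.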
When I had my Xeons with 1 TB attached to them, I discovered better performance under Linux with hyperthreading disabled. However, I was only running 4-5 of my
Is it safe to assume that you get more instances of your app when you hit 0 idle and performance suffers?
ASKER
Others seem to indicate that hyperthreading generally helps under heavy process loads but actually hinders under light loads...?
The peak time (2pm-3pm) seems to be a peak not in the number of sessions but in how heavily the existing sessions are used (doctors' offices' busiest time).
PS: iowait seems to be broken in top, so it's hard to tell whether processes are waiting on I/O. The load factor is just way too high during normal use. I assume that if the CPU shows some idle while the load number is high, then it's not the CPU driving the load number up? I am of course assuming a load of 50 and above is in outer space (anything more than twice the CPU count?).
I have heard similar things about hyperthreading, but I kept it disabled on the machine I used as our fileserver, and overall response was quicker (if you look at the stats for your CPUs, you will see that although each logical CPU claims the processing power of a 2.0 Xeon, it cannot actually perform at that level). Given the number of processes you are running, though, I believe you will do better with hyperthreading on; turning it off may speed up the currently running process, but at the cost of overall lag to the system.
Try pressing 1 inside top to toggle per-CPU mode; this shows stats for each CPU (press W to save this config).
ASKER
I would like to identify the bottleneck using the stats I have now (the original question), with a little more analysis before taking the risk of an experiment. For example, I assume that if I have lots of CPU idle but my load numbers are in orbit, then the bottleneck is elsewhere, probably disk, but I am not sure how to verify that. I think the load number is computed from certain process states...?
I would assume the same thing. Have you performed any I/O measurements on the disk, even something as simple as timing a copy, or used hdparm?
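For reference, a minimal version of "timing a copy" might look like the following. Paths and device names are placeholders; note that writes land in the page cache first, so the sync has to be timed as well to capture the actual flush to the RAID:

```shell
# Rough sequential-write timing: 1 GB of zeroes onto the data filesystem.
# /data/ddtest is a placeholder path on the filesystem under test.
time sh -c 'dd if=/dev/zero of=/data/ddtest bs=1M count=1024 && sync'
rm -f /data/ddtest

# Buffered vs. cached read speeds on the underlying device
# (/dev/sda is a placeholder for the ServeRAID volume).
hdparm -tT /dev/sda
```

With data=journal, the write test above pays the journal double-write as well, so comparing against a scratch filesystem mounted data=ordered would isolate how much of the cost is the journaling mode.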
Also what version of redhat, and what kernel?
ASKER
Yes, I have my own I/O benchmark (copying + zips), and the server compares OK to others of its class.
The base version is Red Hat 7.3 with kernel 2.4.20-20.7 and glibc 2.3.2.
Have you updated the fileutils package? I remember our Red Hat 7.3 servers doing poorly until we upgraded the kernel, fileutils, and a few other packages related to file handling.
ASKER
Yes, fileutils was updated, plus other dependencies:
fileutils-4.1.9-11.i386.rpm
procps-2.0.13-1.i386.rpm
libacl-2.0.11-2.i386.rpm
sh-utils-2.0.12-3.i386.rpm
libattr-2.0.8-3.i386.rpm
With 12 GB of RAM, you have certainly enabled highmem in the kernel, correct?
ASKER
Yes, up to 64 GB, and Linux is seeing and using it.
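For anyone checking the same thing: on a 32-bit kernel with highmem enabled, /proc/meminfo reports the memory above roughly 896 MB as HighTotal/HighFree, so a quick grep confirms both the total and that highmem is in use:

```shell
# MemTotal should reflect the full 12 GB; on a 32-bit highmem kernel,
# HighTotal shows how much of it sits above the low-memory boundary.
grep -E 'MemTotal|HighTotal|HighFree' /proc/meminfo
```

(On a 64-bit kernel the High* lines are absent or zero, since all memory is directly addressable.)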
ASKER CERTIFIED SOLUTION
If those are hyperthreading Xeons, is hyperthreading turned on or off?
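A quick way to answer that from the shell, sketched here for a 2.4-era kernel where each hyperthread appears as its own logical processor:

```shell
# With HT on, two physical Xeons show up as 4 logical CPUs.
grep -c '^processor' /proc/cpuinfo

# The CPU advertises hyperthreading support via the "ht" flag.
grep '^flags' /proc/cpuinfo | head -1 | grep -ow ht
```

A count of 4 processors on this 2-socket box means HT is enabled; a count of 2 means it is off (or disabled in the BIOS).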