Suggestions to find CentOS 5.10 memory leak on ML350 G4

Neale2210
Identical hardware, i.e. ML350 G4s. The one running CentOS 5.9 doesn't miss a beat, but the two running CentOS 5.10 (two different machines) have memory leaks. The same software runs on all 3 machines. The machines still respond to ping, but the applications stop running and you can't log on, VNC, etc. onto the servers. It doesn't seem to make a difference whether you log off or just lock the session.
Gerwin Jansen, EE MVE, Topic Advisor
Most Valuable Expert 2016
Commented:
You could do some basic debugging to find out which process is using more memory.

A cron script that runs ps or top in batch mode every 5 minutes should give you an idea what is happening. Redirect output to a log file and analyze the log file after some time.

Example with top:
*/5 * * * * /bin/top -b -n 1 -o %MEM >> /tmp/toplog.txt
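
If the installed top does not accept -o (older procps releases may not), a ps-based snapshot is one possible alternative. This is a minimal sketch, assuming a procps ps that supports --sort; the snapshot_mem name, the /tmp/pslog.txt path and the 15-line cutoff are illustrative choices, not something from this thread.

```shell
#!/bin/sh
# Hypothetical helper: append a timestamped list of the biggest memory users to a log.
# LOGFILE defaults to an illustrative path; override it in the environment if needed.
LOGFILE="${LOGFILE:-/tmp/pslog.txt}"

snapshot_mem() {
    {
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
        # --sort=-rss lists processes by resident memory, largest first.
        ps -eo pid,ppid,rss,vsz,pmem,comm --sort=-rss | head -n 15
    } >> "$LOGFILE"
}

snapshot_mem
```

Saved as a script and called from cron every 5 minutes, this builds up a history in which a steadily growing RSS for one process would stand out.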


Seth Simmons, Sr. Systems Administrator
Commented:
two different machines have memory leaks

what tells you that you have memory leaks?
just having processes not respond doesn't necessarily mean something is leaking memory; could be multiple possibilities

Commented:
Can you paste the output of free -m?

Also, please check which processes are consuming the most memory and let us know the percentages.

TY/SA

noci, Software Engineer
Distinguished Expert 2018

Commented:
top, and sorting on M might help too

Author

Commented:
Sorry all, been busy. I double-checked that the firmware was up to date on everything... the RAID controller was 2 updates old on one server, and the memory was swapped out... From the top:
Gerwin... I'll try that idea.
Seth... If your logs suddenly stop on the main program you're running, either you've got a hardware fault or a program misbehaving. After eliminating possible hardware issues, you then have to look at software.
Sandy... free -m wasn't that useful, but I will post it, as you may be looking for something different.
noci.... fair enough
Gerwin Jansen, EE MVE, Topic Advisor
Most Valuable Expert 2016

Commented:
@noci - my batch mode top command is sorting on M(emory) :)
noci, Software Engineer
Distinguished Expert 2018

Commented:
does it show growth on the usage of any process? or Cache for that matter?
Gerwin Jansen, EE MVE, Topic Advisor
Most Valuable Expert 2016

Commented:
@noci - Growth: yes. Any process: yes, if run as root. Cache: I don't think so, but top is just a start (other tools like vmstat will show it).
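
To follow cache and free memory over time alongside the per-process view, one option is to log a few /proc/meminfo fields on each run. This is only a sketch: the log_meminfo name and the choice of fields are assumptions made for illustration, and values are printed in kB as they appear in /proc/meminfo.

```shell
#!/bin/sh
# Hypothetical helper: print one timestamped line with selected meminfo fields.
# The optional argument is a meminfo-format file; it defaults to /proc/meminfo.
log_meminfo() {
    src="${1:-/proc/meminfo}"
    awk -v ts="$(date '+%Y-%m-%d %H:%M:%S')" '
        $1 == "MemFree:" || $1 == "Buffers:" || $1 == "Cached:" || $1 == "SwapFree:" {
            sub(":", "", $1)              # drop the trailing colon from the field name
            out = out " " $1 "=" $2       # collect name=value pairs in file order
        }
        END { print ts out }
    ' "$src"
}

# Example call (e.g. from a cron script):
#   log_meminfo >> /tmp/meminfo.log
```

One line per run keeps the log easy to scan: if Cached keeps climbing while MemFree drops, that points at page cache rather than a leaking process.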

Author

Commented:
top -b -n 1 -o %MEM gives a usage warning.... unknown argument 'o'

Author

Commented:
mem total:3546   used:1434    free:2111   shared:0    buffers:237   cached:764
-/+ buffers/cache:  432  3113   used/free

for 1 of the machines
Seth Simmons, Sr. Systems Administrator

Commented:
what is the last line of your free output?  you only pasted 2 of 3 lines

Author

Commented:
swap: 9339  used 0 free 9339

Author

Commented:
for the 2nd machine:
mem total:3546  used:1522   free:2024   shared:0  buffers:218  cached:902
-/+ buffers/cache: 401 3145  used/free
swap: 4094  used:0  free:4094
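
The figures posted for the two machines can be reduced to "memory used by applications" by subtracting buffers and cache from the used column. A small sketch of that arithmetic follows; the app_used function name is made up for illustration, and the results can differ by 1 MB from the -/+ buffers/cache line, most likely because free rounds each column separately.

```shell
#!/bin/sh
# Hypothetical helper: memory used excluding buffers and cache, in MB and as a
# percentage of total. Arguments: total used buffers cached (all in MB).
app_used() {
    total=$1; used=$2; buffers=$3; cached=$4
    real=$(( used - buffers - cached ))
    pct=$(( real * 100 / total ))     # integer division, rounds down
    echo "${real} MB (${pct}%)"
}

app_used 3546 1434 237 764   # first machine  -> 433 MB (12%)
app_used 3546 1522 218 902   # second machine -> 402 MB (11%)
```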
Seth Simmons, Sr. Systems Administrator

Commented:
ok...your swap partition isn't being used which is good
if there was a memory leak i would expect physical memory to be exhausted and swap space utilized but not seeing that
one place where i worked before we had an application that did file processing, but if it came across a file of a certain size or data it didn't like, it would leak memory until physical memory and swap space were exhausted, then the OOM killer would appear.  vendor acknowledged the issue and we put in a 2gb per-process limit on memory usage to mitigate the issue
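
How that 2gb limit was applied is not stated here; one common way to cap a single process is the shell's ulimit -v, which takes a value in KiB. A minimal sketch under that assumption (run_limited is a hypothetical wrapper name):

```shell
#!/bin/sh
# Hypothetical wrapper: run a command with a cap on its virtual memory.
# Usage: run_limited <limit_in_KiB> <command> [args...]
run_limited() {
    limit_kb=$1
    shift
    # The subshell keeps the limit from affecting the calling shell.
    ( ulimit -v "$limit_kb" && exec "$@" )
}

# 2 GiB = 2097152 KiB; the child shell here just reports the limit it inherited.
run_limited 2097152 sh -c 'ulimit -v'
```

A process that hits the cap gets allocation failures instead of pushing the whole machine into swap, which limits the damage but does not fix the leak itself.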

doesn't seem to be the case here; not yet convinced of a memory leak
have you looked at syslog for anything that might help when this happens?
have you worked with the vendor at all for possible troubleshooting options for the application?
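
For the syslog check, a quick first pass is to search for OOM killer messages around the time of a freeze. This sketch assumes the default CentOS syslog file /var/log/messages and that the kernel logs such events with the strings "oom-killer" or "Out of memory"; the scan_oom name is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list log lines that look like OOM killer activity.
# The optional argument is the log file; it defaults to /var/log/messages.
scan_oom() {
    logfile="${1:-/var/log/messages}"
    # -i: case-insensitive, -E: extended regex with alternation
    grep -i -E 'out of memory|oom-killer' "$logfile"
}

# Example call:
#   scan_oom /var/log/messages
```

No matches would be consistent with the free output showing swap untouched, and would suggest looking at other causes for the hang.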

Author

Commented:
ok just seeing whether the firmware update to the raid controller made a difference, along with the change of memory.  The net difference in time before freezing was a gain of 3 days, probably attributable to the increase in memory.  So now I'm left with software, because I have an identical machine running centos 5.9, same program CommuniGate 6.0.5, with no freezing.

gerwin.. I still haven't had a correction to your 'top' command, to get mem stats.
Gerwin Jansen, EE MVE, Topic Advisor
Most Valuable Expert 2016
Commented:
I tested on another Linux distribution. To sort on memory usage in CentOS 6:

top -b -n 1 -a

Can you try the above? If it works:
*/5 * * * * /bin/top -b -n 1 -a >> /tmp/toplog.txt
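
Once the log has collected a few hours of samples, the history of one process can be pulled out with awk. This is a sketch that assumes top's default batch layout (RES in the 6th column, %MEM in the 10th, command name last); proc_history is a hypothetical name and CGServer is only a placeholder for whatever the mail server process is called on these machines.

```shell
#!/bin/sh
# Hypothetical helper: show time, PID, RES and %MEM for one command from a top batch log.
# Usage: proc_history <command_name> <logfile>
proc_history() {
    awk -v name="$1" '
        $1 == "top" && $2 == "-" { ts = $3 }        # header line carries the sample time
        $NF == name { print ts, $1, $6, $10 }       # process line: PID, RES, %MEM
    ' "$2"
}

# Example call:
#   proc_history CGServer /tmp/toplog.txt
```

If RES for the same PID rises from sample to sample and never comes back down, that process is the one to look at.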
