G'day experts.
I need to have some insight into what may be hogging swap space, and/or some guidance on how to monitor swap usage properly over extended periods.
We're having issues with our server where it periodically delivers Internal Server Error (500) pages, or other generic errors including, but not limited to, MySQL connection has gone away, Perl callback exit etc. (we use Perl for most pages).
So far, I've noticed a correlation between these outages and free swap space at the time. It's very hard to get proper stats because, unless I'm already in an SSH terminal when it happens, I can't run anything quickly enough.
When I do get to run things such as:
swap -s
swap -l
vmstat
prstat
I notice free swap space has dropped to 125,000k or less from what would normally be close to 1G available.
When it drops "low", we get the errors, and when I try to run prstat, it tells me to "Please wait" for maybe up to 30 - normally it comes straight up.
I know a Java app in JBoss was using loads of swap space before: from memory, it forked itself to check disk usage, which I believe spawned a new JBoss instance (or something along those lines) that hogged around 800MB extra. I understand we've since fixed that.
prstat -s size tells me:
23487 root 884M 640M sleep 59 0 133:01:27 4.0% java/94
3466 mysql 495M 157M sleep 59 0 0:48:11 0.4% mysqld/23
26920 root 237M 67M sleep 59 0 0:02:08 0.0% java/16
6318 daemon 124M 4268K sleep 59 0 0:00:00 0.0% httpd/1
... x 14
5144 root 123M 120M sleep 59 0 5:50:20 0.0% httpd/1
11480 root 92M 20M sleep 59 0 2:45:13 0.0% java/14
The java process (23487) at the top, using over 800MB, is a JDK process: "/jdk1.5.0_16/bin/java -D". Not being a Java developer, I have to ask: is that possibly JBoss using the dev kit and containing all its connection instances etc.?
There are a number of httpd instances running at 123MB each (which is probably why we get Internal Server errors when swap space drops below ~125MB), and then there are a couple slightly more lightweight java apps running.
Plus, a MySQL process taking ~ 500MB.
We're running Solaris 10. 8GB RAM, typically ~3.5GB swap allocated:
bash-3.00# swap -s
total: 3638492k bytes allocated + 2157388k reserved = 5795880k used, 1015240k available
With this info - what can I do to pinpoint the problem?? Does anyone know of any good utils to help me monitor & report on memory/swap usage? Or, any other ideas that can help?
Thanks, Glauron
The general rule of thumb has been swap = 2 x memory, so your box may be starved for swap space and may not be leaving enough /tmp space.
If the available disk space is limited, you may want to see if you can carve out 1-2GB just for swap and change it to a physical mount point, instead of the tmpfs system that is used by default. This solved a problem on one of my machines that sounded just like the one you are having.
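On the monitoring side, since the trouble is over before you can get a terminal open, it may help to log the numbers continuously. Below is a minimal sketch, not a polished tool: it records `swap -s` at a fixed interval and, when the "available" figure falls under a threshold, appends one `prstat -s size` report so you can see what was large at that moment. The log path, the interval, the threshold and the function names are all assumptions made up for illustration.

```shell
#!/bin/sh
# Sketch of a swap logger for Solaris 10. LOG, INTERVAL and THRESHOLD_KB are
# hypothetical names and values chosen for illustration; adjust to taste.
LOG=${LOG:-/var/tmp/swapwatch.log}
INTERVAL=${INTERVAL:-60}              # seconds between samples
THRESHOLD_KB=${THRESHOLD_KB:-250000}  # log process detail below this

# Read `swap -s` output on stdin and print the "available" figure in KB.
swap_available_kb() {
    sed -n 's/.* \([0-9][0-9]*\)k available.*/\1/p'
}

# Take one sample: always log the summary, add process detail when low.
sample() {
    line=`swap -s`
    avail=`echo "$line" | swap_available_kb`
    echo "`date '+%Y-%m-%d %H:%M:%S'` $line" >> "$LOG"
    if [ -n "$avail" ] && [ "$avail" -lt "$THRESHOLD_KB" ]; then
        # One prstat report, largest processes first, top 15 only.
        prstat -s size -n 15 1 1 >> "$LOG" 2>&1
    fi
}

# Loop only when invoked with "run", so the functions can be tried alone.
if [ "${1:-}" = "run" ]; then
    while :; do
        sample
        sleep "$INTERVAL"
    done
fi
```

Saved as, say, swapwatch.sh (a made-up name), it could be left running with `nohup sh swapwatch.sh run &`. After the next outage the log should show how quickly available swap fell and which processes were biggest at the time.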
Good Luck!