Glauron
asked on
Some guidance on monitoring swap space usage?
G'day experts.
I need to have some insight into what may be hogging swap space, and/or some guidance on how to monitor swap usage properly over extended periods.
We're having issues with our server where periodically it will deliver Internal Server Error (500) pages, or other generic errors including, but not limited to, MySQL connection has gone away, PERL callback exit etc. (using PERL technology for most pages).
So far, I've noticed a correlation between these outages and swap space at the time. It's very hard to get proper stats because by the time I notice it's happening, unless I'm already in an SSH terminal, I can't run anything quick enough.
When I do get to run things such as:
swap -s
swap -l
vmstat
prstat
I notice free swap space has dropped to 125,000k or less from what would normally be close to 1G available.
When it drops "low", we get the errors, and when I try to run prstat, it tells me to "Please wait" for maybe up to 30 - normally it comes straight up.
I know a Java app in JBOSS was using loads of swap space before when (from memory) it was looking at disk usage and forked itself to do so which I believe spawned a new JBOSS instance (or something along those lines) which hogged around 800MB extra - but I understand we've fixed that since.
prstat -s size tells me:
23487 root 884M 640M sleep 59 0 133:01:27 4.0% java/94
3466 mysql 495M 157M sleep 59 0 0:48:11 0.4% mysqld/23
26920 root 237M 67M sleep 59 0 0:02:08 0.0% java/16
6318 daemon 124M 4268K sleep 59 0 0:00:00 0.0% httpd/1
... x 14
5144 root 123M 120M sleep 59 0 5:50:20 0.0% httpd/1
11480 root 92M 20M sleep 59 0 2:45:13 0.0% java/14
The java process (23487) at the top, using over 800MB, is a JDK process: "/jdk1.5.0_16/bin/java -D". Not being a Java developer, I have to ask: is it possible that this is JBOSS using the dev kit and containing all its connection instances etc.?
There are a number of httpd instances running at 123MB each (which is probably why we get Internal Server errors when swap space drops below ~125MB), and then there are a couple slightly more lightweight java apps running.
Plus, a MySQL process taking ~ 500MB.
We're running Solaris 10. 8GB RAM, typically ~3.5GB swap allocated:
bash-3.00# swap -s
total: 3638492k bytes allocated + 2157388k reserved = 5795880k used, 1015240k available
With this info - what can I do to pinpoint the problem?? Does anyone know of any good utils to help me monitor & report on memory/swap usage? Or, any other ideas that can help?
Thanks, Glauron
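For monitoring over extended periods, one option is to log `swap -s` at a fixed interval so the figures from just before an outage are already on disk. The following is only a sketch, assuming bash and the `swap -s` output format shown in the post above. The function names `parse_swap_s` and `log_swap_sample` and the default log path are hypothetical, not part of any Solaris tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pull the four figures out of one line of `swap -s` output.
# Assumes the format shown in the post, e.g.
#   total: 3638492k bytes allocated + 2157388k reserved = 5795880k used, 1015240k available
# Prints: allocated_k reserved_k used_k available_k
parse_swap_s() {
    awk '{
        n = 0
        for (i = 1; i <= NF; i++) {
            # keep only fields of the form "<digits>k" and strip the trailing k
            if ($i ~ /^[0-9]+k$/) {
                vals[++n] = substr($i, 1, length($i) - 1)
            }
        }
        if (n == 4) print vals[1], vals[2], vals[3], vals[4]
    }'
}

# Append one timestamped sample to a log file.
# The default path is an assumption; pass another path as the first argument.
log_swap_sample() {
    local logfile=${1:-/var/tmp/swap-usage.log}
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$(swap -s | parse_swap_s)" >> "$logfile"
}
```

Calling `log_swap_sample` once a minute (from cron, or from a loop with `sleep 60`) should give a history in which the last column is the "available" figure that drops before the errors start.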
SOLUTION
ASKER
Thanks.
Disk space is a bit tight, but I managed to grab 3GB from the home partition (not ideal, but it's there, unused and we're desperate!). I added it as a swap file & I'll let you know if things improve.
There wasn't much in the global zone /tmp directory, but I did clear a tiny bit from our primary zone /tmp.
Good to know.
I'd still like to know what's been eating swap though, since we've only been hit with this problem for a month or so (server has been in production for close to 2 years). We've run many apps and websites from it & made loads of updates, so it could be almost anything. I'm concerned that if something is leaking, it's only a matter of time before the newly allotted swap is used, and we're back to square one.
Thanks a lot for the info on making sure there's enough swap free (I also rediscovered this page, which is very helpful: http://softpanorama.org/Solaris/Processes_and_memory/swap_space_management.shtml) - but any ideas on how to track what is using swap? Is there a way to list swap usage along with process ID, so I can at least build a script of my own to monitor it?
Thanks again,
Glauron
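As a starting point for such a script, here is a rough sketch that ranks processes by virtual size using `ps -eo pid,vsz,rss,comm`. Note the assumption: VSZ is the virtual address space size in KB, which is only a proxy for swap reservation, not the reservation itself. The function name `top_by_vsz` is hypothetical. For a closer look at a single process, I believe `pmap -S <pid>` reports swap reservation per mapping on Solaris, but check the man page on your release.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `ps -eo pid,vsz,rss,comm` output on stdin and
# print the N largest processes by VSZ (second column, in KB), largest first.
top_by_vsz() {
    local count=${1:-10}
    # drop the header line, sort numerically on column 2 in descending order
    sed 1d | sort -k2,2nr | head -n "$count"
}
```

Usage would be along the lines of `ps -eo pid,vsz,rss,comm | top_by_vsz 5`. Saving that output with a timestamp every few minutes and comparing snapshots should show which PID keeps growing, if anything is leaking.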
ASKER CERTIFIED SOLUTION
ASKER
Thanks guys.
Since adding another few GB of swap space, we haven't had a repeat of the problem. So it's definitely a swap space issue - and thanks for the insight into /tmp being mounted within swap and the potential problems associated with it! Learn something new every day. :)
I haven't moved /tmp away yet; that can be my next step if we run into the original problem again.
The general rule of thumb has been swap = 2 x memory, so your box may be starved for swap space and not leaving enough /tmp space.
If the available disk space is limited, you may want to see if you can carve out 1-2GB just for swap and change it to a physical mount point, instead of the tmpfs file system that is used by default. This solved a problem on one of my machines that sounded just like the problem you are having.
Good Luck!
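To see how much of that shared pool /tmp is actually holding, something like the sketch below may help. It assumes `df -k /tmp` prints a header line followed by a row whose first column is `swap` and whose last column is `/tmp`, with used and available KB in columns 3 and 4. The function name `tmp_on_swap_usage` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `df -k /tmp` output on stdin and, if /tmp is
# backed by swap (tmpfs), print "<used_kb> <avail_kb>".
# Prints nothing when /tmp is on a regular disk file system.
tmp_on_swap_usage() {
    awk 'NR > 1 && $1 == "swap" && $NF == "/tmp" { print $3, $4 }'
}
```

Run it as `df -k /tmp | tmp_on_swap_usage`. If the used figure climbs while available swap falls, files in /tmp are a likely contributor; no output suggests /tmp is already on a physical mount.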