Solved

Why is the system swapping without reclaiming its large cached memory?

Posted on 2014-01-07
Last Modified: 2014-01-14
Hi,

I'm on a system that experiences swapping.
However, 4GB is shown as 'cached', and I expected the cache to be reclaimed under memory pressure.

top - 10:57:10 up 8 days,  2:49,  2 users,  load average: 0.18, 0.36, 0.46
Tasks: 645 total,   1 running, 644 sleeping,   0 stopped,   0 zombie
Cpu(s):  9.6%us,  8.9%sy,  0.0%ni, 81.0%id,  0.4%wa,  0.0%hi,  0.2%si,  0.0%st
Mem:   8126148k total,  7963148k used,   163000k free,     6912k buffers
Swap: 25165816k total,  6066084k used, 19099732k free,  4103596k cached

Am I reading this correctly when I say that only about 4GB of physical memory is used by processes and the rest serves as cache?
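
As a rough check of that reading, the arithmetic on the top figures above can be sketched in shell. This assumes the older procps layout, where "used" includes buffers and page cache; the function name is only illustrative.

```shell
# Hypothetical helper: memory used by processes, in kB, computed from the
# used / buffers / cached figures printed by top (older procps layout,
# where "used" still includes buffers and page cache).
process_used_kb() {
    used_kb=$1; buffers_kb=$2; cached_kb=$3
    echo $(( used_kb - buffers_kb - cached_kb ))
}

# Figures taken from the top output above.
process_used_kb 7963148 6912 4103596   # prints 3852640 (about 3.7 GB)
```

Note that on Linux the 'cached' figure also counts shared memory segments, so not all of it is necessarily file cache that can simply be dropped.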

Here is detail from sar (big swapping issue at 09:50)

sar -r
            kbmemfree kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit
09:30:01 AM    134444   7991704     98.35     10332   4028216  19971452     59.99
09:40:01 AM    160784   7965364     98.02     10388   4048288  19818544     59.53
09:50:01 AM    128004   7998144     98.42      4868   4027960  19852052     59.63
10:00:01 AM    141856   7984292     98.25      6392   4151896  20006856     60.10
10:10:01 AM    195496   7930652     97.59     14352   4190944  19803992     59.49
10:20:04 AM    144424   7981724     98.22      4552   4256120  19811740     59.51
10:30:01 AM    155312   7970836     98.09      8248   4215552  19963548     59.97
Average:       193411   7932737     97.62     16036   3977477  19849017     59.62

sar -S
            kbswpfree kbswpused  %swpused  kbswpcad   %swpcad
09:30:01 AM  19083496   6082320     24.17    877932     14.43
09:40:01 AM  19099620   6066196     24.10    839472     13.84
09:50:01 AM  19130128   6035688     23.98    864112     14.32
10:00:01 AM  19206012   5959804     23.68    697464     11.70
10:10:01 AM  19228044   5937772     23.59    671576     11.31
10:20:04 AM  19245732   5920084     23.52    738748     12.48
10:30:01 AM  19188576   5977240     23.75    744344     12.45
Average:     18987125   6178691     24.55    934020     15.12

sar -B
             pgpgin/s pgpgout/s   fault/s  majflt/s  pgfree/s pgscank/s pgscand/s pgsteal/s    %vmeff
09:30:01 AM  11740.51    447.41  10604.60     15.65   3232.52    168.66     15.14    131.06     71.31
09:40:01 AM   7884.12   1225.61  10787.88     10.01   3223.09     87.22     18.81     83.75     78.99
09:50:01 AM   7657.48    475.55  10900.24     34.10   3402.93    309.62      9.57    243.68     76.34
10:00:01 AM   9923.26    677.34  11654.52     81.67   4279.32    835.99     17.22    632.46     74.13
10:10:01 AM   7943.10    399.23  10848.28     17.25   3200.36    108.03      9.29     88.78     75.66
10:20:04 AM  11178.82   1094.43  13837.84    146.95   5561.22   1372.39    123.54   1101.29     73.62
10:30:01 AM   8651.50    491.58  10718.29     16.86   3267.58    242.49     19.85    205.83     78.46
10:40:01 AM   8092.91    851.57  10575.16     19.10   3188.81    160.80     22.52    142.26     77.60
Average:     10435.52    622.72  10827.20     11.70   3233.49    102.95     24.56     96.19     75.44
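
For anyone reading the sar -B table above: the %vmeff column is the ratio of pages stolen (pgsteal/s) to pages scanned (pgscank/s + pgscand/s). A minimal sketch of that calculation, with an illustrative function name, reproduces the values in the table:

```shell
# Hypothetical helper: page reclaim efficiency as a percentage,
# computed as pgsteal / (pgscank + pgscand). Prints 0.00 when nothing
# was scanned, to avoid dividing by zero.
vmeff() {
    awk -v steal="$1" -v scank="$2" -v scand="$3" 'BEGIN {
        scanned = scank + scand
        if (scanned == 0) { print "0.00"; exit }
        printf "%.2f\n", 100 * steal / scanned
    }'
}

vmeff 131.06 168.66 15.14   # 09:30 sample, prints 71.31
vmeff 243.68 309.62 9.57    # 09:50 sample, prints 76.34
```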

But I'm not a specialist. What does 'cached' mean, and should I consider it used or available memory?

Thanks,
Franck.
Question by:Franck Pachot
10 Comments

Expert Comment

by:Gerwin Jansen
Hi, there are previously answered questions on this subject at EE, like this one:

http://www.experts-exchange.com/OS/Linux/Q_27967317.html

Does that answer your question as well?

Author Comment

by:Franck Pachot
Hi,
Interesting link, but it doesn't answer my question. Let me rephrase it.

1. From the 'top' result, I have 8GB (8126148k total) of physical memory, and half of it (4103596k cached) is used for filesystem cache. Is that right?

2. I experience swapping because I use more memory than is physically available. In fact, there are several Oracle instances, and their total memory allocation is near 8GB.
I expected that, before swapping, the system would try to use the physical memory that is used for cache (i.e. the 4GB). Am I missing something?

Thanks,
Franck.

Accepted Solution

by:Seth Simmons (earned 500 total points)
1) Yes, that cached value is the file system cache used by applications and the operating system. If the system needs more physical memory, the kernel will clear some of the file cache.

2) Clearly, insufficient physical memory is causing the swap usage. For things like Oracle, you can adjust the swappiness value, which will make the kernel more aggressive in using physical memory before the swap partition is touched. You could run echo 0 > /proc/sys/vm/swappiness to change it immediately (the default value is 60), though at this point it's more effective to add vm.swappiness = 0 to /etc/sysctl.conf and reboot. That setting is recommended for Oracle anyway. With the limited physical memory, the system will still use the swap partition, though probably not as soon.
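
The two steps above (checking the current value and making the new one persistent) could be sketched as follows. The function names are illustrative, and the file paths are parameters so the sketch can be tried on a copy first; editing the real /etc/sysctl.conf requires root.

```shell
# Print the current swappiness value. Defaults to the live kernel setting;
# a different file can be passed for testing.
show_swappiness() {
    cat "${1:-/proc/sys/vm/swappiness}"
}

# Hypothetical helper: set "vm.swappiness = VALUE" in a sysctl-style file,
# replacing any existing entry or appending a new one.
persist_swappiness() {
    value=$1; conf=${2:-/etc/sysctl.conf}
    tmp=$(mktemp) || return 1
    # Copy every line except an existing vm.swappiness entry.
    if [ -f "$conf" ]; then
        grep -v '^[[:space:]]*vm\.swappiness[[:space:]]*=' "$conf" > "$tmp"
    fi
    echo "vm.swappiness = $value" >> "$tmp"
    cat "$tmp" > "$conf" && rm -f "$tmp"
}

# Example (as root): persist_swappiness 0 && sysctl -p
```

Running sysctl -p afterwards reloads /etc/sysctl.conf, so the persistent setting can be applied without waiting for a reboot.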

Author Comment

by:Franck Pachot
Thanks a lot. I forgot to check swappiness. I will do it when I'm at the customer site again next week.
Just one additional point, please: what is the best way to see, before the system starts swapping, whether we are low on physical memory?
I used to consider 'free' + 'buffers' + 'cached' but obviously, for the reason you gave me, that's not right. Monitoring page in/out will probably raise the alert too late. Is there something else?

Expert Comment

by:Seth Simmons
free or top will show physical memory usage
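
The figures that free and top report come from /proc/meminfo, so a monitoring check can read them directly. A minimal sketch, with an illustrative function name: it assumes that newer kernels expose a MemAvailable line and falls back to MemFree + Buffers + Cached when that line is absent.

```shell
# Hypothetical helper: estimate, in kB, the memory available without
# swapping, from a /proc/meminfo-style file. Uses MemAvailable when the
# kernel provides it (assumption: newer kernels only); otherwise falls
# back to MemFree + Buffers + Cached.
estimate_available_kb() {
    awk '
        $1 == "MemAvailable:" { avail = $2; have = 1 }
        $1 == "MemFree:"      { free = $2 }
        $1 == "Buffers:"      { buffers = $2 }
        $1 == "Cached:"       { cached = $2 }
        END { print (have ? avail : free + buffers + cached) }
    ' "${1:-/proc/meminfo}"
}

# Example: estimate_available_kb            (reads /proc/meminfo)
#          estimate_available_kb saved.txt  (reads a saved copy)
```

The fallback sum should be treated as an upper bound: 'Cached' also counts shared memory, and depending on swappiness the kernel may start swapping before all of the cache has been dropped.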

Expert Comment

by:tfewster
I can't actually see a problem from the reports you posted - I don't see any change at 09:50, though I/O stats might show something if the system is really paging/swapping.

Oracle uses shared memory, so adding up the memory usage reported by `ps` will overestimate the amount actually used. You need to look within Oracle to see how it is using that memory - and if the system is mainly used for Oracle, consider increasing the SGA size so Oracle can manage its own caching.

Mem:   8126148k total,  7963148k used,   163000k free,     6912k buffers
Swap: 25165816k total,  6066084k used, 19099732k free,  4103596k cached

I believe the 6GB of swap "used" is reserved in case the system _needs_ to swap rather than actively in use. As you know, the "cached" figure belongs with the previous line, the Mem stats.

`sar -B` shows a peak at 10:20, but it could be a new process pulling data in.

Also, how do the stats compare to your "benchmarks"? e.g Mem usage after a reboot & starting Oracle, plus normal, low and high usage?

Author Comment

by:Franck Pachot
@seth2740,
"free or top will show physical memory usage"
They show the physical memory used. But part of it is used only because processes do not need it. How can I tell how large that part is?
Well, I'll post the question again in a new thread.

@tfewster,
Thanks. Yes, I'm aware that ps reports the shared memory for each process.
But I'm sure that the SGA was above the physical memory, and I'm sure the system was swapping at that time (system very slow, kswapd busy) for at least 5 minutes around 09:50.
What I did not understand was why it did not reclaim space from the filesystem cache.
0
 
LVL 20

Expert Comment

by:tfewster
Aha, that's a different symptom. Search online for "kswapd high CPU". It's a known bug, and you should consider a kernel update.

I still suspect that, from the stats you posted, the system wasn't swapping and so didn't need to reclaim the memory used for cache.

Good luck!

Author Comment

by:Franck Pachot
Sorry, but the system was swapping. When a system starts swapping heavily as soon as you do something that needs to read a large portion of memory, you know that it is swapping. No doubt about it. The operation was an Oracle Statspack snapshot at level 7, which reads a lot from the shared pool.
My doubt was about why that memory was swapped.
The answer from seth2740 makes sense: that part of memory had probably been idle for a long time (low activity on the instance before I arrived) and the system preferred to keep data in cache because of swappiness.

I'll update this thread on Tuesday when I check the swappiness value.

Author Comment

by:Franck Pachot
Hi,
Just a note to confirm that it was due to the swappiness value being high:
$ cat /proc/sys/vm/swappiness
60

Thanks again seth2740
Regards,
Franck.