High memory consumption on web server

We have an Apache web server with 16 GB of RAM. Around 14 GB of memory is consumed, and only Apache is running on it. It is using around 400 MB per thread.

I don't see any other service taking memory. But even after restarting Apache I still see more than 12 GB of memory in use.

I ran the top command to see whether any other process is responsible, but nothing is there.

What could be eating my memory? Memory only returns to normal after I restart the server.
sivaatluri Asked:
arnold Commented:
Check the config settings that control how many child processes Apache starts. Are you running PHP? MySQL? It looks as though your config starts 30 httpd child processes.
Tomcat?
Memory consumption is one thing. Is the system answering requests normally, or is it sluggish and unable to respond?
Seth Simmons (Sr. Systems Administrator) Commented:
Have you looked at top? Sort by the RES column.
sivaatluri (Author) Commented:
Arnold,

Only PHP runs on this server; MySQL is on a different server. If Apache or PHP were using all the memory, it should be recovered when I restart the httpd service, but only 1 GB of the 12 GB is recovered.

Seth,

I listed the top 10 processes by memory use, and only httpd showed up, no other process.
arnold Commented:
You could use lsof to find what consumes your system memory. Do you have a reverse proxy?
What distro are you running? Which runlevel?
Do you also have Tomcat running?

You are providing no information on which any guesstimate can be made.
If you stop httpd, does the memory use drop?

Restarting presumes that your system has a memory leak, which would be freed by the restart and then build back up over time. But if your httpd.conf is set up to spawn many httpd clients, the restart merely clears the sessions; the memory each client consumes will be reserved again as soon as the clients are respawned.

ps -ef |grep httpd |wc -l
How many lines are reported? You can subtract one to account for the grep.
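The same count can be sketched in Python so that the grep line never needs subtracting. This is a minimal illustration that assumes the default `ps -ef` column layout (command in the eighth column); the function name `count_httpd` and the sample text are hypothetical, not output from the server in this thread.

```python
def count_httpd(ps_output: str, name: str = "httpd") -> int:
    """Count processes whose executable is `name` in `ps -ef` output."""
    count = 0
    for line in ps_output.splitlines()[1:]:      # skip the header row
        fields = line.split(None, 7)             # UID PID PPID C STIME TTY TIME CMD
        if len(fields) < 8:
            continue
        executable = fields[7].split()[0]        # first word of the CMD column
        if executable.rsplit("/", 1)[-1] == name:
            count += 1
    return count


# Hypothetical sample, invented for illustration only.
SAMPLE = """\
UID        PID  PPID  C STIME TTY          TIME CMD
root      1457     1  0 06:06 ?        00:00:00 /usr/sbin/httpd
apache    2021  1457  0 06:40 ?        00:00:00 /usr/sbin/httpd
apache    2152  1457  0 06:55 ?        00:00:00 /usr/sbin/httpd
root      2200  2100  0 06:58 pts/0    00:00:00 grep httpd
"""

print(count_httpd(SAMPLE))
```

Because the match is on the executable name rather than on the whole line, the `grep httpd` row is ignored automatically.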

Often the default is 5-10 clients.
What is the server's setup? Are there multiple websites being served? Are they served from a single instance (one IP with name-based hosts), or does the system have multiple IPs with multiple httpd instances, each with 5-10 clients?
arnold Commented:
Post the output of top -n 1
gheist Commented:
Can you post output of:
free
vmstat 5 5
iostat 5 5
pstree
?
sivaatluri (Author) Commented:
gheist, below is the information requested.

screenshots.png
sivaatluri (Author) Commented:
gheist,

Below is the pstree output
Screenshot-1.png
sivaatluri (Author) Commented:
arnold,

Here is the output of the ps command piped through wc for httpd.
Screenshot-2.png
sivaatluri (Author) Commented:
Below are some more screenshots. After a restart the memory is recovered.
Screenshot-4.png
Screenshot-6.png
Screenshot-3.png
Screenshot-4.png
After-reboot.png
gheist Commented:
The output of vmstat/iostat is cut off. Please paste the full text.
Is it possible to enable a 1-2 GB swap file to see whether you tend to exceed the RAM in the system?

Basically the system spends most of its time idle.
Can you check:
netstat -anp | grep ESTA | grep -v "0      0"
i.e. network sockets that accumulate data to be sent or received because some program cannot keep up with it?
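A minimal Python sketch of the same filter, for readers who want the queue sizes as numbers rather than relying on the spacing in the grep pattern. It assumes the usual `netstat -anp` column order (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State); the function name and the sample connections (documentation-range addresses) are hypothetical.

```python
def backlogged_sockets(netstat_output: str):
    """Return (recv_q, send_q, local, remote) for ESTABLISHED TCP sockets with queued data."""
    result = []
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue                              # headers, UDP, unix sockets
        if fields[5] != "ESTABLISHED":
            continue
        recv_q, send_q = int(fields[1]), int(fields[2])
        if recv_q or send_q:                      # data waiting in either queue
            result.append((recv_q, send_q, fields[3], fields[4]))
    return result


# Hypothetical sample, invented for illustration only.
SAMPLE = """\
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       PID/Program name
tcp        0      0 0.0.0.0:80                  0.0.0.0:*                   LISTEN      1457/httpd
tcp        0      0 192.0.2.10:80               198.51.100.7:51514          ESTABLISHED 2021/httpd
tcp        0  43200 192.0.2.10:80               198.51.100.9:40022          ESTABLISHED 2152/httpd
"""

for recv_q, send_q, local, remote in backlogged_sockets(SAMPLE):
    print(f"{local} -> {remote}: Recv-Q={recv_q} Send-Q={send_q}")
```

An empty result means no established connection is holding unsent or unread data, which is different from there being no connections at all.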
sivaatluri (Author) Commented:
Adding swap is not an issue; I would just like to know what is consuming the memory.

 netstat -anp | grep ESTA | grep -v "0      0"  => no output

There is nothing below the vmstat and iostat output; what I pasted there is the complete output.
sivaatluri (Author) Commented:
At the same time I see something like this when I open /proc/meminfo in vi. Does that mean I have 17 GB of memory free? If so, why is free -m showing just 2 GB?

MemTotal:       17557912 kB
MemFree:          478928 kB
MemAvailable:   17159508 kB
Buffers:          193556 kB
Cached:          1547128 kB
SwapCached:            0 kB
Active:          1195464 kB
Inactive:         712612 kB
Active(anon):     167444 kB
Inactive(anon):      208 kB
Active(file):    1028020 kB
Inactive(file):   712404 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:                48 kB
Writeback:             0 kB
AnonPages:        167396 kB
Mapped:            16916 kB
Shmem:               260 kB
Slab:           15019224 kB
SReclaimable:   15002940 kB
SUnreclaim:        16284 kB
KernelStack:         880 kB
PageTables:        19116 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     8778956 kB
Committed_AS:     611592 kB
VmallocTotal:   34359738367 kB
VmallocUsed:       54788 kB
VmallocChunk:   34359680943 kB
AnonHugePages:         0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:    17928192 kB
DirectMap2M:           0 kB
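A minimal Python sketch of how these figures can be read. It uses values copied from the /proc/meminfo output above and treats Buffers + Cached + SReclaimable as "reclaimable"; that sum is a rough approximation chosen for illustration, not the kernel's own MemAvailable calculation. The function name `parse_meminfo` is hypothetical.

```python
def parse_meminfo(text: str) -> dict:
    """Map each /proc/meminfo key to its integer value (kB for most keys)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and rest.split():
            values[key.strip()] = int(rest.split()[0])
    return values


# Figures copied from the /proc/meminfo output above.
SAMPLE = """\
MemTotal:       17557912 kB
MemFree:          478928 kB
Buffers:          193556 kB
Cached:          1547128 kB
Slab:           15019224 kB
SReclaimable:   15002940 kB
SUnreclaim:        16284 kB
"""

info = parse_meminfo(SAMPLE)
# Rough estimate only: memory the kernel can usually give back under pressure.
reclaimable = info["Buffers"] + info["Cached"] + info["SReclaimable"]

print(f"free right now:      {info['MemFree'] / 1024:8.0f} MB")
print(f"reclaimable caches:  {reclaimable / 1024:8.0f} MB")
print(f"slab share of RAM:   {info['Slab'] / info['MemTotal']:8.1%}")
```

In this output almost all of the "used" memory is slab, and nearly all of that slab is marked reclaimable, which is why MemFree is small while MemAvailable is large. On a live system the same function can be fed `open("/proc/meminfo").read()`.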
arnold Commented:
Could you please post the output from top -n1?
Does your system run CGIs? Does it send out mass mailings or handle incoming email?

Try restarting sendmail to see if memory is recovered.
arnold Commented:
You seem to be inclined to provide what you think we need rather than what we ask for.

If you stop/restart Apache and the memory is still consumed, it means that Apache is not the one consuming it.
You have rsyslog running. Do you have any scripts or processes there that generate additional information?
You have sendmail running on the system. Is it used to accept incoming email? Check the sendmail queue to make sure that it is not the one consuming the memory.
Do you have Tomcat running on the system?

The netstat command gheist asked you to run should reflect all the connections that are established to your system, and you say there are no connections.

Perhaps what consumes your system's memory is not the web server/Apache, but something else.
gheist Commented:
You have a memory leak in slabs (kernel allocations): a grand total of 15 GB eaten by the kernel.
In addition to what was asked already, please help us with /proc/slabinfo,
i.e. netstat with full sockets, vmstat, iostat.
sivaatluri (Author) Commented:
Arnold,

There seems to be some miscommunication. I agree that memory should be recovered after restarting Apache, but that is not happening. The only major service running on this server is Apache, and it is serving PHP content.

Yes, the sendmail service is turned on. It is on by default and we have not modified it. Sendmail has no function on our server.

Tomcat is not running on my system. I'm trying to find out why the memory is being consumed, so I'm attaching the outputs of vmstat and iostat. Please ask if anything else is needed to identify the issue.

--------------------------------------------------------------------------------------------------------------------------

Gheist, here is the requested output:


[root@ip-10-0-53-10 ~]# cat /proc/slabinfo
slabinfo - version: 2.1
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
UDPLITEv6              0      0   1088   30    8 : tunables    0    0    0 : slabdata      0      0      0
UDPv6                240    240   1088   30    8 : tunables    0    0    0 : slabdata      8      8      0
tw_sock_TCPv6        576    576    256   32    2 : tunables    0    0    0 : slabdata     18     18      0
TCPv6                673    720   1984   16    8 : tunables    0    0    0 : slabdata     45     45      0
ext4_inode_cache  268218 269115    968   33    8 : tunables    0    0    0 : slabdata   8155   8155      0
ext4_xattr             0      0     88   46    1 : tunables    0    0    0 : slabdata      0      0      0
ext4_free_data      6336   6336     64   64    1 : tunables    0    0    0 : slabdata     99     99      0
ext4_allocation_context   1590   1590    136   30    1 : tunables    0    0    0 : slabdata     53     53      0
ext4_io_end         1064   1064     72   56    1 : tunables    0    0    0 : slabdata     19     19      0
ext4_extent_status 116608 125664     40  102    1 : tunables    0    0    0 : slabdata   1232   1232      0
jbd2_journal_handle    680    680     48   85    1 : tunables    0    0    0 : slabdata      8      8      0
jbd2_journal_head   2306   2376    112   36    1 : tunables    0    0    0 : slabdata     66     66      0
jbd2_revoke_table_s    256    256     16  256    1 : tunables    0    0    0 : slabdata      1      1      0
jbd2_revoke_record_s   2176   2176     32  128    1 : tunables    0    0    0 : slabdata     17     17      0
kcopyd_job             0      0   3312    9    8 : tunables    0    0    0 : slabdata      0      0      0
dm_uevent              0      0   2608   12    8 : tunables    0    0    0 : slabdata      0      0      0
dm_rq_target_io        0      0    424   38    4 : tunables    0    0    0 : slabdata      0      0      0
dm_io                  0      0     40  102    1 : tunables    0    0    0 : slabdata      0      0      0
flow_cache             0      0    104   39    1 : tunables    0    0    0 : slabdata      0      0      0
bsg_cmd                0      0    312   26    2 : tunables    0    0    0 : slabdata      0      0      0
mqueue_inode_cache     36     36    896   36    8 : tunables    0    0    0 : slabdata      1      1      0
hugetlbfs_inode_cache     28     28    584   28    4 : tunables    0    0    0 : slabdata      1      1      0
dquot                512    512    256   32    2 : tunables    0    0    0 : slabdata     16     16      0
pid_namespace          0      0   2192   14    8 : tunables    0    0    0 : slabdata      0      0      0
user_namespace         0      0    232   35    2 : tunables    0    0    0 : slabdata      0      0      0
posix_timers_cache      0      0    248   33    2 : tunables    0    0    0 : slabdata      0      0      0
UDP-Lite               0      0    896   36    8 : tunables    0    0    0 : slabdata      0      0      0
ip_fib_trie          292    292     56   73    1 : tunables    0    0    0 : slabdata      4      4      0
UDP                  288    288    896   36    8 : tunables    0    0    0 : slabdata      8      8      0
tw_sock_TCP          288    288    256   32    2 : tunables    0    0    0 : slabdata      9      9      0
TCP                  252    252   1792   18    8 : tunables    0    0    0 : slabdata     14     14      0
eventpoll_pwq       3092   3248     72   56    1 : tunables    0    0    0 : slabdata     58     58      0
blkdev_queue          34     34   1880   17    8 : tunables    0    0    0 : slabdata      2      2      0
blkdev_requests      252    378    384   21    2 : tunables    0    0    0 : slabdata     18     18      0
sock_inode_cache    1275   1275    640   25    4 : tunables    0    0    0 : slabdata     51     51      0
net_namespace          0      0   4352    7    8 : tunables    0    0    0 : slabdata      0      0      0
shmem_inode_cache    648    648    656   24    4 : tunables    0    0    0 : slabdata     27     27      0
ftrace_event_file   1012   1012     88   46    1 : tunables    0    0    0 : slabdata     22     22      0
task_delay_info     1692   1692    112   36    1 : tunables    0    0    0 : slabdata     47     47      0
taskstats           1872   1872    328   24    2 : tunables    0    0    0 : slabdata     78     78      0
proc_inode_cache    6009   6300    632   25    4 : tunables    0    0    0 : slabdata    252    252      0
sigqueue             300    300    160   25    1 : tunables    0    0    0 : slabdata     12     12      0
bdev_cache           117    117    832   39    8 : tunables    0    0    0 : slabdata      3      3      0
kernfs_node_cache   9350   9350    120   34    1 : tunables    0    0    0 : slabdata    275    275      0
mnt_cache            100    100    320   25    2 : tunables    0    0    0 : slabdata      4      4      0
inode_cache         8456   8456    568   28    4 : tunables    0    0    0 : slabdata    302    302      0
dentry            136115931 136115931    192   21    1 : tunables    0    0    0 : slabdata 6481711 6481711      0
buffer_head       533833 533871    104   39    1 : tunables    0    0    0 : slabdata  13689  13689      0
vm_area_struct     19428  19536    184   22    1 : tunables    0    0    0 : slabdata    888    888      0
mm_struct           1620   1620    896   36    8 : tunables    0    0    0 : slabdata     45     45      0
files_cache          979   1100    640   25    4 : tunables    0    0    0 : slabdata     44     44      0
signal_cache         907    990   1088   30    8 : tunables    0    0    0 : slabdata     33     33      0
sighand_cache        540    615   2112   15    8 : tunables    0    0    0 : slabdata     41     41      0
task_struct          215    250   6480    5    8 : tunables    0    0    0 : slabdata     50     50      0
anon_vma           12995  13632     64   64    1 : tunables    0    0    0 : slabdata    213    213      0
shared_policy_node  19210  19210     48   85    1 : tunables    0    0    0 : slabdata    226    226      0
numa_policy           31     31    264   31    2 : tunables    0    0    0 : slabdata      1      1      0
radix_tree_node    48255  48524    568   28    4 : tunables    0    0    0 : slabdata   1733   1733      0
idr_layer_cache      225    225   2112   15    8 : tunables    0    0    0 : slabdata     15     15      0
dma-kmalloc-8192       0      0   8192    4    8 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-4096       0      0   4096    8    8 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-2048       0      0   2048   16    8 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-1024       0      0   1024   32    8 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-512        0      0    512   32    4 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-256        0      0    256   32    2 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-128        0      0    128   32    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-64         0      0     64   64    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-32         0      0     32  128    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-16         0      0     16  256    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-8          0      0      8  512    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-192        0      0    192   21    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-96         0      0     96   42    1 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-8192          44     44   8192    4    8 : tunables    0    0    0 : slabdata     11     11      0
kmalloc-4096         128    128   4096    8    8 : tunables    0    0    0 : slabdata     16     16      0
kmalloc-2048         274    352   2048   16    8 : tunables    0    0    0 : slabdata     22     22      0
kmalloc-1024         922   1024   1024   32    8 : tunables    0    0    0 : slabdata     32     32      0
kmalloc-512         1312   1312    512   32    4 : tunables    0    0    0 : slabdata     41     41      0
kmalloc-256         2650   2976    256   32    2 : tunables    0    0    0 : slabdata     93     93      0
kmalloc-192         3765   6111    192   21    1 : tunables    0    0    0 : slabdata    291    291      0
kmalloc-128         4384   4384    128   32    1 : tunables    0    0    0 : slabdata    137    137      0
kmalloc-96          5544   5544     96   42    1 : tunables    0    0    0 : slabdata    132    132      0
kmalloc-64         71970  95872     64   64    1 : tunables    0    0    0 : slabdata   1498   1498      0
kmalloc-32          3968   3968     32  128    1 : tunables    0    0    0 : slabdata     31     31      0
kmalloc-16          3840   3840     16  256    1 : tunables    0    0    0 : slabdata     15     15      0
kmalloc-8           5120   5120      8  512    1 : tunables    0    0    0 : slabdata     10     10      0
kmem_cache_node      256    256     64   64    1 : tunables    0    0    0 : slabdata      4      4      0
kmem_cache           160    160    256   32    2 : tunables    0    0    0 : slabdata      5      5      0


------------------------------------------------------------------------------------------------
Vmstat-Command-outputs.txt
iostat.txt
netstat--s-output.txt
netstat--a-output.txt
netstat--pt-output.txt
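A minimal Python sketch for ranking the caches in a slabinfo listing like the one above. It approximates each cache's size as `<num_objs>` times `<objsize>`, which ignores per-slab overhead, so the result is an estimate. The function name `slab_usage` is hypothetical; the sample rows are copied from the output above.

```python
def slab_usage(slabinfo_text: str):
    """Return [(name, approx_bytes)] sorted largest first, using num_objs * objsize."""
    rows = []
    for line in slabinfo_text.splitlines():
        if line.startswith(("slabinfo", "#")) or not line.strip():
            continue                              # version line, column header, blanks
        fields = line.split()
        name, num_objs, objsize = fields[0], int(fields[2]), int(fields[3])
        rows.append((name, num_objs * objsize))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Rows copied from the /proc/slabinfo output above.
SAMPLE = """\
slabinfo - version: 2.1
ext4_inode_cache  268218 269115    968   33    8 : tunables    0    0    0 : slabdata   8155   8155      0
dentry            136115931 136115931    192   21    1 : tunables    0    0    0 : slabdata 6481711 6481711      0
buffer_head       533833 533871    104   39    1 : tunables    0    0    0 : slabdata  13689  13689      0
kmalloc-64         71970  95872     64   64    1 : tunables    0    0    0 : slabdata   1498   1498      0
"""

for name, size in slab_usage(SAMPLE):
    print(f"{name:20s} {size / 2**20:12.1f} MiB")
```

With these rows the dentry cache dominates by two orders of magnitude. The estimate for dentry alone is larger than the Slab figure in the earlier /proc/meminfo paste, so the two outputs were presumably not captured at the same moment or on the same instance.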
sivaatluri (Author) Commented:
Here is the output of top -n1

[root@ip-10-0-53-10 ~]# top -n1
top - 06:25:44 up 7 days, 14:41,  1 user,  load average: 0.01, 0.06, 0.05
Tasks: 143 total,   1 running, 141 sleeping,   1 stopped,   0 zombie
Cpu(s):  0.9%us,  0.3%sy,  0.0%ni, 98.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  30828108k total, 30538552k used,   289556k free,   257320k buffers
Swap:        0k total,        0k used,        0k free,  3480532k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 9913 root      20   0 15248 1232  904 R  2.0  0.0   0:00.01 top
    1 root      20   0 19596 1604 1296 S  0.0  0.0   0:01.89 init
    2 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kthreadd
    3 root      20   0     0    0    0 S  0.0  0.0   0:41.72 ksoftirqd/0
    5 root       0 -20     0    0    0 S  0.0  0.0   0:00.00 kworker/0:0H
    6 root      20   0     0    0    0 S  0.0  0.0   0:00.01 kworker/u16:0
    7 root      20   0     0    0    0 S  0.0  0.0   2:08.96 rcu_sched
    8 root      20   0     0    0    0 S  0.0  0.0   0:00.00 rcu_bh
    9 root      RT   0     0    0    0 S  0.0  0.0   0:00.94 migration/0
   10 root      RT   0     0    0    0 S  0.0  0.0   0:00.56 migration/1
   11 root      20   0     0    0    0 S  0.0  0.0   0:03.01 ksoftirqd/1
   12 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kworker/1:0
   13 root       0 -20     0    0    0 S  0.0  0.0   0:00.00 kworker/1:0H
   14 root      RT   0     0    0    0 S  0.0  0.0   0:00.17 migration/2
   15 root      20   0     0    0    0 S  0.0  0.0   0:00.74 ksoftirqd/2
   17 root       0 -20     0    0    0 S  0.0  0.0   0:00.00 kworker/2:0H
   18 root      RT   0     0    0    0 S  0.0  0.0   0:00.12 migration/3
   19 root      20   0     0    0    0 S  0.0  0.0   0:00.38 ksoftirqd/3
   21 root       0 -20     0    0    0 S  0.0  0.0   0:00.00 kworker/3:0H
   22 root      RT   0     0    0    0 S  0.0  0.0   0:00.04 migration/4
   23 root      20   0     0    0    0 S  0.0  0.0   0:00.23 ksoftirqd/4
   25 root       0 -20     0    0    0 S  0.0  0.0   0:00.00 kworker/4:0H
   26 root      RT   0     0    0    0 S  0.0  0.0   0:00.05 migration/5
   27 root      20   0     0    0    0 S  0.0  0.0   0:00.18 ksoftirqd/5
   29 root       0 -20     0    0    0 S  0.0  0.0   0:00.00 kworker/5:0H
   30 root      RT   0     0    0    0 S  0.0  0.0   0:00.10 migration/6
   31 root      20   0     0    0    0 S  0.0  0.0   0:00.30 ksoftirqd/6
   33 root       0 -20     0    0    0 S  0.0  0.0   0:00.00 kworker/6:0H
   34 root      RT   0     0    0    0 S  0.0  0.0   0:00.08 migration/7
   35 root      20   0     0    0    0 S  0.0  0.0   0:00.21 ksoftirqd/7
   37 root       0 -20     0    0    0 S  0.0  0.0   0:00.00 kworker/7:0H
   38 root       0 -20     0    0    0 S  0.0  0.0   0:00.00 khelper
   39 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kdevtmpfs
   40 root       0 -20     0    0    0 S  0.0  0.0   0:00.00 netns
   41 root      20   0     0    0    0 S  0.0  0.0   0:21.36 kworker/u16:1
   47 root      20   0     0    0    0 S  0.0  0.0   0:00.00 xenwatch
   48 root      20   0     0    0    0 S  0.0  0.0   0:00.00 xenbus
arnold Commented:
Are you running the system in graphical mode?
Which runlevel?

Is this a VM under Xen, or are you running Xen yourself?
arnold Commented:
Run top and sort by virtual or resident memory to see what consumes it.

Look at the services you have running that you might not need.
sivaatluri (Author) Commented:
This is an Amazon EC2 instance with no graphical mode. Below is the top command output.

top - 06:58:38 up 52 min,  1 user,  load average: 0.14, 0.05, 0.05
Tasks: 116 total,   1 running, 115 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.4%us,  0.2%sy,  0.0%ni, 99.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  15407768k total,  2170528k used, 13237240k free,    28228k buffers
Swap:        0k total,        0k used,        0k free,   116692k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2021 apache    20   0  397m  17m 3932 S  1.0  0.1   0:00.07 httpd
 2152 apache    20   0  396m  16m 3856 S  1.0  0.1   0:00.03 httpd
 1457 root      20   0  391m  14m 7020 S  0.3  0.1   0:00.49 httpd
 2106 root      20   0 15252 1220  924 R  0.3  0.0   0:00.03 top
    1 root      20   0 19600 1600 1292 S  0.0  0.0   0:01.32 init
    2 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kthreadd
    3 root      20   0     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/0
    4 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kworker/0:0
    5 root       0 -20     0    0    0 S  0.0  0.0   0:00.00 kworker/0:0H
    6 root      20   0     0    0    0 S  0.0  0.0   0:00.02 kworker/u8:0
    7 root      20   0     0    0    0 S  0.0  0.0   0:00.06 rcu_sched
    8 root      20   0     0    0    0 S  0.0  0.0   0:00.00 rcu_bh
    9 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/0
   10 root      RT   0     0    0    0 S  0.0  0.0   0:00.02 migration/1
   11 root      20   0     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/1
   12 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kworker/1:0
   13 root       0 -20     0    0    0 S  0.0  0.0   0:00.00 kworker/1:0H
   14 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/2
   15 root      20   0     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/2
   16 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kworker/2:0
   17 root       0 -20     0    0    0 S  0.0  0.0   0:00.00 kworker/2:0H
   18 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/3
   19 root      20   0     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/3
   20 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kworker/3:0
   21 root       0 -20     0    0    0 S  0.0  0.0   0:00.00 kworker/3:0H
   22 root       0 -20     0    0    0 S  0.0  0.0   0:00.00 khelper
   23 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kdevtmpfs
   24 root       0 -20     0    0    0 S  0.0  0.0   0:00.00 netns
   25 root      20   0     0    0    0 S  0.0  0.0   0:00.02 kworker/u8:1
   31 root      20   0     0    0    0 S  0.0  0.0   0:00.00 xenwatch
   32 root      20   0     0    0    0 S  0.0  0.0   0:00.00 xenbus
  127 root       0 -20     0    0    0 S  0.0  0.0   0:00.00 writeback
  130 root      25   5     0    0    0 S  0.0  0.0   0:00.00 ksmd
  131 root       0 -20     0    0    0 S  0.0  0.0   0:00.00 kintegrityd
  132 root       0 -20     0    0    0 S  0.0  0.0   0:00.00 bioset
  133 root       0 -20     0    0    0 S  0.0  0.0   0:00.00 crypto
  135 root       0 -20     0    0    0 S  0.0  0.0   0:00.00 kblockd
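A minimal Python sketch for adding up the RES column of a top listing like the one above, to compare user-space memory with the "used" figure on the Mem line. It assumes the classic 12-column top layout shown here, with RES in KiB unless suffixed with m or g. The function names are hypothetical, and only a few rows from the listing above are included in the sample.

```python
def res_to_kib(res: str) -> float:
    """Convert a top RES value such as '1232', '17m' or '1.2g' to KiB."""
    units = {"k": 1, "m": 1024, "g": 1024 * 1024}
    suffix = res[-1].lower()
    if suffix in units:
        return float(res[:-1]) * units[suffix]
    return float(res)                             # plain numbers are already KiB


def total_res_kib(top_output: str) -> float:
    """Sum RES over every process row (rows start with a numeric PID)."""
    total = 0.0
    for line in top_output.splitlines():
        fields = line.split()
        if len(fields) >= 12 and fields[0].isdigit():
            total += res_to_kib(fields[5])
    return total


# Rows copied from the top output above (listing shortened).
SAMPLE = """\
Mem:  15407768k total,  2170528k used, 13237240k free,    28228k buffers
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2021 apache    20   0  397m  17m 3932 S  1.0  0.1   0:00.07 httpd
 2152 apache    20   0  396m  16m 3856 S  1.0  0.1   0:00.03 httpd
 1457 root      20   0  391m  14m 7020 S  0.3  0.1   0:00.49 httpd
 2106 root      20   0 15252 1220  924 R  0.3  0.0   0:00.03 top
    1 root      20   0 19600 1600 1292 S  0.0  0.0   0:01.32 init
    2 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kthreadd
"""

print(f"sum of RES: {total_res_kib(SAMPLE) / 1024:.1f} MiB")
```

Summing RES counts shared pages more than once, so the total is an upper bound on what these processes hold. When even that upper bound is far below the "used" figure, the remainder is held by the kernel (page cache, slab) rather than by any process top can list.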
gheist Commented:
Try this:
sync; sync; sysctl vm.drop_caches=3 vm.drop_caches=0

Then check in slabinfo whether the numbers in this line decrease:
dentry            136115931 136115931    192   21    1 : tunables    0    0    0 : slabdata 6481711 6481711      0

If they decrease, there is no kernel memory leak.
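A minimal Python sketch of that before/after comparison. The "before" line is the dentry row quoted above; the "after" line is invented purely for illustration, since the thread contains no post-drop output. The function name `slab_objects` is hypothetical.

```python
def slab_objects(slabinfo_text: str, cache: str = "dentry") -> int:
    """Return the <num_objs> column for one cache in /proc/slabinfo text."""
    for line in slabinfo_text.splitlines():
        fields = line.split()
        if fields and fields[0] == cache:
            return int(fields[2])
    raise KeyError(cache)


BEFORE = "dentry            136115931 136115931    192   21    1 : tunables    0    0    0 : slabdata 6481711 6481711      0"
# Hypothetical "after" row, invented for illustration only.
AFTER = "dentry                52311     60417    192   21    1 : tunables    0    0    0 : slabdata   2877   2877      0"

before, after = slab_objects(BEFORE), slab_objects(AFTER)
verdict = "reclaimed" if after < before else "not reclaimed"
print(f"dentry objects: {before} -> {after} ({verdict})")
```

On a live system the two snapshots would come from reading /proc/slabinfo as root before and after dropping the caches.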

Does the Amazon instance have dmesg? Any invocations of the "OOM killer"?
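A minimal Python sketch of what looking for OOM killer invocations in dmesg output can mean in practice. It assumes the kernel logs lines containing "invoked oom-killer" and "Out of memory:" when the OOM killer runs; the two OOM lines in the sample are hypothetical examples of that format, not messages from this server.

```python
import re

# Phrases the kernel is assumed to log when the OOM killer runs.
OOM_PATTERN = re.compile(r"invoked oom-killer|Out of memory:", re.IGNORECASE)


def oom_events(dmesg_text: str):
    """Return the dmesg lines that mention the OOM killer."""
    return [line for line in dmesg_text.splitlines() if OOM_PATTERN.search(line)]


# Hypothetical sample, invented for illustration only.
SAMPLE = """\
[    0.153902] VFS: Disk quotas dquot_6.5.2
[81234.112233] httpd invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
[81234.118877] Out of memory: Kill process 2021 (httpd) score 12 or sacrifice child
[81240.000001] msgmni has been set to 30084
"""

events = oom_events(SAMPLE)
print(f"{len(events)} OOM-related line(s)")
for line in events:
    print(line)
```

No matching lines means the kernel never ran out of memory badly enough to kill a process, which fits a situation where the memory is held in reclaimable caches.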
sivaatluri (Author) Commented:
gheist,

Could you please let me know where (in which config files) to make these changes? I'm not sure about the suggested changes.
sivaatluri (Author) Commented:
Does the amazon have dmesg? Do you want the dmesg command output?

Any invocations of "OOM killer"? Could you please explain what this refers to?
sivaatluri (Author) Commented:
Below is the dmesg output. The server was restarted just before running this command, so memory usage is normal now.

[root@ip-10-0-53-20 ~]# dmesg
[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Initializing cgroup subsys cpuacct
[    0.000000] Linux version 3.14.44-32.39.amzn1.x86_64 (mockbuild@gobi-build-64011) (gcc version 4.8.2 20140120 (Red Hat 4.8.2-16) (GCC) ) #1 SMP Thu Jun 11 20:33:38 UTC 2015
[    0.000000] Command line: root=LABEL=/ console=hvc0 LANG=en_US.UTF-8 KEYTABLE=us
[    0.000000] ACPI in unprivileged domain disabled
[    0.000000] e820: BIOS-provided physical RAM map:
[    0.000000] Xen: [mem 0x0000000000000000-0x000000000009ffff] usable
[    0.000000] Xen: [mem 0x00000000000a0000-0x00000000000fffff] reserved
[    0.000000] Xen: [mem 0x0000000000100000-0x00000003c07fffff] usable
[    0.000000] NX (Execute Disable) protection: active
[    0.000000] DMI not present or invalid.
[    0.000000] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
[    0.000000] e820: remove [mem 0x000a0000-0x000fffff] usable
[    0.000000] No AGP bridge found
[    0.000000] e820: last_pfn = 0x3c0800 max_arch_pfn = 0x400000000
[    0.000000] e820: last_pfn = 0x100000 max_arch_pfn = 0x400000000
[    0.000000] Base memory trampoline at [ffff88000009a000] 9a000 size 24576
[    0.000000] init_memory_mapping: [mem 0x00000000-0x000fffff]
[    0.000000]  [mem 0x00000000-0x000fffff] page 4k
[    0.000000] init_memory_mapping: [mem 0x3bfe00000-0x3bfffffff]
[    0.000000]  [mem 0x3bfe00000-0x3bfffffff] page 4k
[    0.000000] BRK [0x01dbc000, 0x01dbcfff] PGTABLE
[    0.000000] BRK [0x01dbd000, 0x01dbdfff] PGTABLE
[    0.000000] init_memory_mapping: [mem 0x3bc000000-0x3bfdfffff]
[    0.000000]  [mem 0x3bc000000-0x3bfdfffff] page 4k
[    0.000000] BRK [0x01dbe000, 0x01dbefff] PGTABLE
[    0.000000] BRK [0x01dbf000, 0x01dbffff] PGTABLE
[    0.000000] BRK [0x01dc0000, 0x01dc0fff] PGTABLE
[    0.000000] BRK [0x01dc1000, 0x01dc1fff] PGTABLE
[    0.000000] init_memory_mapping: [mem 0x380000000-0x3bbffffff]
[    0.000000]  [mem 0x380000000-0x3bbffffff] page 4k
[    0.000000] init_memory_mapping: [mem 0x00100000-0x37fffffff]
[    0.000000]  [mem 0x00100000-0x37fffffff] page 4k
[    0.000000] init_memory_mapping: [mem 0x3c0000000-0x3c07fffff]
[    0.000000]  [mem 0x3c0000000-0x3c07fffff] page 4k
[    0.000000] RAMDISK: [mem 0x021b6000-0x04815fff]
[    0.000000] NUMA turned off
[    0.000000] Faking a node at [mem 0x0000000000000000-0x00000003c07fffff]
[    0.000000] Initmem setup node 0 [mem 0x00000000-0x3c07fffff]
[    0.000000]   NODE_DATA [mem 0x3be200000-0x3be226fff]
[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x00001000-0x00ffffff]
[    0.000000]   DMA32    [mem 0x01000000-0xffffffff]
[    0.000000]   Normal   [mem 0x100000000-0x3c07fffff]
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x00001000-0x0009ffff]
[    0.000000]   node   0: [mem 0x00100000-0x3c07fffff]
[    0.000000] On node 0 totalpages: 3934111
[    0.000000]   DMA zone: 64 pages used for memmap
[    0.000000]   DMA zone: 21 pages reserved
[    0.000000]   DMA zone: 3999 pages, LIFO batch:0
[    0.000000]   DMA32 zone: 16320 pages used for memmap
[    0.000000]   DMA32 zone: 1044480 pages, LIFO batch:31
[    0.000000]   Normal zone: 45088 pages used for memmap
[    0.000000]   Normal zone: 2885632 pages, LIFO batch:31
[    0.000000] smpboot: Allowing 4 CPUs, 0 hotplug CPUs
[    0.000000] nr_irqs_gsi: 16
[    0.000000] e820: cannot find a gap in the 32bit address range
e820: PCI devices with unassigned 32bit BARs may break!
[    0.000000] e820: [mem 0x3c0900000-0x3c0cfffff] available for PCI devices
[    0.000000] Booting paravirtualized kernel on Xen
[    0.000000] Xen version: 4.2.amazon (preserve-AD)
[    0.000000] setup_percpu: NR_CPUS:64 nr_cpumask_bits:64 nr_cpu_ids:4 nr_node_ids:1
[    0.000000] PERCPU: Embedded 27 pages/cpu @ffff8803bde00000 s79104 r8192 d23296 u524288
[    0.000000] pcpu-alloc: s79104 r8192 d23296 u524288 alloc=1*2097152
[    0.000000] pcpu-alloc: [0] 0 1 2 3
[    0.000000] xen: PV spinlocks enabled
[    0.000000] Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 3872618
[    0.000000] Policy zone: Normal
[    0.000000] Kernel command line: root=LABEL=/ console=hvc0 LANG=en_US.UTF-8 KEYTABLE=us
[    0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes)
[    0.000000] Checking aperture...
[    0.000000] No AGP bridge found
[    0.000000] Memory: 15364040K/15736444K available (4716K kernel code, 843K rwdata, 2168K rodata, 1088K init, 1704K bss, 372404K reserved)
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
[    0.000000] Hierarchical RCU implementation.
[    0.000000]  Additional per-CPU info printed with stalls.
[    0.000000]  RCU restricting CPUs from NR_CPUS=64 to nr_cpu_ids=4.
[    0.000000] RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=4
[    0.000000] NR_IRQS:4352 nr_irqs:304 16
[    0.000000] xen:events: Using 2-level ABI
[    0.000000] Console: colour dummy device 80x25
[    0.000000] console [tty0] enabled
[    0.000000] console [hvc0] enabled
[    0.000000] allocated 63438848 bytes of page_cgroup
[    0.000000] please try 'cgroup_disable=memory' option if you don't want memory cgroups
[    0.000000] Xen: using vcpuop timer interface
[    0.000000] installing Xen timer for CPU 0
[    0.000000] tsc: Detected 2500.046 MHz processor
[    0.004000] Calibrating delay loop (skipped), value calculated using timer frequency.. 5000.09 BogoMIPS (lpj=10000184)
[    0.004000] pid_max: default: 32768 minimum: 301
[    0.004000] Security Framework initialized
[    0.004000] Dentry cache hash table entries: 2097152 (order: 12, 16777216 bytes)
[    0.006223] Inode-cache hash table entries: 1048576 (order: 11, 8388608 bytes)
[    0.007767] Mount-cache hash table entries: 32768 (order: 6, 262144 bytes)
[    0.007806] Mountpoint-cache hash table entries: 32768 (order: 6, 262144 bytes)
[    0.008043] Initializing cgroup subsys memory
[    0.008053] Initializing cgroup subsys devices
[    0.008056] Initializing cgroup subsys freezer
[    0.008060] Initializing cgroup subsys net_cls
[    0.008063] Initializing cgroup subsys blkio
[    0.008067] Initializing cgroup subsys perf_event
[    0.008071] Initializing cgroup subsys hugetlb
[    0.008126] ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
ENERGY_PERF_BIAS: View and update with x86_energy_perf_policy(8)
[    0.008135] CPU: Physical Processor ID: 1
[    0.008137] CPU: Processor Core ID: 1
[    0.008889] Last level iTLB entries: 4KB 512, 2MB 8, 4MB 8
Last level dTLB entries: 4KB 512, 2MB 0, 4MB 0, 1GB 4
tlb_flushall_shift: 6
[    0.034708] ftrace: allocating 18781 entries in 74 pages
[    0.040090] cpu 0 spinlock event irq 17
[    0.047871] Performance Events: unsupported p6 CPU model 62 no PMU driver, software events only.
[    0.048666] installing Xen timer for CPU 1
[    0.048680] cpu 1 spinlock event irq 24
[    0.048730] SMP alternatives: switching to SMP code
[    0.072265] installing Xen timer for CPU 2
[    0.072279] cpu 2 spinlock event irq 31
[    0.073250] installing Xen timer for CPU 3
[    0.073262] cpu 3 spinlock event irq 38
[    0.074123] x86: Booted up 1 node, 4 CPUs
[    0.074189] devtmpfs: initialized
[    0.077566] NET: Registered protocol family 16
[    0.077566] xen:grant_table: Grant tables using version 1 layout
[    0.077566] Grant table initialized
[    0.077566] PCI: setting up Xen PCI frontend stub
[    0.077566] PCI: pci_cache_line_size set to 64 bytes
[    0.084036] bio: create slab <bio-0> at 0
[    0.084036] ACPI: Interpreter disabled.
[    0.084036] xen:balloon: Initialising balloon driver
[    0.084072] vgaarb: loaded
[    0.084072] PCI: System does not support PCI
[    0.084072] PCI: System does not support PCI
[    0.084072] NetLabel: Initializing
[    0.084072] NetLabel:  domain hash size = 128
[    0.084072] NetLabel:  protocols = UNLABELED CIPSOv4
[    0.084072] NetLabel:  unlabeled traffic allowed by default
[    0.084072] Switched to clocksource xen
[    0.089458] pnp: PnP ACPI: disabled
[    0.094982] NET: Registered protocol family 2
[    0.095260] TCP established hash table entries: 131072 (order: 8, 1048576 bytes)
[    0.095545] TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
[    0.095675] TCP: Hash tables configured (established 131072 bind 65536)
[    0.095709] TCP: reno registered
[    0.095747] UDP hash table entries: 8192 (order: 6, 262144 bytes)
[    0.095816] UDP-Lite hash table entries: 8192 (order: 6, 262144 bytes)
[    0.095902] NET: Registered protocol family 1
[    0.095910] PCI: CLS 0 bytes, default 64
[    0.095945] Unpacking initramfs...
[    0.132647] Freeing initrd memory: 39296K (ffff8800021b6000 - ffff880004816000)
[    0.132798] platform rtc_cmos: registered platform RTC device (no PNP device found)
[    0.134301] futex hash table entries: 1024 (order: 4, 65536 bytes)
[    0.134343] audit: initializing netlink subsys (disabled)
[    0.134363] audit: type=2000 audit(1436940397.902:1): initialized
[    0.149775] bounce pool size: 64 pages
[    0.149801] HugeTLB registered 2 MB page size, pre-allocated 0 pages
[    0.153761] zbud: loaded
[    0.153902] VFS: Disk quotas dquot_6.5.2
[    0.153988] Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[    0.179732] msgmni has been set to 30084
[    0.180345] alg: No test for stdrng (krng)
[    0.180359] Key type asymmetric registered
[    0.180500] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 254)
[    0.180555] io scheduler noop registered (default)
[    0.180630] pci_hotplug: PCI Hot Plug PCI Core version: 0.5
[    0.180691] intel_idle: does not run on family 6 model 62
[    0.181639] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[    0.190956] xen_netfront: Initialising Xen virtual ethernet driver
[    0.191652] blkfront: xvda1: barrier or flush: disabled; persistent grants: disabled; indirect descriptors: enabled;
[    0.193428] i8042: PNP: No PS/2 controller found. Probing ports directly.
[    0.206442] blkfront: xvdb: flush diskcache: enabled; persistent grants: disabled; indirect descriptors: enabled;
[    0.207417]  xvdb: unknown partition table
[    1.194992] mousedev: PS/2 mouse device common for all mice
[    1.195211] hidraw: raw HID events driver (C) Jiri Kosina
[    1.195360] TCP: cubic registered
[    1.195366] NET: Registered protocol family 17
[    1.195642] registered taskstats version 1
[    1.196325] Freeing unused kernel memory: 1088K (ffffffff81ad4000 - ffffffff81be4000)
[    1.196332] Write protecting the kernel read-only data: 10240k
[    1.199390] Freeing unused kernel memory: 1416K (ffff88000149e000 - ffff880001600000)
[    1.200330] Freeing unused kernel memory: 1928K (ffff88000181e000 - ffff880001a00000)
[    1.235460] device-mapper: uevent: version 1.0.3
[    1.235628] device-mapper: ioctl: 4.27.0-ioctl (2013-10-30) initialised: dm-devel@redhat.com
[    1.244073] udevd[434]: starting version 173
[    1.262489] SSE version of gcm_enc/dec engaged.
[    1.267855] alg: No test for __gcm-aes-aesni (__driver-gcm-aes-aesni)
[    1.443387] EXT4-fs (xvda1): mounted filesystem with ordered data mode. Opts: (null)
[    1.484789] dracut: Remounting /dev/disk/by-label/\x2f with -o noatime,ro
[    1.495653] EXT4-fs (xvda1): mounted filesystem with ordered data mode. Opts: (null)
[    1.497819] dracut: Mounted root filesystem /dev/xvda1
[    1.612645] dracut: Loading SELinux policy
[    1.662837] random: nonblocking pool is initialized
[    1.902100] dracut: /sbin/load_policy: Can't load policy: No such device
[    1.988529] dracut: Switching root
[    3.798237] udevd[823]: starting version 173
[    4.131064] microcode: CPU0 sig=0x306e4, pf=0x1, revision=0x416
[    4.131897] microcode: CPU1 sig=0x306e4, pf=0x1, revision=0x416
[    4.131924] microcode: CPU2 sig=0x306e4, pf=0x1, revision=0x416
[    4.131951] microcode: CPU3 sig=0x306e4, pf=0x1, revision=0x416
[    4.132026] microcode: Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
[    4.166158] alg: No test for crc32 (crc32-pclmul)
[    4.721847] EXT4-fs (xvda1): re-mounted. Opts: (null)
[    4.758191] EXT4-fs (xvdb): mounted filesystem with ordered data mode. Opts: (null)
[    5.310169] NET: Registered protocol family 10
[    8.303617] audit: type=1305 audit(1436940406.070:2): audit_pid=1247 old=0 auid=4294967295 ses=4294967295 res=1
[  125.768093] device eth0 entered promiscuous mode
sivaatluriAuthor Commented:
[root@ip-10-0-53-20 ~]# dmesg | grep -i memory
[    0.000000] Base memory trampoline at [ffff88000009a000] 9a000 size 24576
[    0.000000] init_memory_mapping: [mem 0x00000000-0x000fffff]
[    0.000000] init_memory_mapping: [mem 0x3bfe00000-0x3bfffffff]
[    0.000000] init_memory_mapping: [mem 0x3bc000000-0x3bfdfffff]
[    0.000000] init_memory_mapping: [mem 0x380000000-0x3bbffffff]
[    0.000000] init_memory_mapping: [mem 0x00100000-0x37fffffff]
[    0.000000] init_memory_mapping: [mem 0x3c0000000-0x3c07fffff]
[    0.000000] Early memory node ranges
[    0.000000] Memory: 15364040K/15736444K available (4716K kernel code, 843K rwdata, 2168K rodata, 1088K init, 1704K bss, 372404K reserved)
[    0.000000] please try 'cgroup_disable=memory' option if you don't want memory cgroups
[    0.008043] Initializing cgroup subsys memory
[    0.132647] Freeing initrd memory: 39296K (ffff8800021b6000 - ffff880004816000)
[    1.196325] Freeing unused kernel memory: 1088K (ffffffff81ad4000 - ffffffff81be4000)
[    1.199390] Freeing unused kernel memory: 1416K (ffff88000149e000 - ffff880001600000)
[    1.200330] Freeing unused kernel memory: 1928K (ffff88000181e000 - ffff880001a00000)
gheistCommented:
Command line as root?

# sync ; sync ; sysctl ....

Something is WRONG - do you know why the network adapter is in promiscuous mode?
sivaatluriAuthor Commented:
Snort IDS is installed on it. We have two machines: one with Snort and another without it.

We are facing a similar memory usage issue on the second machine, the one without Snort.

Gheist, sorry, this time too I didn't understand the sync ; sync ; sysctl command.
gheistCommented:
Type commands in the command line as root to clean up disk cache in RAM.
sivaatluriAuthor Commented:
Gheist,

Below is the output; there seems to be some issue.

[root@ip-10-0-53-10 ~]#  sync; sync; sysctl vm.drop_caches=3 vm.drop_caches=0
vm.drop_caches = 3
error: "Invalid argument" setting key "vm.drop_caches"

Could you please let me know what this command is for and what it will do?
gheistCommented:
Now check the slabinfo section to see whether kernel buffers were freed.

i.e.
# grep ^dentry /proc/slabinfo
# sync ; sync ; sysctl vm.drop_caches=3
# grep ^dentry /proc/slabinfo
# free
??? Should look much better.
drop_caches=0 was only accepted on older kernels; newer kernels (like the 3.14 you have) reject it with "Invalid argument", which is the error you saw. Either way, writing to drop_caches just releases memory once; it never disables the cache.
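
To put a number on the before/after comparison, here is a minimal sketch that turns a /proc/slabinfo line into bytes held by live objects. The function name slab_bytes is made up for illustration; it assumes the usual slabinfo column order (name, active_objs, num_objs, objsize).

```shell
# slab_bytes NAME: read /proc/slabinfo-format text on stdin and print
# active_objs * objsize (bytes held by live objects) for the cache NAME.
# Hypothetical helper, not part of any package.
slab_bytes() {
    awk -v name="$1" '$1 == name { printf "%.0f\n", $2 * $4 }'
}
```

As root, run `slab_bytes dentry < /proc/slabinfo` before and after dropping caches and compare the two numbers.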
sivaatluriAuthor Commented:
[root@ip-10-0-53-10 ~]# grep ^dentry /proc/slabinfo
dentry             88116  88116    192   21    1 : tunables    0    0    0 : slabdata   4196   4196      0

[root@ip-10-0-53-10 ~]# sync ; sync ; sysctl vm.drop_caches=3
vm.drop_caches = 3

[root@ip-10-0-53-10 ~]# grep ^dentry /proc/slabinfo
dentry              8628  12327    192   21    1 : tunables    0    0    0 : slabdata    587    587      0

[root@ip-10-0-53-10 ~]# free -m
             total       used       free     shared    buffers     cached
Mem:         30105        563      29542          0          5         21
-/+ buffers/cache:        536      29569
Swap:            0          0          0

[root@ip-10-0-53-10 ~]# uname -r
3.14.42-31.38.amzn1.x86_64
[root@ip-10-0-53-10 ~]#

It seems my machine is already on a new kernel.
sivaatluriAuthor Commented:
[root@ip-10-0-53-10 ~]# sync ; sync ; sysctl vm.drop_caches=3

Does this command have impact on any of the services?
gheistCommented:
You have 30GB FREE.
Wherever you detected "high memory usage" was plain wrong.

- Disk cache will be freed to make space for new allocations. Otherwise unused memory is used to hold disk blocks to reduce disk I/O; for all practical purposes it should be considered "free".
sivaatluriAuthor Commented:
OK Gheist,

I will check again by running the command once I see high memory usage.
gheistCommented:
In the output below, the buffers and cached columns are the caches, and the free value on the -/+ buffers/cache line is the memory available for applications.
You can do
$ tar cf /dev/null /usr
(i.e read all /usr) to fill caches wildly

How do you detect high memory usage? I think we need to fix the probe, not the system.

Mem:         30105        563      29542          0          5         21
-/+ buffers/cache:        536      29569
Swap:            0          0          0
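
The same split can be computed straight from /proc/meminfo. This is a rough sketch only: mem_summary is a made-up name, and it counts just Buffers and Cached as cache, which roughly mirrors the -/+ buffers/cache line of the older free output.

```shell
# mem_summary: read /proc/meminfo-format text on stdin and print, in MB,
# memory used by applications and memory held as reclaimable cache.
# Hypothetical helper; values in meminfo are in kB.
mem_summary() {
    awk '
        $1 == "MemTotal:" { total   = $2 }
        $1 == "MemFree:"  { free    = $2 }
        $1 == "Buffers:"  { buffers = $2 }
        $1 == "Cached:"   { cached  = $2 }
        END {
            cache = buffers + cached
            printf "apps_used_mb=%d cache_mb=%d\n", (total - free - cache) / 1024, cache / 1024
        }'
}
```

Run it as `mem_summary < /proc/meminfo`. If apps_used_mb stays small while cache_mb grows over days, the system is just caching disk blocks.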
sivaatluriAuthor Commented:
Normally, when I run free -m (take the attached image as an example), I see free (2) at around 1000.


understanding-free-command.png
gheistCommented:
Try not to switch systems behind the scenes; focus on the system with 32GB RAM that uses the same 2GB as this system.

In neither case is there a problem.
sivaatluriAuthor Commented:
I didn't really get what you were saying. Please let me know what you would like me to do to find the issue.
arnoldCommented:
Run top
Use o to choose the order option, then hit Q, S or N until it is at the top (use Shift to move it up).

You need to identify the process that consumes your memory.
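
If the interactive sort in top is awkward, a non-interactive alternative is to total resident memory per command name from ps output. A minimal sketch; rss_by_command is a made-up name, and summing RSS overstates real usage because pages shared between httpd children are counted once per child.

```shell
# rss_by_command: read "RSS_KB COMMAND" lines on stdin (the format printed
# by `ps -eo rss=,comm=`) and print total resident memory per command name
# in MB, largest first. Hypothetical helper; shared pages are counted once
# per process, so the totals are an upper bound.
rss_by_command() {
    awk '{ kb[$2] += $1 }
         END { for (c in kb) printf "%d %s\n", kb[c] / 1024, c }' |
        sort -rn
}
```

Usage: `ps -eo rss=,comm= | rss_by_command | head`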
sivaatluriAuthor Commented:
Arnold, if I stop the httpd service after a restart, it shows almost no usage.

The issue is only seen after running httpd continuously for 4-5 days.
gheistCommented:
One time free is run on a system with 32GB of RAM, another time on a 10GB system, then on a 16GB one.
It would be easier to cross-link outputs if they came from the same system all the time.
arnoldCommented:
Without identifying what you have running, it is hard to say.  If Apache is not running, look at what you do have running; PHP processes are not terminated when Apache terminates.  MySQL? Check services and make sure only what you need is running on the system.  Look at using lsof to identify the resource consumer.
gheistCommented:
14MB apache process is "normal", you can make it a bit under 10MB by eagerly ripping out standard modules.
for 38 connections I would not care, but getting 100+ would mandate investing time in better MPM model, or NGINX.
sivaatluriAuthor Commented:
Arnold,

Below is what I got:

Current Fields:  AEHIOTQWKMNbcdfgjplrusvyzX  for window 1:Def
Upper case letter moves field left, lower case right
* A: PID        = Process Id
* E: USER       = User Name
* H: PR         = Priority
* I: NI         = Nice value
* O: VIRT       = Virtual Image (kb)
* T: SHR        = Shared Mem size (kb)
* Q: RES        = Resident size (kb)
* W: S          = Process Status
* K: %CPU       = CPU usage
* M: TIME+      = CPU Time, hundredths
* N: %MEM       = Memory usage (RES)
  b: PPID       = Parent Process Pid
  c: RUSER      = Real user name
  d: UID        = User Id
  f: GROUP      = Group Name
  g: TTY        = Controlling Tty
  j: P          = Last used cpu (SMP)
  p: SWAP       = Swapped size (kb)
  l: TIME       = CPU Time
  r: CODE       = Code size (kb)
  u: nFLT       = Page Fault count
  s: DATA       = Data+Stack size (kb)
  v: nDRT       = Dirty Pages count
  y: WCHAN      = Sleeping in Function
  z: Flags      = Task Flags <sched.h>
* X: COMMAND    = Command name/line
Flags field:
  0x00000001  PF_ALIGNWARN
  0x00000002  PF_STARTING
  0x00000004  PF_EXITING
  0x00000040  PF_FORKNOEXEC
  0x00000100  PF_SUPERPRIV
  0x00000200  PF_DUMPCORE
  0x00000400  PF_SIGNALED
  0x00000800  PF_MEMALLOC
  0x00002000  PF_FREE_PAGES (2.5)
  0x00008000  debug flag (2.5)
  0x00024000  special threads (2.5)
  0x001D0000  special states (2.5)
  0x00100000  PF_USEDFPU (thru 2.4)
gheistCommented:
Can we get on the probe/alarm that you use for high memory consumption?
sivaatluriAuthor Commented:
14MB apache process is "normal", you can make it a bit under 10MB by eagerly ripping out standard modules.
for 38 connections I would not care, but getting 100+ would mandate investing time in better MPM model, or NGINX.

--- Gheist, I agree. If Apache is eating the memory, then it should be released after stopping the service, which is not happening.


As of now I have a 30 GB machine and a 16 GB machine running, and I see the same issue on both systems. The 16 GB one runs out faster (4-5 days); the 32 GB one runs out after 10 days :-)
sivaatluriAuthor Commented:
Can we get on the probe/alarm that you use for high memory consumption?


I didn't understand what this means. I use free -m to check memory usage; my Apache server is serving PHP content (Zend Framework).
gheistCommented:
None of your exhibits show extreme memory usage. Can you elaborate on HOW you measured high memory consumption? SNMP? Zabbix? Nagios?
arnoldCommented:
The output from top you posted

Mem:  30828108k total, 30538552k used,   289556k free,   257320k buffers
Does this system also function as a host for Virtual Machines?

Move Q, N to the top. This will tell you who consumes the memory.

Snort/iptables likely consume this excess memory.
Unfortunately, with these, dropping iptables/disabling snort is not ......

Could you layout your entire setup?
We keep working on the original question but additional information keeps trickling in.

As gheist and others pointed out, the web server (apache) does not seem to be the memory hog of the system.
Identify which services are running on the system you are having an issue with, then narrow it down: should each service be running at all, is the offending service configured in a way that lets it consume too much memory, or is your split of services and system resources wrong for what you want to get from the system?
sivaatluriAuthor Commented:
Arnold/Gheist,

Thanks for helping me try to solve the issue.

Please give me 3-4 days. I will post the outputs once memory usage is high, which can help us track it down.
gheistCommented:
You still have not told us how you got the idea that the system has "high memory usage". Is it some monitoring tool?

So far you have shown greatly oversized systems with <2GB RAM used and 8-30GB used for disk cache.
sivaatluriAuthor Commented:
I use the command free -m, and we have Icinga in place.
gheistCommented:
My laptop shows this (a newer version of procps than Amazon's):

$ free -m
              total        used        free      shared  buff/cache   available
Mem:           7123        2048         438          18        4636        4963
Swap:          8191           0        8191

So what is your opinion - how well is it doing? Can I use it for another week, or do I need to reboot it nightly?


Can you show the Nagios probe that rings alarms? I suspect it wants 10% RAM free, when the free memory the kernel actually keeps in reserve is configurable and ranges from about 16MB on a 1GB system to 128MB on a 1TB system.
sivaatluriAuthor Commented:
I have added the top and free -m command outputs.

Could you please check and let me know what else you need?
sivaatluriAuthor Commented:
arnoldCommented:
There is nothing in top indicating where this memory is consumed. Unfortunately, the sorting is not arranging the data in a useful way.

Your option is likely to use lsof
You have something running that reserves/consumes the memory.  Look at which services you have running and make sure all the services that are running are the services you need.

Iptables/firewall on the system could potentially explain this, but do not drop the firewall.
arnoldCommented:
You do not have tmpfs space used for swap, nor do you seem to have swap space on drives.
arnoldCommented:
fdisk -l
parted -l
Disk layouts to see if any have a swap partition
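
Besides the partition tables, the kernel lists any active swap areas in /proc/swaps. A small sketch that totals them; swap_total_kb is a made-up name, and it assumes the usual layout of one header line followed by one line per swap area with the size in the third column.

```shell
# swap_total_kb: read /proc/swaps-format text on stdin and print the total
# size of all active swap areas in kB (0 when only the header is present).
# Hypothetical helper.
swap_total_kb() {
    awk 'NR > 1 { sum += $3 } END { printf "%.0f\n", sum + 0 }'
}
```

Usage: `swap_total_kb < /proc/swaps`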
sivaatluriAuthor Commented:
Arnold/Ghiest,

Running the command below frees all the memory: if 15 GB is used, after running it usage drops back to around 200 MB. Are there any negative effects of running it, such as closing current connections?



 free && sync && echo 3 > /proc/sys/vm/drop_caches && free
arnoldCommented:
Your system does not have any swap space dedicated. The information you provide does not point to anything in particular as the item(s) consuming your memory.

It's like saying that I have a box of 100 by 100 by 100 and I can not fit a 20 by 20 by 20 box into it.
All I post is the picture of the 20 by 20 by 20 box from different angles and different sides.

Something in the larger box consumes space that the box I am trying to fit is just too large.

Without identifying what is running on your system first, then identifying how much memory that consumes and whether it needs to run. I have no additional useful suggestions to make.

You have something that consumes the memory, e.g. do you have a RAM disk set up, etc.
The OS is loaded into memory in its entirety.......

When you setup this system what process/procedure did you use, are those instructions accessible/viewable on the net?
gheistCommented:
Your probe accounts free memory wrong. Ditto. No need to flush caches to work around faulty probe.
sivaatluriAuthor Commented:
How do I correct the probe, gheist?
gheistCommented:
The initial question is sufficiently answered by finding that your system in fact has moderate memory use.
You can ask in other areas how to fix monitoring scripts that disagree (best is to report the issue to the person who made the monitoring script).
sivaatluriAuthor Commented:
Gheist,

I'm facing the issue with the free -m command too, not only with the monitoring script.
gheistCommented:
It is normal that disk cache takes all free memory. On Amazon you only need to provision the memory shown as "used" on the 2nd line of your "free" output, and can assume huge caches are in place outside your virtual machine. Your probe and your reading of the "free" output wrongly count disk cache as USED, when actually it will be freed if any program needs it.
If your probe wants 10% of 100GB free, you can set sysctl vm.min_free_kbytes to 11GB and make sure the probe is happy, at the price of never touching 10GB of RAM.
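
A less wasteful way to keep the probe happy is to alert on available memory instead of free memory. Below is a minimal sketch of such a check; it is not the Icinga/Nagios plugin you have installed, and the name check_mem and the thresholds are made up. It reads MemAvailable, which kernels from 3.14 on report in /proc/meminfo, and falls back to MemFree + Buffers + Cached as a rough stand-in.

```shell
# check_mem WARN_PCT CRIT_PCT: read /proc/meminfo-format text on stdin and
# alert on *available* memory as a percent of MemTotal. Uses MemAvailable
# when present, otherwise MemFree + Buffers + Cached as an approximation.
# Exit status follows the Nagios plugin convention:
# 0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN. Hypothetical helper.
check_mem() {
    awk -v warn="$1" -v crit="$2" '
        $1 == "MemTotal:"     { total = $2 }
        $1 == "MemAvailable:" { avail = $2; have_avail = 1 }
        $1 == "MemFree:"      { free = $2 }
        $1 == "Buffers:"      { buffers = $2 }
        $1 == "Cached:"       { cached = $2 }
        END {
            if (total <= 0) { print "UNKNOWN - no MemTotal"; exit 3 }
            if (!have_avail) avail = free + buffers + cached
            pct = 100 * avail / total
            status = "OK"; code = 0
            if (pct < crit)      { status = "CRITICAL"; code = 2 }
            else if (pct < warn) { status = "WARNING";  code = 1 }
            printf "%s - %d%% of memory available\n", status, pct
            exit code
        }'
}
```

Usage: `check_mem 20 10 < /proc/meminfo` warns below 20% available and goes critical below 10%.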

sivaatluriAuthor Commented:
thank you
gheistCommented:
https://www.kernel.org/doc/Documentation/sysctl/vm.txt
Pay attention to
swappiness
overcommit_memory
*pressure*
(or read it through to get better picture)
Don't change anything unless you can measure the effect.
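
To be able to measure the effect, it helps to record the current values first. A minimal sketch; vm_snapshot is a made-up name, and vfs_cache_pressure and min_free_kbytes are my picks for the "*pressure*" entry and the reserve discussed earlier.

```shell
# vm_snapshot [DIR]: print "vm.name=value" for a few VM tunables.
# DIR defaults to /proc/sys/vm; pointing it at another directory is only
# useful for dry runs. Hypothetical helper; the key list is an assumption.
vm_snapshot() {
    dir="${1:-/proc/sys/vm}"
    for key in swappiness overcommit_memory vfs_cache_pressure min_free_kbytes; do
        if [ -r "$dir/$key" ]; then
            printf 'vm.%s=%s\n' "$key" "$(cat "$dir/$key")"
        fi
    done
}
```

Save the output with `vm_snapshot > vm-before.txt` before a change, then compare it with a fresh run afterwards.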