  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 1453

OOM killer issue on Linux server

Hello Linux Experts,

We have been facing this OOM (Out Of Memory) issue on a Linux Server frequently.  The worst part is, this OOM issue is happening even when there is not much load on the server.

Some steps that we have taken to prevent this are listed below. Please let me know if you find anything suspicious that is likely causing the OOM issue, given the log messages and system details below:

i)  Set the kernel parameter "vm.overcommit_memory" to "2" to prevent processes from over-committing memory.
ii) Noticed that swap isn't being used at all. To push the kernel to prefer swapping over dropping cache, the kernel parameter "vm.swappiness" is set to its highest value, 100.
iii) Scheduled a cron job to clear the page cache periodically. Command used: echo 1 > /proc/sys/vm/drop_caches
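
To confirm that the settings in steps i) and ii) are actually in effect, the current values can be read back from /proc/sys/vm. The following is only a minimal sketch: the function name and the `base` parameter (which exists so the code can be pointed at a test directory) are illustrative assumptions, not something used on the server above.

```python
from pathlib import Path

# Tunables named in steps i) and ii).
VM_TUNABLES = ("overcommit_memory", "swappiness")


def read_vm_tunables(base="/proc/sys/vm", names=VM_TUNABLES):
    """Return {name: int value} for each tunable file that exists under `base`."""
    values = {}
    for name in names:
        path = Path(base) / name
        if path.is_file():
            values[name] = int(path.read_text().strip())
    return values


if __name__ == "__main__":
    # On a Linux host this prints the live values; elsewhere it prints nothing.
    for name, value in read_vm_tunables().items():
        print(f"vm.{name} = {value}")
```

With the settings described above, the expected result would be overcommit_memory at 2 and swappiness at 100.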


Aug 26 18:12:31 oraclelinux13 kernel: Free pages:    65678612kB (65671388kB HighMem)
Aug 26 18:12:31 oraclelinux13 kernel: Active:37449 inactive:21866 dirty:0 writeback:0 unstable:0 free:16419653 slab:4879 mapped-file:7186 mapped-anon:29458 pagetables:808
Aug 26 18:12:31 oraclelinux13 kernel: DMA free:3588kB min:68kB low:84kB high:100kB active:12kB inactive:0kB present:16384kB pages_scanned:73671 all_unreclaimable? yes
Aug 26 18:12:31 oraclelinux13 kernel: lowmem_reserve[]: 0 0 880 65520
Aug 26 18:12:31 oraclelinux13 kernel: DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Aug 26 18:12:31 oraclelinux13 kernel: lowmem_reserve[]: 0 0 880 65520
Aug 26 18:12:31 oraclelinux13 kernel: Normal free:3636kB min:3756kB low:4692kB high:5632kB active:252kB inactive:288kB present:901120kB pages_scanned:785248 all_unreclaimable? yes
Aug 26 18:12:31 oraclelinux13 kernel: lowmem_reserve[]: 0 0 0 517120
Aug 26 18:12:31 oraclelinux13 kernel: HighMem free:65671388kB min:512kB low:69552kB high:138592kB active:149532kB inactive:87176kB present:66191360kB pages_scanned:0 all_unreclaimable? no
Aug 26 18:12:31 oraclelinux13 kernel: lowmem_reserve[]: 0 0 0 0
Aug 26 18:12:32 oraclelinux13 kernel: DMA: 1*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3588kB
Aug 26 18:12:32 oraclelinux13 kernel: DMA32: empty
Aug 26 18:12:32 oraclelinux13 kernel: Normal: 1*4kB 6*8kB 4*16kB 2*32kB 0*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3636kB
Aug 26 18:12:32 oraclelinux13 kernel: HighMem: 1713*4kB 4407*8kB 2770*16kB 1562*32kB 778*64kB 245*128kB 119*256kB 72*512kB 24*1024kB 13*2048kB 15951*4096kB = 65671388kB
Aug 26 18:12:32 oraclelinux13 kernel: 29867 pagecache pages
Aug 26 18:12:32 oraclelinux13 kernel: Swap cache: add 0, delete 0, find 0/0, race 0+0
Aug 26 18:12:32 oraclelinux13 kernel: Free swap  = 101378140kB
Aug 26 18:12:32 oraclelinux13 kernel: Total swap = 101378140kB
Aug 26 18:12:32 oraclelinux13 kernel: Free swap:       101378140kB
Aug 26 18:12:32 oraclelinux13 kernel: 16777216 pages of RAM
Aug 26 18:12:32 oraclelinux13 kernel: 16547840 pages of HIGHMEM
Aug 26 18:12:32 oraclelinux13 kernel: 184317 reserved pages
Aug 26 18:12:32 oraclelinux13 kernel: 99337 pages shared
Aug 26 18:12:32 oraclelinux13 kernel: 0 pages swap cached
Aug 26 18:12:32 oraclelinux13 kernel: 0 pages dirty
Aug 26 18:12:32 oraclelinux13 kernel: 0 pages writeback
Aug 26 18:12:32 oraclelinux13 kernel: 7186 pages mapped
Aug 26 18:12:32 oraclelinux13 kernel: 4879 pages slab
Aug 26 18:12:32 oraclelinux13 kernel: 808 pages pagetables
Aug 26 18:12:32 oraclelinux13 kernel: Out of memory: Killed process 25662, UID 200, (perl).
Aug 26 18:12:32 oraclelinux13 kernel: ps invoked oom-killer: gfp_mask=0xd0, order=0, oomkilladj=0
Aug 26 18:12:32 oraclelinux13 kernel:  [<c045af33>] out_of_memory+0x72/0x1a3
Aug 26 18:12:32 oraclelinux13 kernel:  [<c045c49a>] __alloc_pages+0x24e/0x2cf
Aug 26 18:12:32 oraclelinux13 kernel:  [<c04a6fa8>] proc_info_read+0x0/0x96
Aug 26 18:12:32 oraclelinux13 kernel:  [<c045c540>] __get_free_pages+0x25/0x31
Aug 26 18:12:32 oraclelinux13 kernel:  [<c04a6fe0>] proc_info_read+0x38/0x96
Aug 26 18:12:32 oraclelinux13 kernel:  [<c04a6fa8>] proc_info_read+0x0/0x96
Aug 26 18:12:32 oraclelinux13 kernel:  [<c0476330>] vfs_read+0x9f/0x141
Aug 26 18:12:32 oraclelinux13 kernel:  [<c04767b6>] sys_read+0x3c/0x63
Aug 26 18:12:32 oraclelinux13 kernel:  [<c0404f4b>] syscall_call+0x7/0xb
Aug 26 18:12:32 oraclelinux13 kernel:  =======================
Aug 26 18:12:32 oraclelinux13 kernel: Mem-info:
Aug 26 18:12:32 oraclelinux13 kernel: DMA per-cpu:
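
The per-zone lines in the log above carry both the free amount and the "min" watermark for each zone, so it is possible to check which zone was actually short of memory. The sketch below is one way to do that; the regular expression and function name are illustrative assumptions, and the sample lines are trimmed to the fields that are used.

```python
import re

# Matches lines such as "kernel: Normal free:3636kB min:3756kB low:4692kB ..."
ZONE_RE = re.compile(r"kernel: (\w+) free:(\d+)kB min:(\d+)kB")

# Zone lines copied from the log above, trimmed after the "min" field.
SAMPLE_LOG = """\
Aug 26 18:12:31 oraclelinux13 kernel: DMA free:3588kB min:68kB
Aug 26 18:12:31 oraclelinux13 kernel: DMA32 free:0kB min:0kB
Aug 26 18:12:31 oraclelinux13 kernel: Normal free:3636kB min:3756kB
Aug 26 18:12:31 oraclelinux13 kernel: HighMem free:65671388kB min:512kB
"""


def zones_below_min(log_text):
    """Return [(zone, free_kb, min_kb)] for zones whose free memory is under 'min'."""
    starved = []
    for line in log_text.splitlines():
        match = ZONE_RE.search(line)
        if not match:
            continue
        zone = match.group(1)
        free_kb, min_kb = int(match.group(2)), int(match.group(3))
        if free_kb < min_kb:
            starved.append((zone, free_kb, min_kb))
    return starved


if __name__ == "__main__":
    for zone, free_kb, min_kb in zones_below_min(SAMPLE_LOG):
        print(f"{zone}: free {free_kb} kB is below min {min_kb} kB")
```

For this log, only the Normal zone is reported, while HighMem still has about 65 GB free.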


[root@oraclelinux13 ~]# uname -r
2.6.18-238.el5PAE
[root@oraclelinux13 ~]# uname -m
i686
[root@oraclelinux13 ~]# cat /proc/meminfo
MemTotal:     66371596 kB
MemFree:      65481140 kB
Buffers:        195400 kB
Cached:         375332 kB
SwapCached:          0 kB
Active:         351136 kB
Inactive:       336556 kB
HighTotal:    65984704 kB
HighFree:     65418272 kB
LowTotal:       386892 kB
LowFree:         62868 kB
SwapTotal:    101378140 kB
SwapFree:     101378140 kB
Dirty:              80 kB
Writeback:           0 kB
AnonPages:      117100 kB
Mapped:          27480 kB
Slab:            41200 kB
PageTables:       3692 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:  134563936 kB
Committed_AS:   306704 kB
VmallocTotal:   116728 kB
VmallocUsed:     60856 kB
VmallocChunk:    54296 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB
[root@oraclelinux13 ~]#
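
The LowTotal and LowFree fields in the output above are the ones worth watching on a 32-bit kernel with high memory. As a rough sketch, the text can be parsed and the low-memory share computed as follows; the function names are illustrative, and the sample contains only the fields needed from the output above.

```python
SAMPLE_MEMINFO = """\
MemTotal:     66371596 kB
MemFree:      65481140 kB
HighTotal:    65984704 kB
HighFree:     65418272 kB
LowTotal:       386892 kB
LowFree:         62868 kB
"""


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: int}; sizes are in kB."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def lowmem_summary(info):
    """Summarise low memory: total, free, and its share of all RAM."""
    return {
        "low_total_kb": info["LowTotal"],
        "low_free_kb": info["LowFree"],
        "low_share_of_ram": info["LowTotal"] / info["MemTotal"],
    }


if __name__ == "__main__":
    summary = lowmem_summary(parse_meminfo(SAMPLE_MEMINFO))
    print(f"low memory: {summary['low_total_kb']} kB total, "
          f"{summary['low_free_kb']} kB free, "
          f"{summary['low_share_of_ram']:.2%} of RAM")
```

On this system, low memory is well under 1% of total RAM, so MemFree can look healthy while LowFree is nearly exhausted.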


Please let me know if you require any additional information. Thanks in advance !
Asked by: ashsysad
2 Solutions
 
Duncan Roe (Software Developer) commented:
Please post the output of the free command, run in a command window.
 
ashsysad (Author) commented:
@duncan_roe, here's the free -m output. From what we observe, the server has plenty of memory, but the applications aren't using it.


# free -m
             total       used       free     shared    buffers     cached
Mem:         64816       2371      62444          0        109       1817
-/+ buffers/cache:        444      64371
Swap:        99002          0      99002
 
Daniel McAllister (President, IT4SOHO, LLC) commented:
Using absolutely ZERO swap space is highly unusual...
You clearly have swap space MOUNTED, but it is equally clear that it has been disabled.

1) Check your fstab (/etc/fstab) and make sure the 4th column entry is set to defaults (with no other options)...

 2) execute the command (as root): swapon -a

Even if general programs don't NEED swap, the kernel usually will USE swap to augment disk I/O buffering if a significant amount of swap is available.

Good luck -- I hope this helps!

Dan
IT4SOHO
 
Duncan Roe (Software Developer) commented:
You are using a tiny fraction of available memory. To some extent I am not surprised that no swap is used - as long as free stays above zero there is no reason to page out (swap). /proc/meminfo shows HighTotal (65984704) less than MemTotal (66371596) so there has always been some free memory.
On the other hand, you are getting processes terminated. This points the finger at swap. It's my understanding that since free shows a line for swap, then swapon is already in effect.

I am starting to wonder whether perhaps your swap disk is too big. Years ago, there was a limit of 2GB for any one swap partition. I have a vague recollection that it was later increased to 64GB, but I don't remember where the old limit was documented, and I can't find any documented limit now. Nevertheless, can you try this experiment: create a 1GB swap disk in the file system and activate it. Create several if you like. I created two:
11:19:35# mkdir /root/tests
11:19:39# cd /root/tests
11:19:43# dd if=/dev/zero of=swapdisk bs=4096 count=$[1024*256]
262144+0 records in
262144+0 records out
1073741824 bytes (1.1 GB) copied, 22.5458 s, 47.6 MB/s
11:21:55# ls -l
total 1049604
-rw-r--r-- 1 root root 1073741824 Aug 30 11:21 swapdisk
11:22:21# mkswap swapdisk   
Setting up swapspace version 1, size = 1048572 KiB
no label, UUID=c4cb32be-2cb6-472b-b29b-eee4fced7ee6
11:22:44# swapon swapdisk 
11:23:16# free
             total       used       free     shared    buffers     cached
Mem:       8307184    2336816    5970368          0     704952    1297168
-/+ buffers/cache:     334696    7972488
Swap:      1048572          0    1048572
11:23:19# dd if=/dev/zero of=swapdisk2 bs=4096 count=$[1024*256]
262144+0 records in
262144+0 records out
1073741824 bytes (1.1 GB) copied, 22.0973 s, 48.6 MB/s
11:28:28# mkswap swapdisk2
Setting up swapspace version 1, size = 1048572 KiB
no label, UUID=042dc953-a0ca-41b9-bce4-c8e4dbaf5d6c
11:28:47# swapon swapdisk2
11:28:56# free
             total       used       free     shared    buffers     cached
Mem:       8307184    3352688    4954496          0     656448    2345752
-/+ buffers/cache:     350488    7956696
Swap:      2097144          0    2097144
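
The sizes in the session above can be checked with a little arithmetic. This is only a sketch, assuming a 4 KiB page size and that mkswap keeps the first page of the file for the swap header; the function name is illustrative.

```python
def swap_file_sizes(block_size, count, page_size=4096):
    """Return (file bytes, file KiB, usable swap KiB) for a dd-created swap file.

    Assumes the first page of the file is reserved for the swap header.
    """
    file_bytes = block_size * count
    file_kib = file_bytes // 1024
    usable_kib = file_kib - page_size // 1024
    return file_bytes, file_kib, usable_kib


if __name__ == "__main__":
    # bs=4096 count=$[1024*256], as in the dd commands above
    file_bytes, file_kib, usable_kib = swap_file_sizes(4096, 1024 * 256)
    print(f"file size:   {file_bytes} bytes ({file_kib} KiB)")
    print(f"usable swap: {usable_kib} KiB per file, {2 * usable_kib} KiB for two")
```

The per-file figure agrees with the 1048572 KiB reported by mkswap, and twice that agrees with the 2097144 shown on the Swap line of the second free output.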


You might want to swapoff your hard drive, seeing it doesn't seem to be doing any good.

An extra thing - do you have anything mounted on tmpfs? That will use RAM and attempt to use swap.
 
Daniel McAllister (President, IT4SOHO, LLC) commented:
duncan_roe above raises an excellent point -- but it led me to a different pair of questions:

1) If you have little or no experience with Perl, I suggest you read this page (with a fix) HERE

I suspect that you're seeing an APPLICATION error, not a system error...

 - which leads me to my other question -

2) Are you running 32-bit or 64-bit software??

It appears as though you're running 64-bit Linux... but did you perchance maybe install a 32-bit application (is your perl 64-bit??)

Your system shows 16GB physical RAM, and nearly 100GB of swap space (clearly you thought adding swap would solve this! :-))

But if your application (perl?) is 32-bit, it will never be able to resolve more than 4GB of address space.

Similarly, if your system limits are limiting perl too much, you may be getting the error from those limits (see link above for reference).
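
One quick way to answer the 32-bit versus 64-bit question for a given binary is to read the class byte in its ELF header (byte 4 of the file: 1 means 32-bit, 2 means 64-bit). The sketch below assumes the perl binary lives at /usr/bin/perl, which is a guess and not something stated in this thread; the function name is illustrative too.

```python
def elf_class(path):
    """Return 32 or 64 for an ELF file, or None if the file is not ELF."""
    with open(path, "rb") as handle:
        ident = handle.read(5)
    if len(ident) < 5 or ident[:4] != b"\x7fELF":
        return None
    # e_ident[EI_CLASS]: 1 = ELFCLASS32, 2 = ELFCLASS64
    return {1: 32, 2: 64}.get(ident[4])


if __name__ == "__main__":
    import os

    perl_path = "/usr/bin/perl"  # ASSUMPTION: adjust to the real location
    if os.path.exists(perl_path):
        print(perl_path, "->", elf_class(perl_path))
```

A result of 32 would mean the process is limited to a 32-bit address space no matter how much RAM or swap the machine has.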

I hope one of these makes a difference!

Dan
IT4SOHO
 
ashsysad (Author) commented:
Hello All, thanks for your input and suggestions. We have finally found the solution to this problem, which haunted us for the past 2 weeks.

In brief, we limited the physical memory to 16 GB. We found that the OOM killer was being invoked because low memory was being used up completely. The kernel uses low memory to track all memory allocations, including those for memory above the 4 GB limit.


# cat /proc/meminfo
MemTotal:     16302092 kB
MemFree:      13930024 kB
Buffers:        243056 kB
Cached:        1348296 kB
SwapCached:          0 kB
Active:         913220 kB
Inactive:      1123888 kB
HighTotal:    15653056 kB
HighFree:     13778584 kB
LowTotal:       649036 kB
LowFree:        151440 kB

SwapTotal:    101378140 kB
SwapFree:     101378140 kB
Dirty:             248 kB
Writeback:           0 kB
AnonPages:      445640 kB
Mapped:          29252 kB
Slab:            75756 kB
PageTables:      11100 kB
NFS_Unstable:        0 kB
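
A back-of-the-envelope estimate shows why reducing RAM frees up low memory. The kernel keeps a small descriptor for every physical page, and on a 32-bit kernel those descriptors live in low memory. The sketch below ASSUMES 32 bytes per page descriptor, a commonly quoted figure for 32-bit 2.6 kernels that was not measured on this server; the low-memory size is the sum of the DMA and Normal "present" values from the OOM log in the question.

```python
PAGE_SIZE = 4096                # bytes per page
STRUCT_PAGE_BYTES = 32          # ASSUMPTION: per-page descriptor size, not from this thread
LOWMEM_KB = 16384 + 901120      # DMA + Normal "present" values from the OOM log


def page_tracking_kb(ram_bytes, struct_page_bytes=STRUCT_PAGE_BYTES):
    """Estimate the low memory (kB) needed to describe every page of RAM."""
    pages = ram_bytes // PAGE_SIZE
    return pages * struct_page_bytes // 1024


if __name__ == "__main__":
    for gib in (16, 64):
        needed_kb = page_tracking_kb(gib * 2**30)
        share = needed_kb / LOWMEM_KB
        print(f"{gib} GiB RAM: ~{needed_kb} kB of page descriptors "
              f"({share:.0%} of low memory)")
```

Under that assumption, 64 GiB of RAM costs roughly four times as much low memory for bookkeeping as 16 GiB, which is in line with LowTotal rising from 386892 kB to 649036 kB after the change.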
 
ashsysad (Author) commented:
My solution fixed the problem.