OOM Killer in syslogs - Docker

Prabhin MP, DevOps Engineer asked:
java invoked oom-killer: gfp_mask=0xd0, order=0, oom_score_adj=0

We are using Docker Swarm to deploy Docker containers that run a Java application. Recently the containers have been getting stopped frequently, and we have observed the above-mentioned log in the system logs.
noci, Software Engineer
Distinguished Expert 2018

Commented:
OOM = Out Of Memory killer. When the kernel is running out of memory, some "random" process is selected and killed to allow the system to (hopefully) continue to run.
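
If it helps to confirm exactly what the kernel did, something along these lines will show the OOM killer's report and how "attractive" a given process currently looks to it (the PID 1234 is just a placeholder):

    # Show the kernel's OOM report: which process was chosen and the memory state at the time
    dmesg | grep -i -A 20 "invoked oom-killer"

    # Current OOM "badness" score of a running process (replace 1234 with your Java PID)
    cat /proc/1234/oom_score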

So you need more memory in your system. A quick workaround is enlarging swap space to sit out the time until the new memory for your server has been delivered and installed. (Installing it will mean downtime, because you need to power off the system to fit the memory banks.)

Adding swap space will cause memory to be exchanged between disk and RAM. This causes more IO (thus delaying any other disk IO) and more latency, as a program may or may not be in memory when it needs to run. (It cannot run from swap; it needs real memory to run.) So storing something else into swap space and pulling the needed pages back from there takes a (short) while.
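
If you do use swap as a stopgap, a minimal sketch on a Linux host looks roughly like this (the 2G size and the /swapfile path are only examples; run as root):

    # Create and enable a 2GB swap file
    fallocate -l 2G /swapfile
    chmod 600 /swapfile
    mkswap /swapfile
    swapon /swapfile

    # Verify the new swap is active
    swapon --show
    free -h
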
Top Expert 2016

Commented:
It must also be said that if there are bugs in the app, it might go OOM however much memory you give it.
Prabhin MP, DevOps Engineer
Distinguished Expert 2018

Author

Commented:
I'm setting Docker memory and CPU limits on the containers.

noci, Software Engineer
Distinguished Expert 2018
Commented:
The memory limits might be the cause then. In JVM environments you may also need to reduce the memory pool inside the VM that is available to the application.
It won't work if you limit the Docker instance to, say, 300MB and allow the JVM to expand to 1GB; the JVM should stay below 300MB in total as well (runtime + code + data + heap).
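
As a rough sketch of keeping the JVM inside the container limit (the 512m/350m figures and image/service names are placeholders, and the second command assumes your image's entrypoint honors JAVA_OPTS):

    # Container capped at 512MB; JVM heap capped well below that so heap plus
    # JVM overhead (metaspace, threads, code cache) still fits under the limit
    docker run -m 512m my-java-image java -Xmx350m -jar app.jar

    # Roughly the same idea for a Swarm service
    docker service create --name my-java-service --limit-memory 512M \
        -e JAVA_OPTS="-Xmx350m" my-java-image
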
David Favor, Fractional CTO
Distinguished Expert 2018
Commented:
All the above are potentially correct.

Likely first step...

1) Run one instance of your Docker containerized app.

2) Manually remove the memory constraint.

3) Run ps in a tight loop, tracking memory usage (a minimal sketch follows this list).

4) If your memory usage increases + never decreases, this means your app code must be fixed.
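
A minimal sketch of step 3, assuming you know the PID of the Java process inside the container (1234 is a placeholder):

    # Sample resident (RSS) and virtual (VSZ) memory, in KB, every 5 seconds
    while true; do
        ps -o rss=,vsz= -p 1234
        sleep 5
    done
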
Scott Silva, Network Administrator
Commented:
And do not depend on Java garbage collection routines to clean all of this up. Make sure completed processes are killed properly, and minimize loops if possible. Test processes with a good Java debugger.
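
If a leak is suspected, a heap dump is one concrete way to look at it; for example (the PID and output path are placeholders, and this assumes the JDK's jmap tool is available where the app runs):

    # Dump the live heap of the Java process for offline analysis
    # (open the .hprof file in VisualVM or Eclipse MAT to see what is holding memory)
    jmap -dump:live,format=b,file=/tmp/heap.hprof 1234
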
Just to echo the initial response - OOM is killing your Java app because it is out of total RAM, physical plus swap. E.g. the instance allows up to 1GB RAM, your Java app reaches 1GB and gets killed.

But if you give yourself some swap space, OOM won't kill the process. E.g. you set up the instance with 1GB RAM and 2GB of swap. If your Java app reaches 1.5GB, it still fits (1GB in RAM, 0.5GB in swap) and OOM won't kill it.

What you may see in this case is that the app *may* slow down.  But that depends on how much of its memory needs to be accessed regularly (the working set) and that may be only 200MB out of a 1.5GB process.

Once you add enough swap space, if the Java app actually exceeds its available heap size (set when you launch Java), you'll get a different error - an OutOfMemory exception within Java, not the OOM killer terminating the app.
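
One way to tell the two failure modes apart from outside the container (the container name is a placeholder):

    # "true" plus exit code 137 (128 + SIGKILL) means the kernel's OOM killer ended it
    docker inspect -f '{{.State.OOMKilled}} {{.State.ExitCode}}' my-java-container

    # A heap problem inside the JVM shows up in the application's own output instead
    docker logs my-java-container 2>&1 | grep -i OutOfMemoryError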

If that happens, then you have a Java app that is leaking and it needs to be fixed.  But there's a good chance that's not the case here.

Doug
David Favor, Fractional CTO
Distinguished Expert 2018

Commented:
Scott's right on. I've seen a good bit of Java code that is flawed because the code depends on garbage collection for memory reclamation.

Best to use good coding practices, independent of language features like garbage collection.

Make sure all memory used is explicitly cleaned up... released back to the system when no longer required. This will save a massive amount of long-term debugging + code maintenance.
Prabhin MP, DevOps Engineer
Distinguished Expert 2018

Author

Commented:
Thank you all for the support.
David Favor, Fractional CTO
Distinguished Expert 2018

Commented:
You're welcome!

Good luck!
