YARN memory allocation


I am new to Hadoop and have a question about YARN memory allocation. If we have 16 GB of memory in the cluster, we can have at least three 4 GB containers and keep 4 GB for other uses. If a job needs 10 GB of RAM, would it use three containers, or would it use one container and start consuming the rest of the RAM?
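For reference, here is a rough sketch of the arithmetic in the question, assuming fixed-size 4 GB containers (this is plain arithmetic to illustrate the scenario, not an actual YARN API, and real YARN allocations are rounded to the configured minimum-allocation increment):

```python
# Rough arithmetic for the scenario in the question (not a YARN API).
total_ram_gb = 16
reserved_gb = 4          # kept aside for the OS and other uses
container_gb = 4         # assumed fixed container size

available = total_ram_gb - reserved_gb        # 12 GB usable
containers = available // container_gb        # 3 containers of 4 GB each

# Under this fixed-container assumption, a job asking for 10 GB is granted
# whole containers, not a raw 10 GB slab of RAM:
job_need_gb = 10
containers_for_job = -(-job_need_gb // container_gb)  # ceiling division

print(containers, containers_for_job)  # 3 containers fit; the job needs 3
```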
David Favor (Linux/LXD/WordPress/Hosting Savant) commented:
YARN doesn't manage memory.

Think of YARN as a fancy CRON. All it does is schedule what runs where + when.

Unsure what you mean by 16GB in a cluster.

You'll have a set of applications running, which will either consume all the resources of a machine or, better, run inside LXD.

Most HADOOP processes crunch data, so the load is mainly CPU, rather than disk I/O.

That said, if you run LXD on all your machines, you can run your HADOOP processes in their own LXD containers + run them at a low priority.

The net effect: your production applications will serve visitor requests quickly, + when they are dormant (during I/O waits or low traffic), HADOOP processes will consume all the free CPU cycles.

So when you refer to memory of a cluster, this has no meaning.

When you refer to total memory of a machine or container used within your cluster, this has meaning.

How your memory utilization works depends on many factors, such as whether any other processes are running, or whether HADOOP "owns" all of the memory. With LXD, you can set memory limits on a per-container basis or a per-process (within a container) basis.
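As a sketch of what those per-container limits look like, an LXD profile along these lines would cap RAM and lower CPU priority for a HADOOP container (the profile name "hadoop-lowprio" and the values here are illustrative assumptions, not anything from the question):

```yaml
# Sketch of an LXD profile for a low-priority HADOOP worker.
# "hadoop-lowprio" is a hypothetical name; adjust values to your hardware.
name: hadoop-lowprio
description: Low-priority HADOOP worker with a fixed RAM cap
config:
  limits.memory: 4GiB       # hard cap on the container's RAM
  limits.cpu.priority: 0    # lowest CPU priority when the host is contended
```

The same limits can be set directly on a running container, e.g. `lxc config set mycontainer limits.memory 4GiB`.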