The fair scheduler is a scheduler plugin that fairly distributes an equal share of resources for jobs in the YARN cluster.
Queue assignment / placement
Queues are assigned:
- at request level (based on the queue name or the user and group name)
- at a configuration level
based on the policy
Hierarchy of queue
Queues can be arranged in a hierarchy to divide resources and configured with weights to share the cluster in specific proportions.
All queues descend from a queue named root. Available resources are distributed from the root queue recursively among the children. Applications may only be scheduled on leaf queues.
Queues can be specified as children of other queues by placing them as sub-elements of their parents in the fair scheduler allocation file.
A queue’s name starts with the names of its parents, with periods as separators.
where the root part of the name is optional.
- root.queue1 (or queue1) is the queue under the root queue
- root.parent1.queue2 (or parent1.queue2)
The Fair Scheduler allows assigning minimum shares to queues (also known as guaranteed share). This is useful for ensuring that production applications always get sufficient resources.
When a queue contains apps, it gets at least its minimum share, but when the queue does not need it, the excess is split between other running apps.
The Max number of running app.
The Fair Scheduler lets all apps run by default, but it is also possible to limit the number of running apps per user and per queue.
Limited apps will wait in the scheduler’s queue until the actual running apps finish.
A policy consists of a set of rules that are applied sequentially to classify an incoming application. Each rule either places the app into a queue, rejects it, or continues on to the next rule.
app priorities are used as weights to determine the fraction of total resources that each app should get.
- Used Resources - The sum of resources allocated to containers within the queue.
- Num Active Applications - The number of applications in the queue that have received at least one container.
- Num Pending Applications - The number of applications in the queue that have not yet received any containers.
- Min Resources - The configured minimum resources that are guaranteed to the queue.
- Max Resources - The configured maximum resources that are allowed to the queue.
- Instantaneous Fair Share - The queue’s instantaneous fair share of resources. These shares consider only actives queues (those with running applications), and are used for scheduling decisions. Queues may be allocated resources beyond their shares when other queues aren’t using them. A queue whose resource consumption lies at or below its instantaneous fair share will never have its containers preempted.
- Steady Fair Share - The queue’s steady fair share of resources. These shares consider all the queues irrespective of whether they are active (have running applications) or not. These are computed less frequently and change only when the configuration or capacity changes.They are meant to provide visibility into resources the user can expect, and hence displayed in the Web UI.
in the conf file
<property> <name>yarn.resourcemanager.scheduler.class</name> <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value> </property>
Two files for two scope:
- at process start: the yarn-site.xml
- on the fly: the allocation file (reloaded every 10 seconds). It lists which queues exist and their respective weights and capacities.
The reservation system in the context of the fair scheduler.
It's not enabled by default. The application can request reserved resources at runtime by specifying the reservationId during submission.
Fair Scheduler is able to dump its state periodically. It is disabled by default. The administrator can enable it by setting org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.statedump logging level to DEBUG.
Fair Scheduler state dumps can potentially generate a large amount of log data.
Fair Scheduler logs go to the Resource Manager log file by default. Uncomment the “Fair scheduler state dump” section in log4j.properties to dump the state into a separate file.