YARN Capacity Scheduler (an ultra-detailed explanation)

2022-07-07 17:40:00 Yang Linwei

01 Introduction

In an earlier post, 《YARN introduction (one article is enough)》, I explained that YARN has three main schedulers: FIFO, Capacity Scheduler, and Fair Scheduler. In Hadoop 3.x the default resource scheduler is the Capacity Scheduler, which is the subject of this post.

02 Capacity Scheduler

2.1 How it works

[Figure: Capacity Scheduler queues, each holding its jobs in FIFO order]
Inside each queue, the Capacity Scheduler works first in, first out. In this simplified picture, each queue runs one job at a time, so the degree of parallelism equals the number of queues.

The Capacity Scheduler is a pluggable resource scheduler supported by Hadoop. It lets multiple tenants share cluster resources securely, so that their applications are allocated resources in a timely manner under capacity constraints. It runs Hadoop applications in an operator-friendly way while maximizing throughput and cluster utilization.

The core concept the Capacity Scheduler provides is the queue. Queues are usually set up by an administrator. Multiple queues are supported, each queue can be configured with a certain share of resources, and each queue uses a FIFO scheduling policy. To give more control and predictability over resource sharing, the Capacity Scheduler supports multi-level (hierarchical) queues, which ensure that resources are shared among the sub-queues of an organization before other queues are allowed to use the idle resources.

To prevent the jobs of a single user (a user can be bound to multiple queues) from monopolizing the resources in a queue, the scheduler limits the resources occupied by jobs submitted by the same user. Resources are allocated in two steps:

  • First, compute for each queue the ratio of the number of running tasks to the amount of computing resources the queue is entitled to, and select the queue with the lowest ratio (that is, the most idle one);
  • Second, sort the tasks inside that queue by job priority and submission time, taking user resource limits and memory constraints into account.

In the figure above, the three queues run at the same time, each following the order of its own tasks. For example, job11, job21, and job31 are at the head of their queues, so they run first and they run in parallel.
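
The queue-selection step above can be written compactly. This is only a sketch of the rule as described in the text, not the scheduler's actual implementation; the symbols n_q (tasks running in queue q) and c_q (the resources queue q is entitled to) are notation introduced here for illustration.

```latex
r_q = \frac{n_q}{c_q}, \qquad q^{*} = \arg\min_{q} r_q
```

For example, take three hypothetical queues entitled to 30%, 45%, and 25% of the cluster and currently running 6, 18, and 10 tasks. The ratios are 6/30 = 0.2, 18/45 = 0.4, and 10/25 = 0.4, so the first queue is the most idle and is served next.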

2.2 Parameter configuration

Configuration parameters fall into three groups: resource allocation parameters, parameters that limit the number of applications, and queue access and permission control parameters.

2.2.1 Resource allocation parameters

| Parameter | Description |
| --- | --- |
| capacity | Queue capacity as a percentage, float type, for example 12.5. At every level, the capacities of all queues must sum to 100. Because resource allocation is elastic, applications in a queue may consume more than the configured capacity when the cluster has free resources. |
| maximum-capacity | Upper limit of the queue's capacity as a percentage, float type. It caps the maximum elasticity of the applications in the queue. The default is -1, which disables the elasticity cap. |
| minimum-user-limit-percent | Whenever there is demand for resources, each queue enforces a limit on the resources allocated to any single user. This user limit varies between a minimum and a maximum. This property sets the minimum; the maximum depends on the number of users who have submitted applications. For example, suppose the value is 25. If 2 users submit applications to the queue, each user can consume at most 50% of the queue's resources; once a third user submits an application, no user can use more than 33% of the queue; with 4 or more users, no user's usage exceeds 25% of the queue. The default is 100, which means no user limit is imposed. |
| user-limit-factor | Multiple of the queue capacity that a single user may obtain, float type. The default is 1, meaning a single user can never obtain more than the queue's configured capacity, no matter how many idle resources the cluster has. [The result cannot exceed maximum-capacity.] |
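
As a rough illustration of how these four parameters look in capacity-scheduler.xml, here is a minimal sketch. The queue path root.wa and all values are assumptions chosen for the example, not recommendations.

```xml
<!-- Illustrative queue root.wa: guaranteed 45% of its parent's resources -->
<property>
  <name>yarn.scheduler.capacity.root.wa.capacity</name>
  <value>45</value>
</property>

<!-- Elastic ceiling: the queue may grow to at most 50% when other queues are idle -->
<property>
  <name>yarn.scheduler.capacity.root.wa.maximum-capacity</name>
  <value>50</value>
</property>

<!-- With 25: 2 users get up to 50% each, 3 users about 33% each, 4 or more users 25% each -->
<property>
  <name>yarn.scheduler.capacity.root.wa.minimum-user-limit-percent</name>
  <value>25</value>
</property>

<!-- A single user may take at most 1x the queue's configured capacity -->
<property>
  <name>yarn.scheduler.capacity.root.wa.user-limit-factor</name>
  <value>1</value>
</property>
```

Raising user-limit-factor above 1 would let a single user borrow beyond the configured capacity, still bounded by maximum-capacity.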

2.2.2 Parameters that limit the number of applications

| Parameter | Description |
| --- | --- |
| maximum-applications | Maximum number of applications that can be pending or running at the same time in the cluster or in a queue. This is a hard limit: once the number of applications reaches it, further submissions are rejected. The default is 10000. The limit for all queues is set with yarn.scheduler.capacity.maximum-applications (which acts as the default), and a single queue can set its own value with yarn.scheduler.capacity…maximum-applications. |
| maximum-am-resource-percent | Upper limit on the share of cluster resources that can be used to run ApplicationMasters. It is usually used to limit the number of active applications. The value is a float; the default is 0.1, meaning 10%. The limit for all queues is set with yarn.scheduler.capacity.maximum-am-resource-percent (which acts as the default), and a single queue can set its own value with yarn.scheduler.capacity…maximum-am-resource-percent. |
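
A minimal sketch of the cluster-wide defaults together with per-queue overrides is shown below. The queue path root.wa and the override values 2000 and 0.2 are assumptions made for illustration.

```xml
<!-- Cluster-wide default: at most 10000 pending plus running applications -->
<property>
  <name>yarn.scheduler.capacity.maximum-applications</name>
  <value>10000</value>
</property>

<!-- Override for the illustrative queue root.wa -->
<property>
  <name>yarn.scheduler.capacity.root.wa.maximum-applications</name>
  <value>2000</value>
</property>

<!-- Cluster-wide default: ApplicationMasters may use up to 10% of resources -->
<property>
  <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
  <value>0.1</value>
</property>

<!-- Override for root.wa: allow up to 20% for ApplicationMasters -->
<property>
  <name>yarn.scheduler.capacity.root.wa.maximum-am-resource-percent</name>
  <value>0.2</value>
</property>
```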

2.2.3 Queue access and permission control parameters

| Parameter | Description |
| --- | --- |
| state | The queue state, either STOPPED or RUNNING. If a queue is STOPPED, users cannot submit applications to it or to its sub-queues. Likewise, if the root queue is STOPPED, users cannot submit applications to the cluster. Applications that are already running continue normally, so the queue can drain gracefully. |
| acl_submit_applications | Restricts which Linux users and user groups can submit applications to the given queue. Note that this property is inherited: if a user can submit applications to a queue, that user can also submit applications to all of its sub-queues. In the value, users are separated by "," and so are groups, and the user list is separated from the group list by a space, for example "user1,user2 group1,group2". |
| acl_administer_queue | Specifies the administrators of the queue. An administrator can control all applications in the queue, for example kill any of them. This property is also inherited: if a user can administer a queue, that user can also administer all of its sub-queues. |
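
The three properties above could be set for one queue as in the following sketch. The queue path root.wa and the user and group names are placeholders, not values taken from a real cluster.

```xml
<!-- Accept new applications on the illustrative queue root.wa -->
<property>
  <name>yarn.scheduler.capacity.root.wa.state</name>
  <value>RUNNING</value>
</property>

<!-- Users first (comma-separated), then one space, then groups (comma-separated) -->
<property>
  <name>yarn.scheduler.capacity.root.wa.acl_submit_applications</name>
  <value>user1,user2 group1,group2</value>
</property>

<!-- Only the placeholder user admin may administer this queue -->
<property>
  <name>yarn.scheduler.capacity.root.wa.acl_administer_queue</name>
  <value>admin</value>
</property>
```

Because ACLs are inherited, a restriction on a child queue only has an effect if the parent queues do not already grant the same user access.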

03 Configuration example

3.1 Specify the scheduler

First, specify the scheduler in yarn-site.xml:

<property>
    <description>The class to use as the resource scheduler.</description>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>

3.2 Configure the queues

Note: queues are configured in capacity-scheduler.xml.

The CapacityScheduler has one predefined queue, root:

  • Every queue in the system is a child of the root queue;
  • The remaining queues are listed in "yarn.scheduler.capacity.root.queues", with the queue names separated by commas ",";
  • The CapacityScheduler uses a concept called the "queue path" to express multi-level queues. A queue path is the full path of a queue in the hierarchy. It starts at "root" and uses "." as the separator.

The children of a given queue are defined with a property of the form "yarn.scheduler.capacity..queues". Children do not inherit properties directly from their parent unless stated otherwise. For example, the root queue can have three children a, b, and c, where a and b each have their own sub-queues.
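
The a, b, c example above might be declared as in this sketch. The sub-queue names a1, a2, b1, and b2 are hypothetical, introduced only to show how the queue path grows level by level.

```xml
<!-- Children of root -->
<property>
  <name>yarn.scheduler.capacity.root.queues</name>
  <value>a,b,c</value>
</property>

<!-- Children of root.a (hypothetical names) -->
<property>
  <name>yarn.scheduler.capacity.root.a.queues</name>
  <value>a1,a2</value>
</property>

<!-- Children of root.b (hypothetical names) -->
<property>
  <name>yarn.scheduler.capacity.root.b.queues</name>
  <value>b1,b2</value>
</property>
```

Each of these queues would also need its own capacity property, with the capacities of siblings summing to 100.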

In Cloudera Manager, open the YARN configuration page, search for "schedule", select "Capacity Scheduler Configuration Advanced Configuration Snippet (Safety Valve)", enter the content, and save.

Note: if you add a queue or modify ACLs at runtime, you can refresh as the page prompts. Deleting a queue, however, is not supported this way; you need to restart the standby and active ResourceManager roles for the configuration to take effect.

The full configuration, with comments, is as follows:

<?xml version="1.0" encoding="UTF-8"?>
<configuration>

  <!-- Sub-queues of the root queue -->
  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>default,wa,yq</value>
  </property>

  <!-- Capacity percentage of the root queue -->
  <property>
    <name>yarn.scheduler.capacity.root.capacity</name>
    <value>100</value>
  </property>

  <!-- Administrator of the root queue; an administrator can control all applications in the queue, for example kill any of them -->
  <property>
    <name>yarn.scheduler.capacity.root.acl_administer_queue</name>
    <value>admin</value>
  </property>

  <!-- Only the admin user may submit applications to the root queue -->
  <property>
    <name>yarn.scheduler.capacity.root.acl_submit_applications</name>
    <value>admin</value>
  </property>

  <!-- Capacity percentage of the default queue under root; the capacities of all sub-queues must sum to 100 -->
  <property>
    <name>yarn.scheduler.capacity.root.default.capacity</name>
    <value>30</value>
  </property>

  <!-- Maximum capacity percentage of the default queue under root -->
  <property>
    <name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
    <value>35</value>
  </property>

  <!-- Capacity percentage of the wa queue under root; the capacities of all sub-queues must sum to 100 -->
  <property>
    <name>yarn.scheduler.capacity.root.wa.capacity</name>
    <value>45</value>
  </property>

  <!-- Maximum capacity percentage of the wa queue under root -->
  <property>
    <name>yarn.scheduler.capacity.root.wa.maximum-capacity</name>
    <value>50</value>
  </property>

  <!-- Capacity percentage of the yq queue under root; the capacities of all sub-queues must sum to 100 -->
  <property>
    <name>yarn.scheduler.capacity.root.yq.capacity</name>
    <value>25</value>
  </property>

  <!-- Maximum capacity percentage of the yq queue under root -->
  <property>
    <name>yarn.scheduler.capacity.root.yq.maximum-capacity</name>
    <value>30</value>
  </property>

  <!-- Administrators of the wa queue under root; an administrator can control all applications in the queue, for example kill any of them -->
  <property>
    <name>yarn.scheduler.capacity.root.wa.acl_administer_queue</name>
    <value>admin,user01</value>
  </property>

  <!-- Only admin and user01 may submit applications to the wa queue under root -->
  <property>
    <name>yarn.scheduler.capacity.root.wa.acl_submit_applications</name>
    <value>admin,user01</value>
  </property>

  <!-- Administrators of the yq queue under root; an administrator can control all applications in the queue, for example kill any of them -->
  <property>
    <name>yarn.scheduler.capacity.root.yq.acl_administer_queue</name>
    <value>admin,user02</value>
  </property>

  <!-- Only admin and user02 may submit applications to the yq queue under root -->
  <property>
    <name>yarn.scheduler.capacity.root.yq.acl_submit_applications</name>
    <value>admin,user02</value>
  </property>

  <!-- Strategy used to calculate resources when allocating them to jobs; DominantResourceCalculator considers CPU as well as memory -->
  <property>
    <name>yarn.scheduler.capacity.resource-calculator</name>
    <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
  </property>
</configuration>

After the configuration takes effect, open the Web UI to check that the queues are set up correctly.

Note: the capacities of all queues at the same level must sum to 100%.

04 Conclusion

This article explained the YARN Capacity Scheduler. If you want to go deeper, refer to the official Hadoop documentation.

Thank you for reading. That is the end of this article!

Copyright notice
This article was written by [Yang Linwei]. Please include the original link when reposting. Thanks.
https://yzsam.com/2022/188/202207071533326523.html