[yarn] Configuring capacity scheduler batch allocation for YARN on a CDP cluster
2022-07-06 11:32:00 【kiraraLou】
1. Preface
We are about to upgrade our CDH cluster to CDP. On CDH, the YARN service uses the fair scheduler by default, whereas CDP uses the capacity scheduler. We have previously run into a case where an unreasonable batch-allocation setting in the scheduler caused tasks to be concentrated on a few nodes, leaving the cluster's resource load extremely unbalanced.
To avoid the same problem on the CDP cluster, we investigated in advance whether the capacity scheduler can also concentrate allocations in this way. The investigation turned up some unexpected behavior that still needs follow-up.
2. Batch allocation on CDH
As mentioned above, since CDH 5.8 (Hadoop 2.6.0) the fair scheduler has offered the following settings, which speed up allocation, particularly for small tasks.
| Configuration name | Description |
|---|---|
| yarn.scheduler.fair.max.assign | Maximum assignments: if assignmultiple is true and dynamic.max.assign is false, the maximum number of containers that can be allocated in one heartbeat. |
| yarn.scheduler.fair.assignmultiple | Assign multiple: whether multiple containers may be allocated in one heartbeat. |
| yarn.scheduler.fair.dynamic.max.assign | If assignmultiple is true, whether to determine dynamically the amount of resources that can be allocated in one heartbeat. When enabled, about half of the unallocated resources on the node are allocated to containers in a single heartbeat. Defaults to true. |
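How the three settings interact can be easier to see in a small model. The following is a minimal sketch, not the scheduler's actual code: it tracks memory only, and the function name `containers_per_heartbeat` and its parameters are hypothetical names chosen for illustration. It assumes the behavior described in the table: one container per heartbeat unless assignmultiple is on, a budget of about half the node's free resources when dynamic.max.assign is on, and otherwise a cap of max.assign containers (a value of 0 or less meaning no cap).

```python
def containers_per_heartbeat(free_mb, container_mb, pending,
                             assign_multiple=False,
                             dynamic_max_assign=True,
                             max_assign=-1):
    """Simplified, memory-only model of fair scheduler batch allocation.

    free_mb:      unallocated memory on the node when the heartbeat arrives
    container_mb: memory requested by each pending container (assumed equal)
    pending:      number of containers waiting to be scheduled
    """
    budget_mb = free_mb * 0.5  # "about half" of the unallocated resources
    assigned = 0
    assigned_mb = 0
    while pending > 0 and free_mb - assigned_mb >= container_mb:
        assigned += 1
        assigned_mb += container_mb
        pending -= 1
        if not assign_multiple:
            break  # only one container per heartbeat
        if dynamic_max_assign:
            if assigned_mb > budget_mb:
                break  # went past half of the node's free memory
        elif 0 < max_assign <= assigned:
            break  # reached the fixed per-heartbeat cap
    return assigned


# A node with 16 GB free, 2 GB containers, 20 containers pending.
print(containers_per_heartbeat(16384, 2048, 20))
print(containers_per_heartbeat(16384, 2048, 20, assign_multiple=True))
print(containers_per_heartbeat(16384, 2048, 20, assign_multiple=True,
                               dynamic_max_assign=False, max_assign=3))
print(containers_per_heartbeat(16384, 2048, 20, assign_multiple=True,
                               dynamic_max_assign=False, max_assign=-1))
```

In this model, the last call fills the node in a single heartbeat, which is the kind of concentration that leads to unbalanced load when many nodes are otherwise idle.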
With sensible settings, batch allocation can be used without widening the load gap between nodes.
How to configure this on a CDH cluster is not covered here.
3. Batch allocation on CDP
CDP clusters use the capacity scheduler as the default scheduler. According to the official Apache Hadoop and Cloudera documentation, the capacity scheduler can also allocate multiple containers in a single NodeManager heartbeat. The relevant settings are:
| Configuration name | Description |
|---|---|
| yarn.scheduler.capacity.per-node-heartbeat.multiple-assignments-enabled | Whether multiple containers may be allocated in one NodeManager heartbeat. Defaults to true. |
| yarn.scheduler.capacity.per-node-heartbeat.maximum-container-assignments | If multiple-assignments-enabled is true, the maximum number of containers that can be allocated in one NodeManager heartbeat. The default is -1, meaning no limit. |
| yarn.scheduler.capacity.per-node-heartbeat.maximum-offswitch-assignments | If multiple-assignments-enabled is true, the maximum number of off-switch containers that can be allocated in one NodeManager heartbeat. The default is 1, meaning only one off-switch container may be assigned per heartbeat. |
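The way these limits combine can be illustrated with a small model. This is a simplified sketch rather than the capacity scheduler's real implementation: the function `assign_on_heartbeat` and the locality labels are hypothetical, and the parameter defaults simply mirror the defaults listed in the table. It assumes allocation in one heartbeat stops as soon as multiple assignments are disabled, the off-switch limit is reached, or the container limit is reached.

```python
def assign_on_heartbeat(candidates,
                        multiple_assignments_enabled=True,
                        maximum_container_assignments=-1,
                        maximum_offswitch_assignments=1):
    """Simplified model of capacity scheduler per-heartbeat limits.

    candidates: localities of the containers the scheduler would try to
    place on this node, in order, e.g. "node-local" or "off-switch".
    Returns the localities actually assigned in this heartbeat.
    """
    assigned = []
    offswitch = 0
    for locality in candidates:
        assigned.append(locality)
        if locality == "off-switch":
            offswitch += 1
        if not multiple_assignments_enabled:
            break  # one container per heartbeat
        if offswitch >= maximum_offswitch_assignments:
            break  # off-switch limit reached
        if (maximum_container_assignments != -1
                and len(assigned) >= maximum_container_assignments):
            break  # container limit reached (-1 disables the limit)
    return assigned


requests = ["node-local"] * 5
print(len(assign_on_heartbeat(requests)))
print(len(assign_on_heartbeat(requests, maximum_container_assignments=2)))
print(assign_on_heartbeat(["node-local", "off-switch", "node-local"]))
```

With no container limit, every candidate in this model lands on the node that happened to send the heartbeat, so lowering maximum-container-assignments is the lever for spreading containers across nodes.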
How to configure
- In Cloudera Manager, select Clusters > YARN Queue Manager UI service.
- In the YARN Queue Manager window, click the Scheduler Configuration tab.
- In the Scheduler Configuration window:

- Select the Enable Multiple Assignments Per Heartbeat check box to allow multiple containers to be allocated in one NodeManager heartbeat.
- Configure the following NodeManager heartbeat properties:
  - Maximum Container Assignments Per Heartbeat: the maximum number of containers that can be allocated in one NodeManager heartbeat. Setting this value to -1 disables the limit.
  - Maximum Off-Switch Assignments Per Heartbeat: the maximum number of off-switch containers that can be allocated in one NodeManager heartbeat.
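One way to check which values a scheduler configuration actually contains is to read them back from the Hadoop-style XML. The sketch below assumes you have obtained the capacity scheduler configuration as XML text (for example, an exported capacity-scheduler.xml); the helper `heartbeat_settings` is a hypothetical name, and the sample document with a limit of 10 is made up for illustration.

```python
import xml.etree.ElementTree as ET

PREFIX = "yarn.scheduler.capacity.per-node-heartbeat."
KEYS = (
    "multiple-assignments-enabled",
    "maximum-container-assignments",
    "maximum-offswitch-assignments",
)


def heartbeat_settings(xml_text):
    """Return the per-node-heartbeat properties present in a Hadoop config XML."""
    root = ET.fromstring(xml_text)
    found = {}
    for prop in root.iter("property"):
        name = prop.findtext("name", default="").strip()
        value = prop.findtext("value", default="").strip()
        if name.startswith(PREFIX) and name[len(PREFIX):] in KEYS:
            found[name] = value
    return found


# Illustrative sample only; the value 10 is not a recommendation.
SAMPLE = """<configuration>
  <property>
    <name>yarn.scheduler.capacity.per-node-heartbeat.multiple-assignments-enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.per-node-heartbeat.maximum-container-assignments</name>
    <value>10</value>
  </property>
</configuration>"""

for key, value in heartbeat_settings(SAMPLE).items():
    print(key, "=", value)
```

Properties missing from the result are not set explicitly in that file, in which case the scheduler's defaults would apply.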
4. Summary
- The capacity scheduler has heartbeat batch-allocation settings similar to those of the fair scheduler.
- On a CDP cluster, the capacity scheduler enables batch allocation by default, with a limit of 100 container assignments per heartbeat; this value needs to be reduced.
- In our tests so far, the configuration does not appear to take effect; this still needs follow-up with experts.
References
- https://docs.cloudera.com/cdp-private-cloud-base/7.1.7/yarn-allocate-resources/topics/yarn-set-user-limits.html
- https://docs.cloudera.com/cdp-private-cloud-base/7.1.7/yarn-allocate-resources/topics/yarn-configure-nm-heartbeat.html
- https://hadoop.apache.org/docs/r3.1.1/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html#Reviewing_the_configuration_of_the_CapacityScheduler