Yarn organizational structure
2022-07-06 09:34:00 【Prism 7】
1. YARN cluster architecture and working principle
The basic design idea of YARN is to split the JobTracker of MapReduce V1 into two separate services: ResourceManager and ApplicationMaster.
- ResourceManager (resource management and allocation for the whole system): the RM is a global resource manager. It consists of two main parts: the Scheduler and the Applications Manager.
The Scheduler allocates system resources to running applications subject to constraints such as capacity and queues. While guaranteeing capacity, fairness, and service levels, it tries to maximize cluster resource utilization so that all resources are put to use.
The Applications Manager manages all applications in the system: it accepts application submissions, negotiates with the Scheduler for the resources needed to start each ApplicationMaster, monitors the ApplicationMaster's running state, and restarts it on failure.
- ApplicationMaster (management of a single application): each application submitted by a user has one ApplicationMaster. Its main functions are:
- Negotiate with the RM Scheduler to obtain resources, which are expressed as Containers.
- Further assign the obtained resources to its internal tasks.
- Communicate with the NMs to start and stop tasks.
- Monitor the status of all internal tasks, and when a task fails, request resources again so the task can be restarted.
- NodeManager: the NM is the resource and task manager on each node. On the one hand, it periodically reports the node's resource usage and the running state of each Container to the RM; on the other hand, it receives and handles Container start and stop requests from AMs.
- Container: the Container is YARN's resource abstraction. It encapsulates the resources of a node, such as memory and CPU. An application is assigned Containers and can use only the resources described in them. Because Containers are dynamic units of resource division, they allow resources to be used more efficiently.
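The relationship between a NodeManager and its Containers can be illustrated with a small toy model. This is only a sketch and not YARN's actual API: the class names mirror the concepts above, while the fields (`memory_mb`, `vcores`) and the methods `launch`, `stop`, and `heartbeat` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Container:
    """A bundle of resources granted on one node (hypothetical model)."""
    node: str
    memory_mb: int
    vcores: int


@dataclass
class NodeManager:
    """Tracks the free resources of one node and hands out Containers."""
    name: str
    free_memory_mb: int
    free_vcores: int
    running: list = field(default_factory=list)

    def launch(self, memory_mb: int, vcores: int) -> Container:
        # Refuse requests that exceed what is left on this node.
        if memory_mb > self.free_memory_mb or vcores > self.free_vcores:
            raise RuntimeError(f"{self.name}: not enough free resources")
        self.free_memory_mb -= memory_mb
        self.free_vcores -= vcores
        container = Container(self.name, memory_mb, vcores)
        self.running.append(container)
        return container

    def stop(self, container: Container) -> None:
        # Give the Container's resources back to the node.
        self.running.remove(container)
        self.free_memory_mb += container.memory_mb
        self.free_vcores += container.vcores

    def heartbeat(self) -> dict:
        # Snapshot of the kind of information a node would report to the RM.
        return {
            "node": self.name,
            "free_memory_mb": self.free_memory_mb,
            "free_vcores": self.free_vcores,
            "containers": len(self.running),
        }


nm = NodeManager("node-1", free_memory_mb=8192, free_vcores=4)
c = nm.launch(memory_mb=2048, vcores=1)
print(nm.heartbeat())
```

A task running in `c` is limited to the 2048 MB and 1 vcore recorded in it; calling `nm.stop(c)` returns those resources to the node.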
2. YARN task submission process
After an application is submitted to YARN, YARN runs it in two stages: first, it starts the ApplicationMaster; second, the ApplicationMaster creates the application, requests resources for it, and monitors its execution until it finishes.
Specific steps:
(1) The user submits an application to YARN and specifies the ApplicationMaster program.
(2) The ResourceManager allocates a Container and communicates with the corresponding NodeManager to start the ApplicationMaster in that Container.
(3) The ApplicationMaster registers with the ResourceManager, splits the job into internal tasks, requests resources for each task, and then monitors the tasks until they finish.
(4) The ApplicationMaster requests resources from the RM by polling.
(5) Once the AM has obtained resources, it communicates with the corresponding NodeManager to start the task.
(6) After a task starts, it reports its status and progress to the AM, so that when a task fails the AM can request resources again and restart it.
(7) When all tasks are complete, the AM unregisters from the RM and shuts itself down.
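The steps above can be traced with a toy event log. This is a sketch of the message order only, not real YARN client code: the function `run_application`, its `fail_once` parameter, and the task names are hypothetical, and every task is assumed to fail at most once.

```python
def run_application(tasks, fail_once=()):
    """Return the ordered event log for one application (toy model)."""
    log = [
        "client: submit application and name its ApplicationMaster",  # step 1
        "RM: allocate a Container and ask an NM to start the AM",      # step 2
        "AM: register with RM and split the job into tasks",           # step 3
    ]
    pending = list(tasks)
    failed_before = set()
    while pending:
        task = pending.pop(0)
        log.append(f"AM: poll RM for a Container for {task}")          # step 4
        log.append(f"AM: ask NM to start {task}")                      # step 5
        if task in fail_once and task not in failed_before:
            failed_before.add(task)
            log.append(f"{task}: report FAILED to AM")                 # step 6
            pending.append(task)  # the AM requests resources again and retries
        else:
            log.append(f"{task}: report SUCCEEDED to AM")              # step 6
    log.append("AM: unregister from RM and shut down")                 # step 7
    return log


events = run_application(["map-0", "map-1"], fail_once={"map-0"})
for event in events:
    print(event)
```

In this run `map-0` fails once, so the AM goes back through steps (4) and (5) for it before unregistering.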
3. The three YARN resource schedulers
YARN offers three schedulers to choose from: FIFO Scheduler, Capacity Scheduler, and Fair Scheduler.
The Apache distribution of Hadoop uses the Capacity Scheduler by default, while CDH uses the Fair Scheduler by default.
FIFO Scheduler (first come, first served):
The FIFO Scheduler places applications in a queue in the order they are submitted. It is a first-in, first-out queue: when allocating resources, the scheduler first serves the application at the head of the queue, and only after that application's requirements are met does it serve the next one, and so on.
The FIFO Scheduler is the simplest and easiest scheduler to understand, and it needs no configuration, but it is not suitable for shared clusters. A large application may take up all cluster resources and block every other application. For example, if a big task is running and occupies all the resources, a small task submitted afterwards stays blocked until the big task releases them.
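The head-of-line blocking described above can be reproduced with a small simulation. This is a simplified sketch, not the real FIFO Scheduler: the function `fifo_schedule` is hypothetical, and each job is assumed to need a fixed number of containers for a fixed number of time units.

```python
from collections import deque


def fifo_schedule(jobs, capacity):
    """jobs: (name, containers, duration) tuples in submission order.

    Returns {name: (start_time, finish_time)} under strict FIFO allocation.
    """
    queue = deque(jobs)
    running = []  # (finish_time, name, containers)
    free = capacity
    clock = 0
    times = {}
    while queue or running:
        # Release the containers of every job that has finished by now.
        for item in [r for r in running if r[0] <= clock]:
            running.remove(item)
            free += item[2]
        # Strict FIFO: only the job at the head of the queue may start.
        while queue and queue[0][1] <= free:
            name, containers, duration = queue.popleft()
            free -= containers
            running.append((clock + duration, name, containers))
            times[name] = (clock, clock + duration)
        if running:
            clock = min(r[0] for r in running)  # jump to the next completion
        elif queue:
            raise ValueError("head job needs more containers than the cluster has")
    return times


result = fifo_schedule([("big", 10, 100), ("small", 1, 5)], capacity=10)
print(result)
```

With a 10-container cluster, the small job cannot start until time 100, even though it needs only one container for 5 time units.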
Capacity Scheduler (capacity scheduler):
With the Capacity Scheduler, a dedicated queue can be set up for small tasks. However, that queue reserves a share of the cluster's resources in advance, so big tasks finish later than they would under the FIFO Scheduler.
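The trade-off can be put into numbers with a back-of-the-envelope model. This sketch makes several assumptions that are not in the original text: each queue holds a single job, work is measured in "container-ticks", queue shares are given in percent, and a queue never borrows idle capacity from another queue. The function `finish_times` and the queue names are hypothetical.

```python
def finish_times(capacity, queue_percent, work):
    """Time each queue's single job needs, given fixed queue shares.

    capacity:      total containers in the cluster
    queue_percent: {queue: share of the cluster in percent}
    work:          {queue: container-ticks needed by that queue's job}
    """
    result = {}
    for queue, percent in queue_percent.items():
        containers = capacity * percent // 100  # containers reserved for the queue
        result[queue] = work[queue] / containers
    return result


work = {"big": 1000, "small": 10}

# One queue with the whole cluster: the big job alone needs 100 ticks.
alone = finish_times(10, {"big": 100}, {"big": 1000})

# 80/20 split: the small job starts at once, but the big job slows down.
split = finish_times(10, {"big": 80, "small": 20}, work)
print(alone, split)
```

Under these assumptions the big job takes 125 ticks instead of 100, which is the price paid for letting the small job start immediately in its own queue.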
Fair Scheduler (fair scheduler):
With the Fair Scheduler, no system resources need to be reserved in advance. The Fair Scheduler dynamically rebalances system resources across all running jobs.
For example, when the first big job is submitted and is the only job running, it gets all the cluster resources. When a second, small task is submitted, the Fair Scheduler allocates half of the resources to it, so the two tasks share the cluster fairly.
Note that under the Fair Scheduler there is a delay between the submission of the second task and the moment it obtains resources, because it has to wait for the first task to release the Containers it occupies. When the small task finishes, it releases its resources and the big task again gets all the system resources. The end result is that the Fair Scheduler achieves high resource utilization while ensuring that small tasks finish in a timely manner.
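The sharing behaviour in this example can be sketched as a small event-driven simulation. It is an idealized model rather than the real Fair Scheduler: capacity is split equally among all active jobs, rebalancing is instantaneous (so the Container-release delay mentioned above is ignored), and work is measured in container-ticks. The function `fair_schedule` and the job sizes are hypothetical.

```python
def fair_schedule(jobs, capacity):
    """jobs: (name, submit_time, work) tuples. Returns {name: finish_time}."""
    pending = sorted(jobs, key=lambda j: j[1])
    remaining = {}  # name -> container-ticks still to do
    finish = {}
    clock = 0.0
    while pending or remaining:
        if not remaining:
            clock = max(clock, pending[0][1])  # idle until the next submission
        while pending and pending[0][1] <= clock:
            name, _, work = pending.pop(0)
            remaining[name] = float(work)
        share = capacity / len(remaining)  # equal share for every active job
        # Advance to the next event: a job finishes or a new job arrives.
        dt = min(w / share for w in remaining.values())
        if pending:
            dt = min(dt, pending[0][1] - clock)
        clock += dt
        for name in list(remaining):
            remaining[name] -= share * dt
            if remaining[name] <= 1e-9:
                finish[name] = clock
                del remaining[name]
    return finish


finish = fair_schedule([("big", 0, 1000), ("small", 20, 50)], capacity=10)
print(finish)
```

In this run the small job, submitted at time 20, gets half of the 10 containers and finishes at time 30; the big job then regains the whole cluster and finishes at time 105.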