Yarn organizational structure
2022-07-06 09:34:00 【Prism 7】
1. YARN cluster architecture and working principles
The basic design idea of YARN is to split the JobTracker of MapReduce V1 into two separate services: a global ResourceManager and a per-application ApplicationMaster.
- ResourceManager (responsible for resource management and allocation across the whole system): the RM is the global resource manager. It consists mainly of two components: the Scheduler and the Application Manager.
  - The Scheduler allocates the system's resources to running applications, subject to constraints such as capacity and queues. While guaranteeing capacity, fairness, and service levels, it tries to maximize cluster utilization so that all resources are put to full use.
  - The Application Manager manages all applications in the system: it accepts application submissions, negotiates with the Scheduler for the resources needed to start each ApplicationMaster, monitors each ApplicationMaster's running state, and restarts it on failure.
- ApplicationMaster (responsible for managing a single application): each application submitted by a user has its own ApplicationMaster (AM). Its main functions are:
  - Negotiate with the RM Scheduler to obtain resources, which are expressed as Containers.
  - Further assign the obtained resources to its internal tasks.
  - Communicate with the NodeManager (NM) to start and stop tasks.
  - Monitor the status of all internal tasks and, when a task fails, request resources again to restart it.
- NodeManager: the NodeManager (NM) is the resource and task manager on each node. On the one hand, it periodically reports the node's resource usage and the running state of each Container to the RM; on the other hand, it receives and handles Container start and stop requests from the AM.
- Container: the Container is YARN's resource abstraction. It encapsulates resources on a node, such as memory and CPU. An application is allocated Containers, and work running in a Container can use only the resources that Container describes. Because the Container is a dynamic unit of resource division, it allows resources to be used more effectively.
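As a rough illustration of the Container idea, the sketch below models a node that carves Containers out of its free resources. The class names (`Resource`, `Container`, `Node`) and the choice of memory and vcores as the only dimensions are assumptions made for this example; they are not YARN's actual API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Resource:
    """Hypothetical resource vector: memory in MB plus virtual cores."""
    memory_mb: int
    vcores: int

    def fits_in(self, other: "Resource") -> bool:
        return self.memory_mb <= other.memory_mb and self.vcores <= other.vcores


@dataclass(frozen=True)
class Container:
    """A fixed slice of one node's resources; the holder may use no more than this."""
    node: str
    resource: Resource


@dataclass
class Node:
    """A node tracking how much of its resources is still unallocated."""
    name: str
    free: Resource

    def allocate(self, request: Resource) -> Optional[Container]:
        # Refuse the request if the node does not have enough free resources.
        if not request.fits_in(self.free):
            return None
        # Otherwise shrink the free pool and hand back a Container describing the slice.
        self.free = Resource(self.free.memory_mb - request.memory_mb,
                             self.free.vcores - request.vcores)
        return Container(self.name, request)
```

For example, a node with 8192 MB and 4 vcores that grants a 2048 MB / 1 vcore request is left with 6144 MB and 3 vcores, and a later request for 16384 MB is refused.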
2. YARN task submission process
After an application is submitted to YARN, YARN runs it in two stages: the first is to start the ApplicationMaster; the second is for the ApplicationMaster to create the application, request resources for it, and monitor its execution until it finishes.
Specific steps :
(1) The user submits an application to YARN and specifies the ApplicationMaster program.
(2) The ResourceManager allocates a Container and communicates with the corresponding NodeManager to start the ApplicationMaster in that Container.
(3) The ApplicationMaster registers with the ResourceManager, then splits the job into internal tasks, requests resources for each task, and monitors the tasks until they finish.
(4) The ApplicationMaster requests resources from the RM by polling.
(5) Once the AM has obtained resources, it communicates with the corresponding NodeManager to start the task.
(6) After a task starts, it reports its status and progress to the AM, so that if the task fails the AM can request resources again and restart it.
(7) When all tasks are complete, the AM unregisters from the RM and shuts itself down.
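The seven steps above can be traced with a small simulation. This is only a sketch of the message order, not real YARN client code: the function `run_application`, its parameters, and the log strings are all invented for illustration, and tasks are started one at a time for simplicity.

```python
def run_application(task_count: int, fail_once: frozenset = frozenset()) -> list:
    """Return the ordered event log for one simulated application.

    task_count: number of internal tasks the AM splits the job into.
    fail_once:  ids of tasks that fail on their first attempt (hypothetical knob).
    """
    log = []
    log.append("client: submit application and specify AM")          # step 1
    log.append("RM: allocate container, ask NM to start AM")         # step 2
    log.append("AM: register with RM")                               # step 3
    pending = list(range(task_count))
    already_failed = set()
    while pending:
        task = pending.pop(0)
        log.append(f"AM: poll RM for a container for task {task}")   # step 4
        log.append(f"AM: ask NM to start task {task}")               # step 5
        if task in fail_once and task not in already_failed:         # step 6
            already_failed.add(task)
            log.append(f"task {task}: reports failure to AM")
            pending.append(task)  # AM will request resources again and retry
        else:
            log.append(f"task {task}: reports success to AM")
    log.append("AM: unregister from RM and shut down")               # step 7
    return log
```

Calling `run_application(2, frozenset({0}))` shows task 0 being started twice: once for the failed attempt and once after the AM reapplies for resources.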
3. The three YARN resource schedulers
YARN offers three schedulers to choose from: the FIFO Scheduler, the Capacity Scheduler, and the Fair Scheduler.
Apache Hadoop uses the Capacity Scheduler by default; CDH uses the Fair Scheduler by default.
FIFO Scheduler (first come, first served):
The FIFO Scheduler places applications in a single first-in, first-out queue in the order they are submitted. When allocating resources, it serves the application at the head of the queue first; only after that application's requirements are met does it move on to the next one, and so on.
The FIFO Scheduler is the simplest scheduler and the easiest to understand, and it needs no configuration, but it is not suitable for shared clusters. A large application may take up all cluster resources and block every other application. For example, if a big job is running and occupies all resources, a small job submitted afterwards stays blocked until the big job finishes.
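The blocking effect can be shown with a toy model. The sketch assumes that all jobs are submitted at time 0 and that each job occupies the whole cluster while it runs; `fifo_schedule` and the job names are hypothetical.

```python
def fifo_schedule(jobs):
    """Toy FIFO model.

    jobs: list of (name, duration) in submission order; each job is assumed
          to occupy the entire cluster for its duration.
    Returns {name: (start_time, finish_time)}.
    """
    clock = 0
    result = {}
    for name, duration in jobs:
        # A job can only start once every job ahead of it has finished.
        result[name] = (clock, clock + duration)
        clock += duration
    return result


times = fifo_schedule([("big", 100), ("small", 5)])
```

Here the small job needs only 5 time units but cannot start until the big job has finished at time 100.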
Capacity Scheduler (capacity scheduler):
With the Capacity Scheduler, a dedicated queue can be set aside for small jobs so that they start without waiting behind large ones. However, a dedicated queue reserves a share of cluster resources in advance, so a big job has fewer resources available and takes longer to finish than it would under the FIFO Scheduler.
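The trade-off can be made concrete with a toy calculation. The sketch assumes one job per queue, fixed queue shares given as whole percentages, and no borrowing of idle capacity between queues; the function name, queue names, and numbers are invented for illustration.

```python
def capacity_finish_times(jobs, queue_percent):
    """Toy capacity model.

    jobs:          {job_name: (queue_name, work)}, where work is the job's
                   run time if it had the whole cluster to itself.
    queue_percent: {queue_name: share of the cluster in percent}.
    Returns {job_name: finish_time}, with every job starting at time 0.
    """
    return {
        # A job confined to p percent of the cluster runs 100/p times slower.
        name: work * 100 / queue_percent[queue]
        for name, (queue, work) in jobs.items()
    }


finish = capacity_finish_times(
    {"big": ("default", 100), "small": ("small_jobs", 5)},
    {"default": 80, "small_jobs": 20},
)
```

With an 80/20 split the small job starts immediately and finishes at time 25, while the big job finishes at time 125 instead of the 100 it would need with the whole cluster.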
Fair Scheduler (fair scheduler):
With the Fair Scheduler, there is no need to reserve system resources in advance; the scheduler dynamically rebalances resources among all running jobs.
For example, when the first big job is submitted and is the only job running, it gets all cluster resources. When a second, small job is submitted, the Fair Scheduler allocates half of the resources to it, so that the two jobs share the cluster fairly.
Note that with the Fair Scheduler there is a delay between the submission of the second job and the moment it obtains resources, because it must wait for the first job to release the Containers it occupies. When the small job finishes, it releases its resources as well, and the big job again gets all system resources. The end result is that the Fair Scheduler achieves high resource utilization while still letting small jobs finish in a timely manner.
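The big-job/small-job example can be traced with a small event-driven model. It assumes that all active jobs split the cluster equally and that rebalancing is instantaneous, so the Container-release delay described above is ignored; `fair_finish_times` and its inputs are hypothetical.

```python
def fair_finish_times(jobs):
    """Toy fair-share model.

    jobs: {name: (arrival_time, work)}, where work is the job's run time
          if it had the whole cluster to itself.
    Returns {name: finish_time}.
    """
    arrivals = sorted((arrival, name) for name, (arrival, _) in jobs.items())
    remaining = {}   # work left for each active job
    finish = {}
    clock = 0.0
    while arrivals or remaining:
        if not remaining:
            # Cluster is idle: jump ahead to the next arrival.
            clock = max(clock, arrivals[0][0])
        while arrivals and arrivals[0][0] <= clock:
            _, name = arrivals.pop(0)
            remaining[name] = float(jobs[name][1])
        rate = 1.0 / len(remaining)  # equal share per active job
        to_finish = min(remaining.values()) / rate
        to_arrival = arrivals[0][0] - clock if arrivals else float("inf")
        step = min(to_finish, to_arrival)
        for name in list(remaining):
            remaining[name] -= step * rate
            if remaining[name] <= 1e-9:
                finish[name] = clock + step
                del remaining[name]
        clock += step
    return finish


finish = fair_finish_times({"big": (0, 100), "small": (10, 5)})
```

In this run the small job arrives at time 10, shares the cluster for 10 time units, and finishes at time 20; the big job then reclaims the whole cluster and finishes at time 105.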