YARN: a distributed resource management and task scheduling framework
2022-07-05 22:22:00 【m0_62295009】
In Hadoop 1.x, MapReduce uses a Master/Slave architecture: one JobTracker manages multiple TaskTrackers. This design is known as MRv1.
Main functions of the JobTracker
Resource management
Task scheduling
Main functions of the TaskTracker
Execute tasks and respond to JobTracker commands
Report heartbeats to the JobTracker
Main problems
The JobTracker is a single point of failure: if it goes down, the whole system stops working
The JobTracker carries too much load
Only the MapReduce computing framework is supported, which suits batch, disk-based processing
Resource management and computation are not decoupled, so a cluster can run only one computing framework
YARN features
Resource management is decoupled from the computing framework, so one cluster's resources are shared among the computing frameworks running on top of it
Data within the cluster is shared consistently: it no longer needs to be copied and moved between clusters, so frameworks can share it and interoperate
The single point of failure is avoided, and cluster resource scaling is handled in a reasonable way
YARN operation flow
ResourceManager (resource management):
ResourceScheduler (resource scheduling)
ApplicationsManager (application management)
NodeManager: node resource management
ApplicationMaster: task management
Task: where the work actually runs (inside a Container)
The client submits the job to the ResourceManager, including the ApplicationMaster program and the command to start the ApplicationMaster
The ResourceManager allocates the first Container and communicates with the corresponding NodeManager, asking it to start the job's ApplicationMaster in that Container
The NodeManager starts a Container and runs the ApplicationMaster in it
The ApplicationMaster first registers with the ResourceManager, so that users can query the job's running status through the ResourceManager; it then requests resources for its tasks from the ResourceManager and monitors their running status
Once the ApplicationMaster has obtained resources, it communicates with the corresponding NodeManagers to start the tasks
The NodeManager receives the ApplicationMaster's command and starts the task in a Container
Each Container reports task status and progress to the ApplicationMaster over RPC, so the ApplicationMaster can restart a task when it fails
When the job completes, the ApplicationMaster deregisters from the ResourceManager and shuts itself down
The ResourceManager monitors the NodeManagers and the ApplicationMaster
Each NodeManager periodically reports its resource usage and running status to the ResourceManager
The ApplicationMaster monitors its tasks and can ask a NodeManager to restart a task
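The submission flow above can be sketched as a small in-memory simulation. This is only a toy model under stated assumptions, not YARN's actual API: the class names mirror the YARN roles, but run_application, the "free slots" resource model, and the "most free slots first" placement rule are all hypothetical simplifications.

```python
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    name: str
    free_slots: int                       # Containers this node can still host (toy model)
    running: list = field(default_factory=list)

    def start_container(self, label):
        # Start a Container on this node and run `label` inside it.
        if self.free_slots == 0:
            raise RuntimeError(f"{self.name} is full")
        self.free_slots -= 1
        self.running.append(label)

    def stop_container(self, label):
        self.running.remove(label)
        self.free_slots += 1


class ResourceManager:
    def __init__(self, nodes):
        self.nodes = nodes
        self.apps = {}                    # application name -> status

    def allocate(self):
        # Hypothetical placement rule: the node with the most free slots wins.
        node = max(self.nodes, key=lambda n: n.free_slots)
        if node.free_slots == 0:
            raise RuntimeError("no free Container in the cluster")
        return node


def run_application(rm, app, tasks):
    """Walk through the flow once and return a log of the steps taken."""
    log = [f"client submits {app} to ResourceManager"]
    # The ResourceManager allocates the first Container for the ApplicationMaster.
    am_node = rm.allocate()
    am_label = f"AM({app})"
    am_node.start_container(am_label)
    log.append(f"{am_node.name} starts {am_label}")
    # The ApplicationMaster registers, then requests one Container per task.
    rm.apps[app] = "RUNNING"
    log.append(f"{am_label} registers with ResourceManager")
    placed = []
    for task in tasks:
        node = rm.allocate()
        node.start_container(task)
        placed.append((node, task))
        log.append(f"{node.name} starts {task}")
    # Tasks report back; here every task simply succeeds.
    for node, task in placed:
        node.stop_container(task)
        log.append(f"{task} reports completion to {am_label}")
    # The ApplicationMaster deregisters and releases its own Container.
    rm.apps[app] = "FINISHED"
    am_node.stop_container(am_label)
    log.append(f"{am_label} unregisters and exits")
    return log


rm = ResourceManager([NodeManager("nm1", 2), NodeManager("nm2", 2)])
for line in run_application(rm, "wordcount", ["map-0", "map-1", "reduce-0"]):
    print(line)
```

In this run one of the map tasks lands on the same node as the ApplicationMaster, which the toy placement rule allows. Failure handling (the ApplicationMaster restarting a failed task) is left out to keep the sketch short.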
The Containers an application needs fall into two main categories:
(1) The Container that runs the ApplicationMaster: it is requested and started by the ResourceManager (through its internal resource scheduler). When submitting an application, the user can specify the resources required by its single ApplicationMaster;
(2) The Containers that run the various tasks: they are requested from the ResourceManager by the ApplicationMaster and started by the ApplicationMaster communicating with the NodeManagers.
Both kinds of Container may be on any node, and their placement is generally random, so the ApplicationMaster may run on the same node as the tasks it manages.
Scheduling strategies
FIFO Scheduler: first in, first out
Capacity Scheduler: each queue has its own dedicated lane of capacity, which other queues do not take over
Fair Scheduler: others may use your idle share, but when you need it yourself it is taken back from them, which can cost them some of their work
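The difference between the three strategies can be illustrated with a minimal sketch that divides a fixed number of Containers among competing queues. The functions fifo, capacity, and fair are hypothetical names, not YARN's scheduler code. The capacity version follows the simplified "dedicated lane" description above and does not lend idle capacity, whereas the real Capacity Scheduler has more configuration options than this.

```python
def fifo(total, demands):
    """Serve requests strictly in arrival order until nothing is left."""
    grant, remaining = {}, total
    for app, want in demands.items():     # dicts keep insertion (arrival) order
        grant[app] = min(want, remaining)
        remaining -= grant[app]
    return grant


def capacity(total, percent, demands):
    """Each queue may use at most its own fixed percentage of the cluster."""
    return {q: min(demands[q], total * percent[q] // 100) for q in demands}


def fair(total, demands):
    """Hand out Containers one at a time, round-robin, to unsatisfied apps."""
    grant = {app: 0 for app in demands}
    remaining = total
    while remaining > 0:
        hungry = [a for a in demands if grant[a] < demands[a]]
        if not hungry:
            break
        for app in hungry:
            if remaining == 0:
                break
            grant[app] += 1
            remaining -= 1
    return grant


demands = {"a": 8, "b": 4, "c": 2}        # Containers each queue asks for
print(fifo(10, demands))
print(capacity(10, {"a": 50, "b": 30, "c": 20}, demands))
print(fair(10, demands))
```

With 10 Containers, FIFO lets the first queue take everything it asked for and can starve later arrivals, the capacity split caps each queue at its own share, and the fair split evens things out. Calling fair(10, {"a": 10}) and then fair(10, {"a": 10, "b": 6}) shows the give-back effect: a queue that held the whole cluster alone drops to half once a second queue shows up.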
YARN shell commands
View version information: yarn version
Submit a jar package with yarn:
yarn jar jarName mainClassPath -Dk1=v1 -Dk2=v2 inputPath outputPath
List all applications: yarn application -list
Kill a specific application: yarn application -kill app-id
View YARN's current resource usage: yarn top
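When these commands are driven from a script, it can help to build the argument lists in one place. The sketch below only assembles and prints command lines; it does not call yarn. The helper names yarn_jar and yarn_kill, the jar name, the class name, the paths, and the application id are all made-up examples.

```python
import re
import shlex

# YARN application ids look like application_<cluster timestamp>_<sequence number>.
APP_ID = re.compile(r"application_\d+_\d+")


def yarn_jar(jar, main_class, props, input_path, output_path):
    """Build the argument list for `yarn jar` with -Dkey=value properties."""
    args = ["yarn", "jar", jar, main_class]
    args += [f"-D{key}={value}" for key, value in props.items()]
    return args + [input_path, output_path]


def yarn_kill(app_id):
    """Build the argument list for `yarn application -kill <app-id>`."""
    if not APP_ID.fullmatch(app_id):
        raise ValueError(f"not a YARN application id: {app_id!r}")
    return ["yarn", "application", "-kill", app_id]


# Hypothetical values, for illustration only.
print(shlex.join(yarn_jar("wc.jar", "com.example.WordCount",
                          {"k1": "v1", "k2": "v2"}, "/input", "/output")))
print(shlex.join(yarn_kill("application_1657030000000_0001")))
```

The resulting lists can be passed to subprocess.run on a machine where the yarn command is installed. Note that -D properties placed after the main class are only picked up if the job's main class parses Hadoop's generic options, for example by running through ToolRunner.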