Basic introduction to YARN and the job submission process
2022-07-07 10:32:00 【The story is written in my heart-】
I. YARN basics
1) Introduction to YARN:
YARN is a resource scheduling platform. It is responsible for providing server computing resources to computing programs, so it acts like a distributed operating system, while MapReduce and similar frameworks are the applications that run on that operating system.
2) Shortcomings of Hadoop 1.x (MRv1, before YARN):
- The JobTracker handles all cluster work centrally, so it is a single point of failure.
- The JobTracker has too many responsibilities: it must track the state of every job and also the state of every task in each job, which consumes excessive resources.
- On the TaskTracker side, resources are represented only as a number of map/reduce tasks, which is too coarse: CPU, memory, and other resources are not considered. If two memory-hungry tasks are scheduled onto the same node, an OOM (out of memory) error is likely.
- Resources are rigidly divided into map slots and reduce slots. When there are only map tasks, the reduce slots cannot be used; when there are only reduce tasks, the map slots cannot be used. This easily leads to underutilized resources.
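To make the slot problem concrete, here is a minimal sketch in Python. The slot and task counts are made-up numbers, and the function names are hypothetical; this only illustrates the arithmetic and is not how Hadoop itself measures utilization.

```python
def slot_utilization(map_slots, reduce_slots, map_tasks, reduce_tasks):
    """Fraction of slots in use when slots are typed (Hadoop 1.x style)."""
    # A map task can only occupy a map slot, a reduce task only a reduce slot.
    used = min(map_slots, map_tasks) + min(reduce_slots, reduce_tasks)
    return used / (map_slots + reduce_slots)


def pooled_utilization(total_units, tasks):
    """Fraction in use when any task may take any free unit (a shared pool)."""
    return min(total_units, tasks) / total_units


# Hypothetical node: 4 map slots, 4 reduce slots, and a map-only workload of 8 tasks.
typed = slot_utilization(map_slots=4, reduce_slots=4, map_tasks=8, reduce_tasks=0)
pooled = pooled_utilization(total_units=8, tasks=8)
print(typed, pooled)  # prints "0.5 1.0"
```

With typed slots, half of the node sits idle even though four map tasks are still waiting; with a shared pool, all eight tasks can run.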
3) What YARN brings in Hadoop 2.x:
- The basic idea of MRv2 is to split the two main functions of the original JobTracker, resource management and job scheduling/monitoring, into separate daemons.
- There is one global ResourceManager (RM) and one ApplicationMaster (AM) per application, where an application is either a single MapReduce job or a DAG of jobs. The ResourceManager and the NodeManagers (NM) form the basic data-computation framework. The ResourceManager arbitrates resource usage across the cluster: any client or running ApplicationMaster that wants to run a job or task must request resources from the RM. The ApplicationMaster is a framework-specific library: MapReduce has its own AM implementation, and users can also implement their own. At run time, the AM works with the NMs to start and monitor tasks.
4) Roles in YARN:
ResourceManager: The ResourceManager is the master node of the YARN cluster. It allocates cluster resources according to application demand, coordinates and manages the whole cluster, and responds to the different types of applications that users submit by parsing, scheduling, and monitoring them. The ResourceManager starts one MRAppMaster for each application, and the MRAppMasters are spread across the NodeManagers.
The ResourceManager consists of two parts:
- ApplicationsManager (ASM): manages and monitors the MRAppMaster of every application, starts each application's MRAppMaster, and restarts an MRAppMaster that has failed.
- Scheduler: Hadoop has three main schedulers:
FIFO Scheduler: first in, first out. Jobs submitted first run first, and jobs submitted later wait (not used in production environments).
Capacity Scheduler: allows multiple job queues to be created, each using a share of the total resources. Several queues can run at the same time, but within a single queue jobs are still FIFO (the default scheduler in Hadoop 2.7.2).
Fair Scheduler: the first program to start may take over the resources of other queues (up to 100%). When jobs are later submitted to those queues, the queue that borrowed the resources must hand them back, and returning resources is slow (the default YARN scheduler in CDH).
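The difference between one FIFO queue and several capacity-limited queues can be sketched in a few lines of Python. This is a simplified illustration, not the real Capacity Scheduler: it assumes every job needs exactly one slot, and the queue names, capacities, and the `capacity_schedule` function are hypothetical.

```python
from collections import deque


def capacity_schedule(queues, capacities, total_slots):
    """One scheduling round: each queue starts jobs up to its share of the
    slots, in FIFO order within the queue (assumes one slot per job)."""
    started = []
    for name, jobs in queues.items():
        share = int(total_slots * capacities[name])  # slots reserved for this queue
        pending = deque(jobs)
        for _ in range(min(share, len(pending))):
            started.append((name, pending.popleft()))
    return started


# Hypothetical cluster with 4 slots, split 75% / 25% between two queues.
queues = {"prod": ["p1", "p2", "p3"], "dev": ["d1", "d2"]}
capacities = {"prod": 0.75, "dev": 0.25}
started = capacity_schedule(queues, capacities, total_slots=4)
print(started)
```

Under a single FIFO queue, `d1` would have to wait behind all three `prod` jobs; here it starts in the same round because the `dev` queue has its own share.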
NodeManager:
The NodeManager is the actual provider of resources in a YARN cluster and provides the containers in which applications really run. It monitors the resources that applications use (CPU, network, I/O, memory) and reports its health to the cluster's master node, the ResourceManager, through heartbeats. It also manages the life cycle of containers and monitors the resource usage of each container.
MRAppMaster: requests resources from the ResourceManager for the map tasks and reduce tasks of the current job, monitors their health and progress, restarts failed map and reduce tasks, and reclaims the resources of finished map and reduce tasks.
Container: A container is an abstract logical unit of resources. It is made up of resources dynamically allocated by the ResourceManager's Scheduler service and includes a certain amount of CPU, network, I/O, and memory. Every task of a MapReduce program runs inside a container.
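The idea of a container as a bundle of resources carved out of a node can be modelled as follows. This is a sketch under simplifying assumptions: it tracks only memory and CPU cores (not network or I/O), and the `Container` and `Node` classes and their fields are hypothetical names, not YARN's own API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Container:
    container_id: int
    memory_mb: int
    vcores: int


@dataclass
class Node:
    name: str
    free_memory_mb: int
    free_vcores: int
    containers: list = field(default_factory=list)

    def allocate(self, container_id, memory_mb, vcores):
        """Carve a container out of the node, or return None if it does not fit."""
        if memory_mb > self.free_memory_mb or vcores > self.free_vcores:
            return None
        self.free_memory_mb -= memory_mb
        self.free_vcores -= vcores
        container = Container(container_id, memory_mb, vcores)
        self.containers.append(container)
        return container

    def release(self, container):
        """Give the container's resources back to the node."""
        self.containers.remove(container)
        self.free_memory_mb += container.memory_mb
        self.free_vcores += container.vcores


node = Node("nm1", free_memory_mb=4096, free_vcores=4)
first = node.allocate(1, memory_mb=3072, vcores=2)   # fits
second = node.allocate(2, memory_mb=2048, vcores=1)  # only 1024 MB left, so None
```

Because each request states how much memory and CPU it needs, a second memory-hungry task is refused instead of being packed onto the same node.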
5) Resource scheduling in YARN:
- The client submits the computing job (hadoopxx.jar) to the ResourceManager.
- The ResourceManager starts a container on one of the nodes and runs an MRAppMaster in it.
- The MRAppMaster asks the ResourceManager for resources to run the map tasks and reduce tasks.
- The ResourceManager returns to the MRAppMaster the nodes on which the map tasks and reduce tasks should run.
- The MRAppMaster goes to each of those nodes, starts a container, and runs a map task or reduce task in it.
- The MRAppMaster monitors the health of the map tasks and reduce tasks.
- When a map task or reduce task finishes on its NodeManager, it deregisters with the MRAppMaster and its resources are released.
- The MRAppMaster deregisters with the ResourceManager and releases its own resources.
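The sequence above can be traced with a small simulation that only records who talks to whom. It is a sketch, not YARN code: the `run_job` function is hypothetical, the AppMaster is placed on the first node, and tasks are placed round-robin, both of which are assumptions made to keep the example short.

```python
def run_job(num_maps, num_reduces, nodes):
    """Return the ordered list of events for one job, following the steps above."""
    events = ["client -> RM: submit job"]

    # Assumption: the RM simply picks the first node for the AppMaster.
    events.append(f"RM: start container on {nodes[0]} and launch MRAppMaster")

    tasks = [f"map_{i}" for i in range(num_maps)]
    tasks += [f"reduce_{i}" for i in range(num_reduces)]
    events.append(f"MRAppMaster -> RM: request resources for {len(tasks)} tasks")

    # Assumption: round-robin placement stands in for the real scheduler.
    placement = {task: nodes[i % len(nodes)] for i, task in enumerate(tasks)}
    events.append("RM -> MRAppMaster: return nodes for tasks")

    for task, node in placement.items():
        events.append(f"MRAppMaster: start container on {node} and run {task}")

    # Monitoring is omitted; every task is assumed to succeed.
    for task in tasks:
        events.append(f"{task}: finished, deregister with MRAppMaster")

    events.append("MRAppMaster -> RM: deregister, release resources")
    return events


for line in run_job(num_maps=2, num_reduces=1, nodes=["node1", "node2"]):
    print(line)
```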
II. The YARN job submission process:
- The client sends the ResourceManager a request to run a job (hadoop jar xxxx.jar).
- The ResourceManager checks the request and, if there is no problem, returns a shared resource path and a job ID.
- The client puts the shared resources into the shared path (/tmp/hadoop-yarn/staging/hadoop/.staging/job_1539740094604_0002/):
  - job.jar: the JAR to run, renamed to job.jar
  - job.split: split information (the list returned by FileInputFormat's getSplits)
  - job.xml: configuration information (the series of job.setXxx() calls)
- The client tells the ResourceManager that the shared resources are in place and performs the actual submission of the job.
- The ResourceManager assigns the job a node on which to start the MRAppMaster.
- The ResourceManager goes to that node, starts a container, and then starts the MRAppMaster in it.
- The MRAppMaster downloads resources from the shared resource path (mainly the split and job files).
- The MRAppMaster initializes the job and creates a job workbook, which records the progress and status of the map tasks and reduce tasks.
- The MRAppMaster asks the ResourceManager for resources to run the map tasks and reduce tasks. Requests for map tasks are sent first, then requests for reduce tasks.
- The ResourceManager returns the resource nodes for the map tasks and reduce tasks to the MRAppMaster. Nodes are chosen on a proximity principle: preference goes to the node that actually holds the split the map task will process, so that the data can be processed locally. If the data has several replicas, any node holding a replica will do. A reduce task is started on any node that is not busy.
- The MRAppMaster goes to the corresponding node, starts a container, and then starts the map task in that container.
- The map task downloads its resources (the JAR to run) from the shared resource path.
- The map task starts and reports its status and progress to the MRAppMaster.
- As soon as one map task has completed, a container is started for the reduce side and the reduce task is started in it. At this point the reduce task only pulls data and performs no computation.
- The reduce task downloads its resources (the JAR to run) from the shared resource path. Once all map tasks have finished, the reduce task starts computing.
- When a map task or reduce task finishes running, it deregisters with the MRAppMaster and releases its resources.
- When the whole application has finished, the MRAppMaster deregisters with the ResourceManager and releases its resources.
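The proximity principle used when the ResourceManager picks a node for a map task can be sketched as below. This is an illustration only: the `pick_map_node` function and the node names are hypothetical, rack awareness is ignored, and falling back to the alphabetically first free node is an assumption made so the result is deterministic.

```python
def pick_map_node(split_replicas, free_nodes):
    """Prefer a free node that holds a replica of the split;
    otherwise fall back to any free node; otherwise wait."""
    for node in split_replicas:
        if node in free_nodes:
            return node, "data-local"
    if free_nodes:
        # Assumption: any free node will do, so take the first by name.
        return sorted(free_nodes)[0], "remote"
    return None, "wait"


# The split has three replicas; only n2 and n4 currently have free capacity.
choice = pick_map_node(["n1", "n2", "n3"], {"n2", "n4"})
print(choice)  # prints "('n2', 'data-local')"
```

When no replica holder is free, the map task still runs, but it has to read its split over the network instead of from local disk.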