Resource scheduling and task scheduling in Spark
2022-06-28 09:03:00 【Endless learning WangXiaoShuai】
Spark resource scheduling and task scheduling

- Spark resource scheduling and task scheduling process:
After the cluster starts, each Worker node reports its resources to the Master node, so the Master knows the resources available across the cluster. When an Application is submitted to Spark, a DAG (directed acyclic graph) is built from the dependencies between its RDDs. After submission, Spark creates two objects on the Driver side: DAGScheduler and TaskScheduler.
DAGScheduler is the high-level scheduler. Its main job is to split the DAG into Stages according to the wide and narrow dependencies between RDDs, and then submit each Stage to TaskScheduler as a TaskSet. TaskScheduler is the low-level scheduler. A TaskSet is simply a collection of tasks, one for each unit of parallelism in the stage. TaskScheduler iterates over the TaskSet and sends each task to an Executor on a compute node, where it runs in the Executor's thread pool (ThreadPool).
Each task reports its execution status back to TaskScheduler. When a task fails, TaskScheduler retries it by resending it to an Executor, 3 times by default. If the task still fails after 3 retries, the stage that contains it fails. A failed stage is retried by DAGScheduler, which resubmits the TaskSet to TaskScheduler; a stage is retried 4 times by default. If the stage still fails after 4 retries, the job fails, and when the job fails, the Application fails.
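The two-level retry policy described above can be sketched as a toy model in plain Python. This is not Spark's scheduler code: `run_task`, `run_stage`, and `make_flaky` are hypothetical names, tasks run one after another here instead of in parallel, and the retry counts are simply the ones quoted in the text. In a real deployment these limits come from configuration (for example `spark.task.maxFailures`), so check the documentation for your Spark version.

```python
from typing import Callable, Dict, List, Tuple

TASK_RETRIES = 3   # retries per task, as quoted in the text
STAGE_RETRIES = 4  # retries per stage, as quoted in the text


def run_task(task: Callable[[], None], retries: int = TASK_RETRIES) -> bool:
    """Run one task, retrying on failure. True if any attempt succeeds."""
    for _attempt in range(1 + retries):  # first attempt plus the retries
        try:
            task()
            return True
        except Exception:
            continue  # the TaskScheduler role: resend the task
    return False


def run_stage(tasks: List[Callable[[], None]], retries: int = STAGE_RETRIES) -> bool:
    """Run all tasks of a stage; resubmit the whole task set if a task gives up."""
    for _attempt in range(1 + retries):  # the DAGScheduler role: resubmit the TaskSet
        if all(run_task(t) for t in tasks):
            return True
    return False  # in the text's model, the job (and the Application) fails here


def make_flaky(failures_before_success: int) -> Tuple[Callable[[], None], Dict[str, int]]:
    """Hypothetical task that fails a fixed number of times and then succeeds."""
    state = {"calls": 0}

    def task() -> None:
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise RuntimeError("simulated task failure")

    return task, state
```

A task built with `make_flaky(3)` succeeds on its fourth attempt, so `run_task` returns `True`. A task built with `make_flaky(4)` exhausts its retries, which in this model fails the stage attempt and triggers a resubmission in `run_stage`.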
TaskScheduler retries not only failed tasks but also straggling tasks, that is, tasks that run much slower than the other tasks. When it finds a slow task, TaskScheduler launches a new task that runs the same processing logic as the slow one. Whichever of the two finishes first supplies the result. This is Spark's speculative execution mechanism. Speculative execution is off by default in Spark and is configured through the spark.speculation property.
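As a rough illustration of how a scheduler might decide that a task is a straggler, the sketch below flags running tasks whose elapsed time exceeds a multiple of the median duration of the tasks that have already finished, and only once a given fraction of the stage has completed. The function `find_stragglers` and its inputs are hypothetical. Spark exposes tuning properties in this spirit (`spark.speculation.quantile` and `spark.speculation.multiplier`), but the values used here are illustrative assumptions, not Spark's defaults.

```python
from statistics import median
from typing import Dict, List


def find_stragglers(
    finished_durations: List[float],
    running_elapsed: Dict[str, float],
    total_tasks: int,
    quantile: float = 0.75,   # illustrative: fraction of tasks that must be done first
    multiplier: float = 1.5,  # illustrative: how many times slower than the median
) -> List[str]:
    """Return ids of running tasks that look slow enough to launch a duplicate for."""
    # Do not speculate until enough tasks have finished to give a stable baseline.
    if total_tasks == 0 or len(finished_durations) < quantile * total_tasks:
        return []
    threshold = multiplier * median(finished_durations)
    return sorted(tid for tid, elapsed in running_elapsed.items() if elapsed > threshold)


# Three of four tasks finished in about 11 s; the fourth has been running for 40 s.
print(find_stragglers([10.0, 12.0, 11.0], {"task-3": 40.0}, total_tasks=4))
# ['task-3']
```

Under data skew the duplicate receives the same oversized partition, so it is just as slow as the original task; the check above cannot tell skew apart from a slow node.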
Note:
- For ETL jobs that write into a database, turn speculative execution off so that no duplicate data is written.
- If the data is skewed and speculative execution is enabled, the skewed task may be relaunched again and again to process the same logic, and the job may never finish.
- Diagram: Spark resource scheduling and task scheduling process

- Coarse-grained and fine-grained resource requests
- Coarse-grained resource requests (Spark)
All resources are requested before the Application runs. Tasks are scheduled only after the request succeeds, and the resources are released only after all tasks have finished.
Advantage: because all resources are acquired before the Application runs, each task uses them directly and does not need to request resources itself before executing. Tasks start quickly, so stages, jobs, and the Application all run faster.
Disadvantage: resources are released only when the last task finishes, so the cluster's resources cannot be fully utilized.
- Fine-grained resource requests (MapReduce)
The Application does not request resources before it runs. It starts executing directly, and each task in a job requests its own resources before executing and releases them when it finishes.
Advantage: the cluster's resources can be fully utilized.
Disadvantage: each task requests its own resources, so tasks start slowly and the Application runs correspondingly slower.
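The trade-off between the two models can be made concrete with a small back-of-the-envelope calculation. The sketch below is a toy model under stated assumptions, not a measurement: the function names, the stage layout, and the 2-second request cost are all hypothetical, tasks within a stage are assumed to run fully in parallel, and stages run one after another.

```python
from typing import List, Tuple

Stage = List[int]  # task durations in seconds; tasks of one stage run in parallel


def coarse_grained(stages: List[Stage], request_cost: int) -> Tuple[int, int]:
    """Request every slot once up front; release them after the last task."""
    slots = max(len(stage) for stage in stages)       # enough for the widest stage
    run_time = sum(max(stage) for stage in stages)    # stages run back to back
    wall_time = request_cost + run_time               # a single request delay
    slot_seconds_held = slots * run_time              # all slots held until the end
    return wall_time, slot_seconds_held


def fine_grained(stages: List[Stage], request_cost: int) -> Tuple[int, int]:
    """Each task requests its own slot before running and frees it when done."""
    wall_time = sum(request_cost + max(stage) for stage in stages)  # delay per stage
    slot_seconds_held = sum(sum(stage) for stage in stages)         # held only while running
    return wall_time, slot_seconds_held


# Hypothetical job: a wide stage of four 10 s tasks, then a single 20 s task.
stages = [[10, 10, 10, 10], [20]]
print(coarse_grained(stages, request_cost=2))  # (32, 120)
print(fine_grained(stages, request_cost=2))    # (34, 60)
```

In this example the coarse-grained run finishes sooner (32 s against 34 s) but holds twice as many slot-seconds, because three slots sit idle while the final task runs. That matches the advantages and disadvantages listed above.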
