How RDDs Execute
2022-06-24 06:44:00 【斯沃福德】
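From a computational point of view, data processing needs compute resources (memory and CPU) and a compute model (the logic). At execution time, the resources and the model have to be coordinated and brought together.
Process overview:
① Prepare the resources.
② Create the Driver and Executor nodes.
③ Break the application's data-processing logic down into individual compute tasks.
④ Send each task to an Executor that has already been allocated resources, where the data is computed according to the specified compute model, producing the final result.
1. Start the YARN cluster environment (prepare resources)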

2. Spark requests resources and creates the scheduling node (Driver) and the compute nodes (Executors)

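Both the Driver and the Executors run on NodeManagers! (This holds for YARN cluster mode; in client mode the Driver runs in the process that submitted the application.)
The ResourceManager only manages resources, so it is the NodeManagers that actually run the tasks.
3. The Spark framework splits the compute logic into separate tasks according to the partitions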

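The Driver schedules tasks across the Executor nodes.
Multiple RDDs are chained together through their dependencies, then broken down into multiple tasks and placed in a task pool (TaskPool), because the tasks still have to be scheduled.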
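The split into one task per partition can be sketched in plain Python. This is only an illustration of the idea, not Spark's implementation: `Task`, `split_into_partitions`, `double_all` and `task_pool` are hypothetical names introduced here, and the tasks are run locally instead of being shipped to Executors.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Task:
    """Hypothetical unit of work: the shared logic applied to one partition."""
    partition_id: int
    data: List[int]
    compute: Callable[[List[int]], List[int]]

    def run(self) -> List[int]:
        return self.compute(self.data)


def split_into_partitions(data: List[int], num_partitions: int) -> List[List[int]]:
    """Slice the input into num_partitions contiguous chunks of near-equal size."""
    n = len(data)
    bounds = [i * n // num_partitions for i in range(num_partitions + 1)]
    return [data[bounds[i]:bounds[i + 1]] for i in range(num_partitions)]


def double_all(chunk: List[int]) -> List[int]:
    """The 'compute model': the same logic is applied to every partition."""
    return [x * 2 for x in chunk]


partitions = split_into_partitions([1, 2, 3, 4, 5, 6], 3)

# One task per partition, queued in a pool until the scheduler hands them out.
task_pool = deque(Task(i, part, double_all) for i, part in enumerate(partitions))

# Run the tasks locally here; in Spark each one would execute on an Executor.
results = [task.run() for task in task_pool]
print(results)
```

The number of tasks follows directly from the number of partitions, which is why partitioning determines how much parallelism is available.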
4. The Driver sends each task to a suitable compute node according to the state of the nodes

The Driver takes tasks out of the task pool and sends them to different Executors for computation, based on node state and each task's preferred location.
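A minimal sketch of that placement decision, assuming a deliberately simplified policy: use the preferred host if it has a free slot, otherwise fall back to the Executor with the most free slots. `Executor`, `PendingTask`, `pick_executor` and `schedule` are hypothetical names for illustration; Spark's real scheduler is considerably more elaborate (several locality levels, wait timeouts and so on).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Executor:
    host: str
    free_slots: int                      # stands in for "node state"
    assigned: List[int] = field(default_factory=list)


@dataclass
class PendingTask:
    task_id: int
    preferred_host: Optional[str]        # where the task's data lives, if known


def pick_executor(task: PendingTask, executors: List[Executor]) -> Optional[Executor]:
    """Prefer the task's preferred host; otherwise take the least loaded executor."""
    candidates = [e for e in executors if e.free_slots > 0]
    if not candidates:
        return None
    for e in candidates:
        if e.host == task.preferred_host:
            return e
    return max(candidates, key=lambda e: e.free_slots)


def schedule(tasks: List[PendingTask], executors: List[Executor]) -> Dict[int, str]:
    """Assign tasks in order until the cluster runs out of free slots."""
    placement: Dict[int, str] = {}
    for task in tasks:
        chosen = pick_executor(task, executors)
        if chosen is None:
            break                        # no capacity left; remaining tasks wait
        chosen.free_slots -= 1
        chosen.assigned.append(task.task_id)
        placement[task.task_id] = chosen.host
    return placement


executors = [Executor("node-a", 1), Executor("node-b", 2)]
tasks = [PendingTask(0, "node-a"), PendingTask(1, "node-a"), PendingTask(2, None)]
placement = schedule(tasks, executors)
print(placement)
```

Task 1 also prefers `node-a`, but that host has no free slot left after task 0, so it falls back to `node-b`. Moving the computation to the data where possible avoids transferring the data over the network.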
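As the process above shows, the main role of the RDD in this whole flow is to encapsulate the logic and to generate the tasks that are sent to Executor nodes for computation.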
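To make "encapsulating the logic" concrete, here is a toy stand-in for an RDD in plain Python. `MiniRDD`, `to_tasks` and `collect` are hypothetical and only mimic the shape of the idea: `map` records logic without running it, and tasks are generated per partition only when a result is requested.

```python
from typing import Callable, List


class MiniRDD:
    """Toy stand-in for an RDD: holds partitions plus the logic to apply to them."""

    def __init__(self, partitions: List[list], fn: Callable[[list], list] = None):
        self.partitions = partitions
        self.fn = fn or (lambda chunk: chunk)

    def map(self, f: Callable) -> "MiniRDD":
        """Wrap the previous logic in a new layer; nothing is computed yet."""
        prev = self.fn
        return MiniRDD(self.partitions, lambda chunk: [f(x) for x in prev(chunk)])

    def to_tasks(self) -> List[Callable[[], list]]:
        """Generate one zero-argument task per partition, each carrying the full logic."""
        fn = self.fn
        return [lambda chunk=chunk: fn(chunk) for chunk in self.partitions]

    def collect(self) -> list:
        """Run every task (locally here) and concatenate the per-partition results."""
        out = []
        for task in self.to_tasks():
            out.extend(task())
        return out


rdd = MiniRDD([[1, 2], [3, 4]]).map(lambda x: x + 1).map(lambda x: x * 10)
tasks = rdd.to_tasks()
print(len(tasks))      # 2, one task per partition
print(rdd.collect())   # [20, 30, 40, 50]
```

Each task carries the whole chain of wrapped functions, so a single pass over a partition applies both `map` steps.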