Getting Started with Spark (2)
2022-08-03 16:01:00 【@Autowire】
1 Dependency



Wide dependency: involves a shuffle
One partition of the parent RDD is depended on by multiple partitions of the child RDD
Narrow dependency: no shuffle
One partition of the parent RDD is depended on by only one partition of the child RDD
Summary:
Narrow dependencies: allow parallel execution and make fault tolerance cheap, since a lost child partition can be recomputed from only the parent partitions it depends on
Wide dependencies: mark where stages are divided (the stage after a shuffle has to wait for the shuffle to finish)
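The rule above can be sketched in plain Python, without Spark itself. This is only an illustration under stated assumptions: the function `classify_dependency` and the three sample mappings are hypothetical names made up for this example, not Spark APIs. Each mapping records, for every parent partition, which child partitions read from it.

```python
def classify_dependency(parent_to_children):
    """Classify a dependency from a {parent partition: set of child partitions} map.

    Hypothetical helper for illustration only; it is not part of Spark.
    """
    for children in parent_to_children.values():
        # A parent partition read by more than one child partition means shuffle.
        if len(children) > 1:
            return "wide"
    return "narrow"


# map-like: child partition i reads only parent partition i
map_like = {0: {0}, 1: {1}, 2: {2}}

# coalesce-like (no shuffle): several parents feed one child, still narrow
coalesce_like = {0: {0}, 1: {0}, 2: {1}}

# groupByKey-like: every parent partition feeds every child partition
shuffle_like = {0: {0, 1}, 1: {0, 1}, 2: {0, 1}}

for name, mapping in [("map_like", map_like),
                      ("coalesce_like", coalesce_like),
                      ("shuffle_like", shuffle_like)]:
    print(name, "->", classify_dependency(mapping))
```

Note that the test looks at the parent side: a child partition may read several parent partitions and the dependency is still narrow, as long as each parent partition is used by only one child partition.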
2 DAG && Stage


Spark's DAG: the flow chart of how a Spark task/program executes.
The beginning of a DAG: the creation of an RDD
The end of a DAG: an Action
A Spark program has as many DAGs as it has Action operations
Stage: a section of the DAG, divided at each shuffle.
A later stage can be executed only after the previous stage has finished.
The tasks within one stage can be executed in parallel without waiting for each other.
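Stage division can be sketched the same way. The following is a simplified model, not Spark's actual scheduler: it assumes a purely linear chain of operations, and the function `split_into_stages` and the `word_count` lineage are hypothetical names chosen for this example. Every operation is tagged with the kind of dependency it has on the previous one, and a new stage is opened whenever a wide dependency appears.

```python
def split_into_stages(lineage):
    """Cut a linear lineage into stages at every wide (shuffle) dependency.

    lineage: list of (operation name, dependency kind) pairs in execution order.
    Hypothetical helper for illustration; real Spark handles arbitrary DAGs.
    """
    stages = [[]]
    for name, dependency in lineage:
        # A wide dependency starts a new stage, because the shuffle must finish first.
        if dependency == "wide" and stages[-1]:
            stages.append([])
        stages[-1].append(name)
    return stages


# A word-count style lineage; the dependency tags are assumptions of this sketch.
word_count = [
    ("textFile", "source"),
    ("flatMap", "narrow"),
    ("map", "narrow"),
    ("reduceByKey", "wide"),
    ("mapValues", "narrow"),
]

stages = split_into_stages(word_count)
for index, operations in enumerate(stages):
    print(f"stage {index}: {' -> '.join(operations)}")
```

In this model the operations before `reduceByKey` end up in one stage and can be pipelined partition by partition, while the second stage starts only after the shuffle has finished.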
3 Glossary




4 Job Submission Process
