Spark entry learning-2
2022-08-03 16:01:00 【@Autowire】
1 Dependency
Wide dependency: requires a shuffle
One partition of the parent RDD is depended on by multiple partitions of the child RDD
Narrow dependency: no shuffle
One partition of the parent RDD is depended on by at most one partition of the child RDD
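Summary:
Narrow dependencies: enable pipelined parallel execution and cheap fault tolerance (a lost partition is recomputed from only the parent partitions it depends on)
Wide dependencies: mark the stage boundaries (the stage after a shuffle must wait for the shuffle to finish)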
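The rule above can be sketched in plain Python, without Spark. This is only an illustration of the definition: `classify_dependency` and the three partition mappings are hypothetical names and data made up for this example, not part of the Spark API.

```python
from collections import Counter

def classify_dependency(child_to_parents):
    """Classify a dependency from a {child partition: [parent partitions]} map.

    Hypothetical helper: the dependency is "wide" if any parent partition is
    read by more than one child partition, otherwise it is "narrow".
    """
    fan_out = Counter()
    for parents in child_to_parents.values():
        for parent in set(parents):
            fan_out[parent] += 1
    return "wide" if any(n > 1 for n in fan_out.values()) else "narrow"

# map-like: child partition i reads only parent partition i
map_like = {0: [0], 1: [1], 2: [2]}
# coalesce-like: several parents merge into one child, but each parent has one consumer
coalesce_like = {0: [0, 1], 1: [2, 3]}
# groupByKey-like: every child partition reads from every parent partition
shuffle_like = {0: [0, 1, 2], 1: [0, 1, 2]}

print(classify_dependency(map_like))       # narrow
print(classify_dependency(coalesce_like))  # narrow
print(classify_dependency(shuffle_like))   # wide
```

Note that the coalesce-like case is still narrow: what matters is how many child partitions consume one parent partition, not how many parents one child reads.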
2 DAG && Stage
Spark's DAG: the flow chart of how a Spark job/program executes
Start of the DAG: the creation of the RDD
End of the DAG: the action
A Spark program has as many DAGs as it has action operations (each action triggers one)
Stage: a segment of the DAG, split at each shuffle
A later stage can run only after the stage it depends on has finished
Tasks within the same stage can run in parallel without waiting for each other
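As a rough sketch of stage division, the snippet below cuts a linear lineage into stages at each wide dependency. It is a simplification under stated assumptions: the lineage is a straight chain rather than a general DAG, `split_into_stages` is a hypothetical helper, and the narrow/wide labels for the word-count style operators are written by hand rather than derived by Spark.

```python
def split_into_stages(lineage):
    """Split a linear lineage into stages at every wide (shuffle) dependency.

    lineage: list of (operation name, dependency type) pairs in execution order.
    Hypothetical helper for illustration; Spark itself walks the full DAG.
    """
    stages = [[]]
    for name, dependency in lineage:
        # A wide dependency means a shuffle, so the operation starts a new stage.
        if dependency == "wide" and stages[-1]:
            stages.append([])
        stages[-1].append(name)
    return stages

# Hand-labelled word-count style lineage (the labels are assumptions of this sketch).
lineage = [
    ("textFile", "none"),
    ("flatMap", "narrow"),
    ("map", "narrow"),
    ("reduceByKey", "wide"),
    ("mapValues", "narrow"),
]

for index, stage in enumerate(split_into_stages(lineage)):
    print(f"stage {index}: {' -> '.join(stage)}")
# stage 0: textFile -> flatMap -> map
# stage 1: reduceByKey -> mapValues
```

Operations inside one stage are connected only by narrow dependencies, so they can be pipelined per partition; stage 1 cannot start until stage 0 has written its shuffle output.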
3 Glossary
4 Job Submission Process