Spark Tuning: Choosing Operators Sensibly
2022-07-27 00:58:00 【A photographer who can't play is not a good programmer】
1. map and mapPartitions
1. map applies a function to each element of the RDD.
2. mapPartitions applies a function to each partition as a whole.
If you need to write data to a database, be sure to use mapPartitions, so that a connection can be opened once per partition rather than once per element.
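To make the database advice concrete, here is a minimal sketch in plain Python rather than Spark itself. It models an RDD as a list of partitions and counts how many connections each style opens. `FakeConnection`, `save_per_element`, and `save_per_partition` are hypothetical names made up for this illustration; in PySpark the corresponding calls would be `rdd.map(f)` and `rdd.mapPartitions(f)`.

```python
# Plain-Python model of an RDD: a list of partitions, each a list of elements.
# This is an illustration only, not Spark's implementation.
partitions = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]


class FakeConnection:
    """Hypothetical stand-in for a database connection; counts how often it is opened."""
    opened = 0

    def __init__(self):
        FakeConnection.opened += 1
        self.rows = []

    def write(self, row):
        self.rows.append(row)


def save_per_element(parts):
    """map-style: the function runs once per element, so one connection per element."""
    FakeConnection.opened = 0
    for part in parts:
        for x in part:
            conn = FakeConnection()
            conn.write(x)
    return FakeConnection.opened


def save_per_partition(parts):
    """mapPartitions-style: the function sees a whole partition, so one connection per partition."""
    FakeConnection.opened = 0
    for part in parts:
        conn = FakeConnection()  # opened once for the whole partition
        for x in part:
            conn.write(x)
    return FakeConnection.opened


per_element = save_per_element(partitions)      # one connection per element
per_partition = save_per_partition(partitions)  # one connection per partition
print(per_element, per_partition)
```

With nine elements spread over three partitions, the per-element style opens nine connections and the per-partition style opens three. The gap grows with the number of elements per partition.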
2. foreach and foreachPartition
These are analogous to map and mapPartitions.
The difference is that foreach is an action operator, whereas map is a transformation operator.
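The transformation/action distinction can be sketched without Spark. Python's built-in `map` is lazy, much like an RDD transformation, and nothing runs until something consumes the result, which plays the role of an action here. The names `record` and `log` are hypothetical helpers for this illustration only.

```python
# Illustration only: Python's lazy built-in map stands in for a Spark transformation,
# and consuming the iterator stands in for an action.
log = []


def record(x):
    """Remember that the function was called, then return a transformed value."""
    log.append(x)
    return x * 2


data = [1, 2, 3]

# Transformation-style: building the pipeline does not run record() at all.
lazy = map(record, data)
calls_before_action = len(log)

# Action-style: consuming the result forces the computation to happen.
doubled = list(lazy)
calls_after_action = len(log)

print(calls_before_action, calls_after_action, doubled)
```

In Spark terms, `rdd.map(f)` only describes work to be done, while `rdd.foreach(f)` or `rdd.foreachPartition(f)` triggers a job immediately.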
3. groupByKey and reduceByKey
1. groupByKey
All of the data goes through the shuffle.
2. reduceByKey
It first aggregates locally on the map side and only then shuffles the aggregated data (map-side pre-aggregation).
(This operator is preferred.)
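The effect of map-side pre-aggregation can be sketched by counting how many records would cross the shuffle in each case. This is a plain-Python model, not Spark code; `group_by_key_style` and `reduce_by_key_style` are hypothetical names, and the "shuffle" is simply a list of records leaving the map side.

```python
from collections import defaultdict

# Two map-side partitions of (key, value) pairs. Illustration only.
partitions = [
    [("a", 1), ("b", 1), ("a", 1)],
    [("a", 1), ("b", 1), ("b", 1), ("b", 1)],
]


def group_by_key_style(parts):
    """Every record crosses the shuffle; summing happens only afterwards."""
    shuffled = [pair for part in parts for pair in part]
    totals = defaultdict(int)
    for key, value in shuffled:
        totals[key] += value
    return dict(totals), len(shuffled)


def reduce_by_key_style(parts):
    """Each partition is summed locally first, so at most one record per key leaves it."""
    shuffled = []
    for part in parts:
        local = defaultdict(int)
        for key, value in part:
            local[key] += value       # map-side pre-aggregation
        shuffled.extend(local.items())
    totals = defaultdict(int)
    for key, value in shuffled:
        totals[key] += value          # final merge on the reduce side
    return dict(totals), len(shuffled)


grouped_totals, grouped_shuffled = group_by_key_style(partitions)
reduced_totals, reduced_shuffled = reduce_by_key_style(partitions)
print(grouped_totals, grouped_shuffled)
print(reduced_totals, reduced_shuffled)
```

Both styles produce the same totals, but the pre-aggregated version sends four records through the shuffle instead of seven. With many repeated keys per partition the saving becomes much larger.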
4. collect operator
It returns all of the result data to the driver as a single array, which can lead to an OOM. Use with caution!
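Why collect is risky can be sketched by comparing how much data ends up in one place. The plain-Python model below uses generators as partitions; `collect_style` and `take_style` are hypothetical names. Using a bounded fetch as the alternative is an assumption of this sketch, not something the text above prescribes; in PySpark the corresponding calls would be `rdd.collect()` and `rdd.take(n)`.

```python
from itertools import chain, islice


def make_partitions():
    """Three lazily generated partitions of 1000 elements each (illustration only)."""
    return [iter(range(i * 1000, (i + 1) * 1000)) for i in range(3)]


def collect_style(parts):
    """Materialise every element of every partition in a single list."""
    return list(chain.from_iterable(parts))


def take_style(parts, n):
    """Pull only the first n elements and stop."""
    return list(islice(chain.from_iterable(parts), n))


everything = collect_style(make_partitions())
sample = take_style(make_partitions(), 5)
print(len(everything), sample)
```

The collect-style call holds all 3000 elements in memory at once, which is the pattern that can exhaust driver memory on a real dataset, while the bounded fetch holds only five.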
5. coalesce and repartition
Both operators change the number of partitions.
1. coalesce operator
Reducing the number of partitions does not trigger a shuffle, for example data.coalesce(1).
Increasing the number of partitions requires a shuffle: with the default shuffle = false the partition count stays unchanged, so shuffle = true must be passed.
It is generally used to go from many partitions to fewer.
2. repartition operator
Under the hood, repartition calls coalesce(shuffle = true), so it always shuffles.
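The difference between the two operators can be sketched with a plain-Python model in which partitions are lists. `coalesce_style` and `repartition_style` are hypothetical names. The round-robin assignment is a simplification chosen for this sketch, since Spark decides the actual grouping itself, for instance based on data locality.

```python
def coalesce_style(parts, n):
    """Merge whole partitions into n groups without moving individual elements.

    If n is not smaller than the current count, nothing changes, which mirrors
    coalesce without a shuffle.
    """
    if n >= len(parts):
        return [list(p) for p in parts]
    merged = [[] for _ in range(n)]
    for i, part in enumerate(parts):
        merged[i % n].extend(part)  # a partition is moved as one unit
    return merged


def repartition_style(parts, n):
    """Redistribute every element across n partitions, i.e. a full shuffle."""
    flat = [x for part in parts for x in part]
    out = [[] for _ in range(n)]
    for i, x in enumerate(flat):
        out[i % n].append(x)
    return out


parts = [[1, 2], [3, 4], [5, 6], [7, 8]]
fewer = coalesce_style(parts, 2)     # whole partitions merged
same = coalesce_style(parts, 8)      # cannot grow without a shuffle
more = repartition_style(parts, 8)   # elements spread over 8 partitions
print(fewer, len(same), len(more))
```

In this model, shrinking from four partitions to two only concatenates existing partitions, asking the shuffle-free version for eight partitions leaves four, and the shuffling version spreads the elements over all eight.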