Key-based aggregation functions in Spark
2022-07-06 21:43:00 【Big data Xiaochen】
The following functions can only be called on an RDD whose every element is a key-value pair.
groupByKey
aggregateByKey
rdd = sc.parallelize([('a', 1), ('b', 1), ('a', 1), ('b', 1), ('a', 1)], 2)
During aggregation within a partition, the initial value takes part in the calculation, so it is applied once per key in each partition. During aggregation between partitions, the initial value does not take part.
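The rule about the initial value can be traced by hand with a plain-Python model. This is only a sketch, not PySpark itself: `aggregate_by_key` is a hypothetical helper, the two-partition split of the data is an assumption about how the five elements are sliced, and the initial value 10 is chosen purely to make its effect visible. The corresponding PySpark call would be `rdd.aggregateByKey(10, add, add)`.

```python
from operator import add


def aggregate_by_key(partitions, zero_value, seq_func, comb_func):
    """Plain-Python model of aggregateByKey over a list of partitions."""
    per_partition = []
    for part in partitions:
        acc = {}
        for key, value in part:
            # Within a partition: the first time a key is seen,
            # the accumulator starts from the initial value.
            if key not in acc:
                acc[key] = zero_value
            acc[key] = seq_func(acc[key], value)
        per_partition.append(acc)

    merged = {}
    for acc in per_partition:
        for key, partial in acc.items():
            # Between partitions: partial results are combined directly,
            # the initial value is not used again.
            if key in merged:
                merged[key] = comb_func(merged[key], partial)
            else:
                merged[key] = partial
    return merged


# Same data as the RDD above, with an assumed split into 2 partitions.
partitions = [
    [('a', 1), ('b', 1)],
    [('a', 1), ('b', 1), ('a', 1)],
]

result = aggregate_by_key(partitions, 10, add, add)
print(result)  # {'a': 23, 'b': 22}
```

Each key appears in both partitions, so the initial value 10 is added twice per key: `'a'` is 3 + 20 and `'b'` is 2 + 20. With an initial value of 0 the result would be the plain counts.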
foldByKey
foldByKey is a simplified form of aggregateByKey.
When the within-partition and between-partition aggregation functions of aggregateByKey have the same logic, they can be merged into a single function, which gives foldByKey.
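A plain-Python model of this simplification is sketched below. It is not PySpark: `fold_by_key` is a hypothetical helper and the two-partition split is an assumption. The single function `func` plays both roles, within and between partitions. The corresponding PySpark call would be `rdd.foldByKey(0, add)`.

```python
from operator import add


def fold_by_key(partitions, zero_value, func):
    """Plain-Python model of foldByKey: one function for both stages."""
    merged = {}
    for part in partitions:
        acc = {}
        for key, value in part:
            # Within a partition, start each key from the initial value.
            acc[key] = func(acc.get(key, zero_value), value)
        for key, partial in acc.items():
            # Between partitions, reuse the same function, without the initial value.
            merged[key] = func(merged[key], partial) if key in merged else partial
    return merged


# Same data as the RDD defined earlier, with an assumed split into 2 partitions.
partitions = [
    [('a', 1), ('b', 1)],
    [('a', 1), ('b', 1), ('a', 1)],
]

counts = fold_by_key(partitions, 0, add)
shifted = fold_by_key(partitions, 10, add)
print(counts)   # {'a': 3, 'b': 2}
print(shifted)  # {'a': 23, 'b': 22}
```

A neutral initial value such as 0 for addition leaves the result unchanged. Any other value is counted once per key per partition, as the second call shows.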
reduceByKey
reduceByKey is a simplified form of foldByKey.
When the initial value of foldByKey carries no meaning, it can be omitted, which gives reduceByKey.
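As a rough illustration, the plain-Python model below drops the initial value entirely: the first value seen for a key seeds its accumulator. It is not PySpark, `reduce_by_key` is a hypothetical helper, and the two-partition split is an assumption. The corresponding PySpark call would be `rdd.reduceByKey(add)`.

```python
from operator import add


def reduce_by_key(partitions, func):
    """Plain-Python model of reduceByKey: no initial value at all."""
    merged = {}
    for part in partitions:
        acc = {}
        for key, value in part:
            # The first value for a key becomes the accumulator as-is.
            acc[key] = func(acc[key], value) if key in acc else value
        for key, partial in acc.items():
            # Partial results from different partitions use the same function.
            merged[key] = func(merged[key], partial) if key in merged else partial
    return merged


# Same data as the RDD defined earlier, with an assumed split into 2 partitions.
partitions = [
    [('a', 1), ('b', 1)],
    [('a', 1), ('b', 1), ('a', 1)],
]

counts = reduce_by_key(partitions, add)
print(counts)  # {'a': 3, 'b': 2}
```

Because nothing is injected per partition, the result does not depend on how the data is split, provided the function is associative and commutative.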