Aggregate function with key in spark
2022-07-06 21:43:00 【Big data Xiaochen】
The following functions can only be called on an RDD in which every element is a key-value pair.
groupByKey
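As a rough illustration of what groupByKey produces, here is a plain-Python model that needs no Spark installation. The helper name group_by_key is hypothetical. In PySpark the corresponding call would be rdd.groupByKey(), which returns an iterable of values per key, and Spark does not promise the order of values inside a group.

```python
from collections import defaultdict

def group_by_key(pairs):
    """Plain-Python model of groupByKey: collect every value under its key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)

pairs = [('a', 1), ('b', 1), ('a', 1), ('b', 1), ('a', 1)]
grouped = group_by_key(pairs)
print(grouped)  # {'a': [1, 1, 1], 'b': [1, 1]}
```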
aggregateByKey
rdd = sc.parallelize([('a', 1), ('b', 1), ('a', 1), ('b', 1), ('a', 1)], 2)
With aggregateByKey, the initial value takes part in the aggregation within each partition, but it does not take part in the aggregation between partitions.
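The effect of the initial value can be sketched with a plain-Python model that runs without Spark. The helper aggregate_by_key is hypothetical, and the way the five elements are split across the two partitions is an assumption made for illustration. In PySpark the corresponding call would be rdd.aggregateByKey(10, seq_func, comb_func).

```python
def aggregate_by_key(partitions, zero_value, seq_func, comb_func):
    """Plain-Python model of aggregateByKey semantics (not Spark itself)."""
    per_partition = []
    for part in partitions:
        acc = {}
        for key, value in part:
            # Within a partition, the first value of each key starts from zero_value.
            start = acc[key] if key in acc else zero_value
            acc[key] = seq_func(start, value)
        per_partition.append(acc)

    merged = {}
    for acc in per_partition:
        for key, value in acc.items():
            # Between partitions, partial results are merged without zero_value.
            merged[key] = comb_func(merged[key], value) if key in merged else value
    return merged

# Assumed split of the five pairs into 2 partitions.
partitions = [
    [('a', 1), ('b', 1)],
    [('a', 1), ('b', 1), ('a', 1)],
]
result = aggregate_by_key(partitions, 10,
                          lambda acc, v: acc + v,   # within a partition
                          lambda x, y: x + y)       # between partitions
print(result)  # {'a': 23, 'b': 22}
```

Under this assumed split, key 'a' appears in both partitions, so the initial value 10 is added twice (once per partition): 3 + 2 * 10 = 23. It is never added again during the merge step.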
foldByKey
foldByKey is a simplification of aggregateByKey.
When aggregateByKey uses the same aggregation logic within partitions and between partitions, the two functions can be collapsed into one, which gives foldByKey.
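A minimal plain-Python sketch of this simplification, again without Spark. The helper fold_by_key is hypothetical and the partition split is assumed. It applies one function both within and between partitions, which is what rdd.foldByKey(zero_value, func) expresses in PySpark.

```python
def fold_by_key(partitions, zero_value, func):
    """Plain-Python model of foldByKey: one function for both aggregation stages."""
    per_partition = []
    for part in partitions:
        acc = {}
        for key, value in part:
            # The initial value is used once per key per partition.
            start = acc[key] if key in acc else zero_value
            acc[key] = func(start, value)
        per_partition.append(acc)

    merged = {}
    for acc in per_partition:
        for key, value in acc.items():
            # The same function merges the per-partition results.
            merged[key] = func(merged[key], value) if key in merged else value
    return merged

# Assumed split of the five pairs into 2 partitions.
partitions = [
    [('a', 1), ('b', 1)],
    [('a', 1), ('b', 1), ('a', 1)],
]
folded_with_10 = fold_by_key(partitions, 10, lambda x, y: x + y)
folded_with_0 = fold_by_key(partitions, 0, lambda x, y: x + y)
print(folded_with_10)  # {'a': 23, 'b': 22}
print(folded_with_0)   # {'a': 3, 'b': 2}
```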
reduceByKey
reduceByKey is a simplification of foldByKey.
When the initial value of foldByKey carries no meaning, it can be omitted, which gives reduceByKey.
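The last simplification can be modelled the same way in plain Python. The helper reduce_by_key is hypothetical and the partition split is assumed. With no initial value, the first value seen for a key simply becomes the starting accumulator, as in rdd.reduceByKey(func) in PySpark.

```python
def reduce_by_key(partitions, func):
    """Plain-Python model of reduceByKey: one function, no initial value."""
    per_partition = []
    for part in partitions:
        acc = {}
        for key, value in part:
            # The first value of a key is taken as-is; later values are combined.
            acc[key] = func(acc[key], value) if key in acc else value
        per_partition.append(acc)

    merged = {}
    for acc in per_partition:
        for key, value in acc.items():
            merged[key] = func(merged[key], value) if key in merged else value
    return merged

# Assumed split of the five pairs into 2 partitions.
partitions = [
    [('a', 1), ('b', 1)],
    [('a', 1), ('b', 1), ('a', 1)],
]
reduced = reduce_by_key(partitions, lambda x, y: x + y)
print(reduced)  # {'a': 3, 'b': 2}
```

For addition, this matches foldByKey with an initial value of 0, which is why the initial value can be dropped in that case.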