Spark Tuning (II): UDF reduces joins and judgments
2022-07-06 23:09:00 【InfoQ】
1. Cause
2. Optimization
2.1 Rewrite the program in Java
2.2 Use a UDF
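import java.util.Map;
import java.util.TreeMap;
import org.apache.spark.broadcast.Broadcast;
import org.apache.spark.sql.api.java.UDF2;

// Looks up the tag for (id, time) in a broadcast map instead of joining.
public class UDF implements UDF2<Long, Long, Long> {
    // id -> (boundary time -> tag)
    Map<Long, TreeMap<Long, Long>> map;

    public UDF(Broadcast<Map<Long, TreeMap<Long, Long>>> bmap) {
        this.map = bmap.getValue();
    }

    @Override
    public Long call(Long id, Long time) throws Exception {
        if (map.containsKey(id)) {
            // Nearest boundary at or before `time`, and at or after `time`.
            Map.Entry<Long, Long> a = map.get(id).floorEntry(time);
            Map.Entry<Long, Long> b = map.get(id).ceilingEntry(time);
            if (null != a && null != b) {
                // Both boundaries carry the same tag: `time` lies inside that interval.
                if (a.getValue().equals(b.getValue())) {
                    return a.getValue();
                }
            }
        }
        // No matching interval for this id and time.
        return -1L;
    }
}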
tablea join tableb
on tablea.id=tableb.id and
tablea.time >= tableb.timeStart and
tablea.time <= tableb.timeEnd
String udfMethod = "structureMap";
// The UDF returns Long, so the registered return type must be LongType.
spark.udf().register(udfMethod, new UDF(broadcast1), DataTypes.LongType);
select id,time,structureMap(id,time) as tag from tablea
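The article does not show how the broadcast map is built. Below is a minimal sketch of one layout that would make the floorEntry/ceilingEntry check in the UDF work, written without Spark so it can be run on its own. It assumes each row of tableb contributes two entries for its id, one keyed by timeStart and one keyed by timeEnd, both holding the same tag. The class name IntervalLookupSketch, the helper addInterval, and the sample values are hypothetical and do not come from the original program.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class IntervalLookupSketch {

    // Hypothetical helper: record one [timeStart, timeEnd] interval for an id
    // by storing the same tag under both boundary keys.
    static void addInterval(Map<Long, TreeMap<Long, Long>> map,
                            long id, long timeStart, long timeEnd, long tag) {
        TreeMap<Long, Long> boundaries = map.computeIfAbsent(id, k -> new TreeMap<>());
        boundaries.put(timeStart, tag);
        boundaries.put(timeEnd, tag);
    }

    // Same check as the UDF's call method: the nearest boundary on each side
    // of `time` must carry the same tag, otherwise -1 is returned.
    static long lookup(Map<Long, TreeMap<Long, Long>> map, long id, long time) {
        TreeMap<Long, Long> boundaries = map.get(id);
        if (boundaries == null) {
            return -1L;
        }
        Map.Entry<Long, Long> floor = boundaries.floorEntry(time);
        Map.Entry<Long, Long> ceiling = boundaries.ceilingEntry(time);
        if (floor != null && ceiling != null
                && floor.getValue().equals(ceiling.getValue())) {
            return floor.getValue();
        }
        return -1L;
    }

    public static void main(String[] args) {
        Map<Long, TreeMap<Long, Long>> map = new HashMap<>();
        addInterval(map, 1L, 100L, 200L, 7L); // id 1, interval [100, 200], tag 7
        addInterval(map, 1L, 300L, 400L, 8L); // id 1, interval [300, 400], tag 8

        System.out.println(lookup(map, 1L, 150L)); // inside the first interval
        System.out.println(lookup(map, 1L, 250L)); // in the gap between intervals
        System.out.println(lookup(map, 2L, 150L)); // unknown id
    }
}
```

Under this layout the check is only reliable if the intervals of one id do not overlap and neighbouring intervals carry different tags. If two consecutive intervals shared a tag, a time in the gap between them would also match.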
Conclusion