Spark Tuning (Part 2): Using a UDF to Reduce JOINs and Conditionals
2022-07-06 15:38:00 【InfoQ】
1. Background
2. Starting the optimization
2.1 Rewriting the program in Java
2.2 Using a UDF
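import java.util.Map;
import java.util.TreeMap;
import org.apache.spark.broadcast.Broadcast;
import org.apache.spark.sql.api.java.UDF2;

// Looks up the tag for (id, time) in a broadcast map: id -> (boundary time -> tag).
public class UDF implements UDF2<Long, Long, Long> {
    // Keep the broadcast handle, not the map itself, so the map is not serialized with the UDF.
    private final Broadcast<Map<Long, TreeMap<Long, Long>>> bmap;

    public UDF(Broadcast<Map<Long, TreeMap<Long, Long>>> bmap) {
        this.bmap = bmap;
    }

    @Override
    public Long call(Long id, Long time) throws Exception {
        TreeMap<Long, Long> ranges = bmap.getValue().get(id);
        if (ranges != null && time != null) {
            // Nearest boundary at or before the time, and nearest at or after it.
            Map.Entry<Long, Long> a = ranges.floorEntry(time);
            Map.Entry<Long, Long> b = ranges.ceilingEntry(time);
            // Both boundaries carry the same tag when the time falls inside one range.
            if (a != null && b != null && a.getValue().equals(b.getValue())) {
                return a.getValue();
            }
        }
        return -1L; // no matching range
    }
}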
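The floor/ceiling test in the UDF only works if the broadcast map has a particular layout, which the post does not show. The following is a minimal sketch of one layout that fits, using plain Java collections and no Spark. It assumes each range row is {id, timeStart, timeEnd, tag}, that ranges for the same id do not overlap, and that each range has its own distinct tag; the class name RangeLookup and the methods build and lookup are hypothetical names introduced for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class RangeLookup {
    /** Builds id -> (boundary time -> tag) from rows of {id, timeStart, timeEnd, tag}. */
    public static Map<Long, TreeMap<Long, Long>> build(long[][] rows) {
        Map<Long, TreeMap<Long, Long>> map = new HashMap<>();
        for (long[] r : rows) {
            TreeMap<Long, Long> ranges = map.computeIfAbsent(r[0], k -> new TreeMap<>());
            ranges.put(r[1], r[3]); // start boundary -> tag
            ranges.put(r[2], r[3]); // end boundary -> tag
        }
        return map;
    }

    /** Same floor/ceiling test as the UDF: returns the tag, or -1 when no range contains time. */
    public static long lookup(Map<Long, TreeMap<Long, Long>> map, long id, long time) {
        TreeMap<Long, Long> ranges = map.get(id);
        if (ranges == null) {
            return -1L; // unknown id
        }
        Map.Entry<Long, Long> a = ranges.floorEntry(time);   // boundary at or before time
        Map.Entry<Long, Long> b = ranges.ceilingEntry(time); // boundary at or after time
        if (a != null && b != null && a.getValue().equals(b.getValue())) {
            return a.getValue();
        }
        return -1L; // before the first range, after the last, or in a gap
    }
}
```

With the ranges {1, 100, 200, 7} and {1, 300, 400, 8}, a lookup for id 1 at time 150 finds tag 7 on both sides and returns it, while time 250 sits between a boundary tagged 7 and one tagged 8 and is rejected. The distinct-tag assumption matters: if two separate ranges for the same id shared a tag, a time in the gap between them would see equal tags on both sides and be matched incorrectly.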
-- The range JOIN that the UDF lookup is meant to replace
tablea join tableb
on tablea.id = tableb.id and
tablea.time >= tableb.timeStart and
tablea.time <= tableb.timeEnd
String udfMethod = "structureMap";
// The declared return type must match the UDF's Long result.
spark.udf().register(udfMethod, new UDF(broadcast1), DataTypes.LongType);
select id,time,structureMap(id,time) as tag from tablea
Conclusion