Spark Tuning (2): Using a UDF to Reduce JOINs and Conditionals
2022-07-06 15:38:00 【InfoQ】
1. Background
2. Starting the optimization
2.1 Rewriting the program in Java
2.2 Using a UDF
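import java.util.Map;
import java.util.TreeMap;
import org.apache.spark.broadcast.Broadcast;
import org.apache.spark.sql.api.java.UDF2;

// Looks up a value for (id, time) in a broadcast map instead of joining against tableb.
public class UDF implements UDF2<Long, Long, Long> {
    // id -> sorted map of time -> value
    private final Map<Long, TreeMap<Long, Long>> map;

    public UDF(Broadcast<Map<Long, TreeMap<Long, Long>>> bmap) {
        this.map = bmap.getValue();
    }

    @Override
    public Long call(Long id, Long time) throws Exception {
        if (map.containsKey(id)) {
            // Entry with the greatest key <= time, and entry with the least key >= time.
            Map.Entry<Long, Long> a = map.get(id).floorEntry(time);
            Map.Entry<Long, Long> b = map.get(id).ceilingEntry(time);
            // Return the value only when both neighbouring entries agree.
            if (null != a && null != b) {
                if (a.getValue().equals(b.getValue())) {
                    return a.getValue();
                }
            }
        }
        return -1L; // no match for this id and time
    }
}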
tablea join tableb
on tablea.id=tableb.id and
tablea.time >= tableb.timeStart and
tablea.time <= tableb.timeEnd
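The UDF's floorEntry/ceilingEntry check only stands in for the range join above if the map is built in a compatible way. The article does not show how the map is populated, so the following is a minimal sketch under assumptions: each row of tableb is taken to contribute its timeStart and timeEnd as two keys that map to the same value, intervals for one id do not overlap or share endpoints, and neighbouring intervals carry different values. The class name IntervalLookup, the helper addInterval, and the sample numbers are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class IntervalLookup {

    // Hypothetical helper: stores both endpoints of [timeStart, timeEnd] under the same tag.
    static void addInterval(Map<Long, TreeMap<Long, Long>> index,
                            long id, long timeStart, long timeEnd, long tag) {
        TreeMap<Long, Long> boundaries = index.computeIfAbsent(id, k -> new TreeMap<>());
        boundaries.put(timeStart, tag);
        boundaries.put(timeEnd, tag);
    }

    // Same decision rule as the UDF: the nearest keys on both sides must carry the same tag.
    static long lookup(Map<Long, TreeMap<Long, Long>> index, long id, long time) {
        TreeMap<Long, Long> boundaries = index.get(id);
        if (boundaries == null) {
            return -1L; // unknown id
        }
        Map.Entry<Long, Long> floor = boundaries.floorEntry(time);
        Map.Entry<Long, Long> ceiling = boundaries.ceilingEntry(time);
        if (floor != null && ceiling != null && floor.getValue().equals(ceiling.getValue())) {
            return floor.getValue();
        }
        return -1L; // time falls outside every interval
    }

    public static void main(String[] args) {
        Map<Long, TreeMap<Long, Long>> index = new HashMap<>();
        addInterval(index, 1L, 100L, 200L, 10L);
        addInterval(index, 1L, 300L, 400L, 20L);

        System.out.println(lookup(index, 1L, 150L)); // inside [100, 200]: prints 10
        System.out.println(lookup(index, 1L, 250L)); // gap between intervals: prints -1
        System.out.println(lookup(index, 2L, 150L)); // unknown id: prints -1
    }
}
```

Under these assumptions each lookup costs two O(log n) tree searches per row and needs no shuffle. If two adjacent intervals shared the same value, a time in the gap between them would be matched incorrectly, so that case would need a different encoding.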
String udfMethod = "structureMap";
// The UDF returns a Long, so it is registered with LongType.
spark.udf().register(udfMethod, new UDF(broadcast1), DataTypes.LongType);
select id,time,structureMap(id,time) as tag from tablea
Conclusion