Spark Tuning (II): UDF reduces joins and judgments
2022-07-06 23:09:00 【InfoQ】
1. Cause
2. Starting the optimization
2.1 Rewrite the program in Java
2.2 Use a UDF
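import java.util.Map;
import java.util.TreeMap;
import org.apache.spark.broadcast.Broadcast;
import org.apache.spark.sql.api.java.UDF2;

// Resolves the value for (id, time) from a broadcast lookup map instead of a join.
public class UDF implements UDF2<Long, Long, Long> {
    // id -> sorted map of time -> value
    private final Map<Long, TreeMap<Long, Long>> map;

    public UDF(Broadcast<Map<Long, TreeMap<Long, Long>>> bmap) {
        this.map = bmap.getValue();
    }

    @Override
    public Long call(Long id, Long time) throws Exception {
        TreeMap<Long, Long> entries = map.get(id);
        if (entries != null) {
            // Nearest entries at or before, and at or after, the given time.
            Map.Entry<Long, Long> a = entries.floorEntry(time);
            Map.Entry<Long, Long> b = entries.ceilingEntry(time);
            // Both neighbours carry the same value: return it.
            if (null != a && null != b && a.getValue().equals(b.getValue())) {
                return a.getValue();
            }
        }
        // Unknown id, or no matching pair of entries around this time.
        return -1L;
    }
}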
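The lookup in `call` seems to rely on each `TreeMap` holding both the start and the end time of every interval as keys, each mapped to the same value, so that a time inside an interval finds equal values on both sides. The article does not state this layout, so it is an inference from the code. A minimal plain-Java sketch under that assumption, with no Spark dependency; the class name `IntervalLookupDemo` and the sample data are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class IntervalLookupDemo {
    // Assumed layout: id -> (interval boundary time -> tag).
    // Each interval contributes two keys, its start and its end, with the same tag.
    public static Map<Long, TreeMap<Long, Long>> buildSample() {
        TreeMap<Long, Long> boundaries = new TreeMap<>();
        boundaries.put(100L, 7L); // start of the interval tagged 7
        boundaries.put(200L, 7L); // end of the interval tagged 7
        boundaries.put(300L, 8L); // start of the interval tagged 8
        boundaries.put(400L, 8L); // end of the interval tagged 8
        Map<Long, TreeMap<Long, Long>> map = new HashMap<>();
        map.put(1L, boundaries);
        return map;
    }

    // Same logic as the UDF's call method, without the Spark wrapper.
    public static long lookup(Map<Long, TreeMap<Long, Long>> map, long id, long time) {
        TreeMap<Long, Long> boundaries = map.get(id);
        if (boundaries == null) {
            return -1L;
        }
        Map.Entry<Long, Long> floor = boundaries.floorEntry(time);
        Map.Entry<Long, Long> ceiling = boundaries.ceilingEntry(time);
        if (floor != null && ceiling != null && floor.getValue().equals(ceiling.getValue())) {
            return floor.getValue();
        }
        return -1L;
    }

    public static void main(String[] args) {
        Map<Long, TreeMap<Long, Long>> map = buildSample();
        System.out.println(lookup(map, 1L, 150L)); // inside [100, 200]
        System.out.println(lookup(map, 1L, 250L)); // in the gap between intervals
        System.out.println(lookup(map, 2L, 150L)); // id not present
    }
}
```

The trick only gives correct answers when the intervals of one id do not overlap and neighbouring intervals carry different tags; otherwise a time in a gap would see equal values on both sides and be matched by mistake.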
tablea join tableb
on tablea.id=tableb.id and
tablea.time >= tableb.timeStart and
tablea.time <= tableb.timeEnd
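Replacing this range join with a UDF means the rows of tableb have to be turned into an in-memory lookup structure first. The article does not show that step, so the following is only a sketch of one way it could be done in plain Java. It assumes tableb is small enough to collect to the driver and that each row carries a numeric tag column; the `Interval` class, its `tag` field, and the class name `IntervalMapBuilder` are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class IntervalMapBuilder {
    // Hypothetical stand-in for one collected row of tableb.
    public static final class Interval {
        final long id;
        final long timeStart;
        final long timeEnd;
        final long tag; // assumed column, not shown in the article

        public Interval(long id, long timeStart, long timeEnd, long tag) {
            this.id = id;
            this.timeStart = timeStart;
            this.timeEnd = timeEnd;
            this.tag = tag;
        }
    }

    // Groups rows by id and stores both interval boundaries under the same tag.
    public static Map<Long, TreeMap<Long, Long>> build(List<Interval> rows) {
        Map<Long, TreeMap<Long, Long>> map = new HashMap<>();
        for (Interval row : rows) {
            TreeMap<Long, Long> perId = map.computeIfAbsent(row.id, k -> new TreeMap<>());
            perId.put(row.timeStart, row.tag);
            perId.put(row.timeEnd, row.tag);
        }
        return map;
    }

    public static void main(String[] args) {
        List<Interval> rows = new ArrayList<>();
        rows.add(new Interval(1L, 100L, 200L, 7L));
        rows.add(new Interval(1L, 300L, 400L, 8L));
        rows.add(new Interval(2L, 50L, 60L, 9L));
        System.out.println(build(rows));
    }
}
```

In a Spark job the resulting map would then be wrapped in a broadcast variable on the driver and handed to the UDF's constructor.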
String udfMethod = "structureMap";
// The UDF returns a Long, so the registered return type must be LongType.
spark.udf().register(udfMethod, new UDF(broadcast1), DataTypes.LongType);
select id,time,structureMap(id,time) as tag from tablea
Conclusion