Small Files: Special Topic
2022-07-03 10:47:00 【Samooyou】
Hive small-file merge parameters:
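The parameter list itself did not survive in this copy of the post. As a rough sketch only, the block below collects the Hive merge properties that are commonly used for this purpose and renders them as `SET` statements. The property names are standard Hive settings, but the selection and the values are assumptions, not the author's configuration, and `render_set_statements` is a hypothetical helper.

```python
# Commonly cited Hive merge properties. The values are illustrative
# assumptions, not settings taken from the original post.
HIVE_MERGE_SETTINGS = {
    # Merge small files at the end of a map-only job.
    "hive.merge.mapfiles": "true",
    # Merge small files at the end of a map-reduce job.
    "hive.merge.mapredfiles": "true",
    # Desired size (bytes) of the merged files.
    "hive.merge.size.per.task": str(256 * 1000 * 1000),
    # Start an extra merge job when the average output file size
    # (bytes) falls below this value.
    "hive.merge.smallfiles.avgsize": str(16 * 1000 * 1000),
}


def render_set_statements(settings):
    """Render each property as a HiveQL `SET key=value;` line."""
    return [f"SET {key}={value};" for key, value in settings.items()]


if __name__ == "__main__":
    for line in render_set_statements(HIVE_MERGE_SETTINGS):
        print(line)
```

The printed lines can be pasted at the top of a Hive session or script; which values make sense depends on the cluster's block size and workload.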
Spark small-file merging approaches:
Adopt the approach from the community issue SPARK-24940 and merge small files by means of SQL hints.
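To make the hint syntax concrete, here is a minimal sketch that builds a query string carrying one of the two partitioning hints introduced by SPARK-24940, `COALESCE` and `REPARTITION`. The helper `with_partition_hint` and the table names are hypothetical; only the hint syntax comes from Spark.

```python
def with_partition_hint(select_body, num_partitions, shuffle=False):
    """Build a SELECT statement with a partitioning hint.

    select_body is everything that follows the SELECT keyword.
    COALESCE reduces the partition count without a shuffle;
    REPARTITION shuffles the data into exactly num_partitions.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    hint = "REPARTITION" if shuffle else "COALESCE"
    return f"SELECT /*+ {hint}({num_partitions}) */ {select_body}"


if __name__ == "__main__":
    # Hypothetical tables: write the result as roughly 10 files.
    query = "INSERT OVERWRITE TABLE target " + with_partition_hint(
        "* FROM source", 10
    )
    print(query)
```

Since each output partition is typically written as one file, the hint's argument acts as an upper bound on the number of result files for that statement.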
Add automatic merging of small result files.
- User side: when spark.sql.shuffle.partitions is set relatively high and the result data set is relatively small, many small files are produced. A new parameter, spark.sql.result.partitions, controls the number of final output files.
- Platform side: small-file detection is triggered when data is written to disk. In InsertIntoHiveTable, if small-file merging is enabled and the average file size is below the threshold, the merge is performed, and the loadTable or loadPartition operation runs after the merge. (Enabled by default on the platform side.)
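The platform-side check described above can be sketched as two small functions: one decides whether a merge is needed from the average file size, the other estimates how many files to write afterwards. This is an illustration under assumptions, not the platform's actual code; the function names, the threshold, and the target file size are all hypothetical.

```python
import math

MB = 1024 * 1024


def should_merge(file_sizes, avg_size_threshold):
    """Return True when the average file size is below the threshold.

    A directory with fewer than two files has nothing to merge.
    """
    if len(file_sizes) < 2:
        return False
    return sum(file_sizes) / len(file_sizes) < avg_size_threshold


def target_file_count(file_sizes, target_file_size):
    """Estimate how many files to write so each is near target_file_size."""
    return max(1, math.ceil(sum(file_sizes) / target_file_size))


if __name__ == "__main__":
    sizes = [1 * MB] * 200  # 200 result files of 1 MB each
    if should_merge(sizes, avg_size_threshold=16 * MB):
        print("merge into", target_file_count(sizes, 128 * MB), "files")
```

In this picture the merge step would run before loadTable or loadPartition, so the table only ever sees the merged files.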
Set the number of shuffle partitions dynamically.
The Spark Adaptive Execution feature lets the stage downstream of a shuffle automatically adjust its task count according to the volume of shuffle data produced by the upstream stage; that is, during shuffle read, multiple small partitions are handed ...
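One way to picture this adjustment is a greedy pass that packs adjacent shuffle partitions into groups until a target size is reached, with each group read by one downstream task. The sketch below is a simplified model, not Spark's implementation; `coalesce_partitions` and the sizes are hypothetical. In Spark 3.x the corresponding behavior is governed by `spark.sql.adaptive.enabled`, `spark.sql.adaptive.coalescePartitions.enabled`, and `spark.sql.adaptive.advisoryPartitionSizeInBytes`.

```python
def coalesce_partitions(partition_sizes, target_size):
    """Group adjacent partition indices so each group stays near target_size.

    A partition larger than target_size ends up in a group of its own.
    Returns a list of groups; each group is a list of partition indices
    that a single downstream task would read.
    """
    groups = []
    current = []
    current_size = 0
    for index, size in enumerate(partition_sizes):
        # Close the current group if adding this partition would overflow it.
        if current and current_size + size > target_size:
            groups.append(current)
            current, current_size = [], 0
        current.append(index)
        current_size += size
    if current:
        groups.append(current)
    return groups


if __name__ == "__main__":
    sizes_mb = [10, 20, 30, 40, 100, 5, 5]
    print(coalesce_partitions(sizes_mb, target_size=64))
```

With seven upstream partitions and a 64 MB target, the model ends up with four downstream tasks instead of seven, which also means fewer output files when that stage writes the result.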