Solve the problem of too many small files
2022-07-06 09:34:00 【Prism 7】
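1. Use Hive's built-in CONCATENATE command to merge small files automatically
Usage: run ALTER TABLE ... CONCATENATE on the table, or on a single partition of it. The command applies to tables stored as RCFile or ORC.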
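A minimal sketch of the command, assuming a hypothetical ORC table named logs that is partitioned by a dt column, and a hypothetical unpartitioned table named logs_flat (none of these names come from the original article):

```sql
-- Hypothetical table `logs`, partitioned by `dt`.
-- CONCATENATE is supported for RCFile and ORC tables.
-- Merge the small files of a single partition:
ALTER TABLE logs PARTITION (dt = '2022-07-06') CONCATENATE;

-- For an unpartitioned table (hypothetical name), omit the PARTITION clause:
ALTER TABLE logs_flat CONCATENATE;
```

The command gives no direct control over how many files remain, so in practice it may need to be run more than once.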
2. Adjust parameters to reduce the number of map tasks
Merge small files before the map phase runs, so that each mapper receives several files combined into one split as its input. Then adjust the minimum split size.
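The settings involved can be sketched as follows. The property names are the older mapred.* forms commonly used with Hive; the byte values are illustrative assumptions, not recommendations from the article.

```sql
-- Combine small files into larger splits before the map phase.
SET hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;

-- Upper bound on the size of one combined split, in bytes (assumed value: about 256 MB).
SET mapred.max.split.size=256000000;

-- Minimum split size per node and per rack, in bytes (assumed values: about 100 MB).
-- Leftover data on a node below the per-node minimum is pooled with data on the same rack;
-- leftover data below the per-rack minimum is pooled across racks.
SET mapred.min.split.size.per.node=100000000;
SET mapred.min.split.size.per.rack=100000000;
```

Newer Hadoop releases name these properties mapreduce.input.fileinputformat.split.maxsize, mapreduce.input.fileinputformat.split.minsize.per.node, and mapreduce.input.fileinputformat.split.minsize.per.rack.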
3. Reduce the number of reducers
The number of reducers determines the number of output files, so adjusting the reducer count controls the number of files in a Hive table.
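One way to apply this, sketched under assumptions: logs and logs_merged are hypothetical unpartitioned tables with the same schema, and the numeric values are placeholders.

```sql
-- Fix the reducer count directly (assumed value: 10).
-- When this is set to a positive number, it takes precedence over Hive's own estimate.
SET mapred.reduce.tasks=10;

-- Alternative: leave mapred.reduce.tasks at -1 and let Hive derive the count from input size.
-- A larger bytes-per-reducer value means fewer reducers (assumed value: about 5 GB).
-- SET hive.exec.reducers.bytes.per.reducer=5120000000;
-- SET hive.exec.reducers.max=100;

-- DISTRIBUTE BY forces a reduce stage, so the reducer count bounds the number of output files.
-- `logs` and `logs_merged` are hypothetical tables.
INSERT OVERWRITE TABLE logs_merged
SELECT * FROM logs
DISTRIBUTE BY rand();
```

Distributing by rand() spreads rows roughly evenly across reducers, which keeps the output files similar in size.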
4. Use HAR files
Use Hadoop's archive tool to archive small files. It can pack many small files into a single HAR file.
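Within Hive, the Hadoop archive mechanism is available for partitions of a partitioned table through ALTER TABLE ... ARCHIVE. A sketch, assuming a hypothetical table named logs partitioned by dt; the har.partfile.size value is only an example:

```sql
-- Enable archiving in Hive.
SET hive.archive.enabled=true;
-- Allow Hive to set the parent directory when it creates the archive.
SET hive.archive.har.parentdir.settable=true;
-- Size of the files that make up the archive, in bytes (example value: 1 TB).
SET har.partfile.size=1099511627776;

-- Pack the files of one partition of the hypothetical table `logs` into a HAR file.
ALTER TABLE logs ARCHIVE PARTITION (dt = '2022-07-06');

-- Restore the partition to ordinary files, for example before writing to it again.
ALTER TABLE logs UNARCHIVE PARTITION (dt = '2022-07-06');
```

An archived partition can still be queried, but it has to be unarchived before it is overwritten.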
5. JVM reuse
By default, Hadoop usually forks a new JVM to run each map and reduce task. JVM startup can then be a significant cost, especially for a job made up of hundreds or thousands of tasks. JVM reuse lets a JVM instance be reused up to N times within the same job.
The downside of this feature is that, with JVM reuse turned on, the task slots in use stay occupied so that they can be reused, and they are not released until the job completes.
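A sketch of the setting, assuming the classic (MRv1) property name and an illustrative value of 10 for N:

```sql
-- Let each JVM run up to 10 tasks of the same job before it exits (assumed value).
-- The default is 1, which means no reuse; -1 removes the limit.
SET mapred.job.reuse.jvm.num.tasks=10;
```

Later Hadoop releases rename this property to mapreduce.job.jvm.numtasks, and on YARN-based MapReduce the setting may have no effect, so it is worth checking against the cluster's Hadoop version.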