Solve the problem of too many small files
2022-07-06 09:34:00 【Prism 7】
1. Use Hive's built-in concatenate command to merge small files automatically
Usage:
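A minimal sketch of the command, assuming a hypothetical table `logs` partitioned by `dt` (both names are illustrative and not from the original article). Note that Hive supports `CONCATENATE` only for tables stored as RCFile or ORC.

```sql
-- Merge the small files of a non-partitioned table (hypothetical table name).
ALTER TABLE logs_unpartitioned CONCATENATE;

-- Merge the small files of a single partition (hypothetical table and partition).
ALTER TABLE logs PARTITION (dt='2022-07-06') CONCATENATE;
```

A single run does not always merge everything into one file; if many small files remain, the command can be run again.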
2. Adjust parameters to reduce the number of map tasks
Merge small files before the map phase, so that each mapper receives several files combined into a single split as its input. The minimum size of a split can also be adjusted.
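The settings involved might look like the following sketch. The property names are real Hive/Hadoop settings (the `mapred.*` names are the older aliases of `mapreduce.input.fileinputformat.split.*`), but the byte values are illustrative assumptions and should be tuned to the cluster's block size.

```sql
-- Combine small files into larger splits before the map phase.
SET hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;

-- Upper bound on the size of one split, in bytes (illustrative: about 256 MB).
SET mapred.max.split.size=256000000;

-- Minimum split size per node and per rack, in bytes (illustrative: about 100 MB).
-- Leftover data smaller than these thresholds is combined across nodes or racks.
SET mapred.min.split.size.per.node=100000000;
SET mapred.min.split.size.per.rack=100000000;
```

Larger splits mean fewer mappers, at the cost of less parallelism per job.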
3. Reduce the number of reducers
The number of reducers determines the number of output files, so adjusting the number of reducers controls the number of files in the Hive table.
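One common way to apply this is sketched below, assuming hypothetical tables `source_table` and `target_table` and an illustrative reducer count of 10. `DISTRIBUTE BY rand()` forces a reduce phase and spreads rows roughly evenly across the reducers.

```sql
-- Fix the number of reducers for jobs in this session (10 is illustrative).
SET mapreduce.job.reduces=10;

-- Rewrite the data through a reduce phase; each reducer writes one output file,
-- so the target table ends up with roughly 10 files.
INSERT OVERWRITE TABLE target_table
SELECT *
FROM source_table
DISTRIBUTE BY rand();
```

Instead of a fixed count, `hive.exec.reducers.bytes.per.reducer` can be raised so that Hive estimates fewer reducers from the input size.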
4. HAR files
Use Hadoop's archive tool to archive small files; it packs multiple small files into a single HAR file.
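From within Hive, archiving can be sketched as follows, assuming a hypothetical partitioned table `logs` with partition column `dt`. Hive archives at partition level, and the part-file size shown is an illustrative value.

```sql
-- Enable Hive's archive support for this session.
SET hive.archive.enabled=true;
-- Tell Hive that the parent directory can be set when creating the archive.
SET hive.archive.har.parentdir.settable=true;
-- Size of the part files inside the archive, in bytes (illustrative: 1 TB).
SET har.partfile.size=1099511627776;

-- Pack the files of one partition into a HAR file (hypothetical names).
ALTER TABLE logs ARCHIVE PARTITION (dt='2022-07-06');

-- Restore the partition to regular files when it must be written again.
ALTER TABLE logs UNARCHIVE PARTITION (dt='2022-07-06');
```

An archived partition can still be queried but cannot be overwritten until it is unarchived. Archiving mainly reduces the number of file objects tracked by the NameNode; it does not compress the data.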
5. JVM reuse
By default, Hadoop usually forks a new JVM to run each map and reduce task. JVM startup can add considerable overhead, especially when a job consists of hundreds or thousands of tasks. JVM reuse allows a JVM instance to be reused up to N times within the same job.
The downside of this feature is that, with JVM reuse enabled, the task slots in use stay occupied so that they can be reused, and they are not released until the whole job completes.
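The value of N is set through a job property. A minimal sketch, assuming a classic (MRv1-style) MapReduce cluster; the value 10 is illustrative, and YARN-based clusters may ignore this setting.

```sql
-- Let each JVM run up to 10 tasks of the same job before it exits.
-- The default is 1 (no reuse); -1 means no limit.
-- Older Hadoop releases use the name mapred.job.reuse.jvm.num.tasks.
SET mapreduce.job.jvm.numtasks=10;
```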