Solve the problem of too many small files
2022-07-06 09:34:00 【Prism 7】
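1. Use Hive's built-in concatenate command to merge small files automatically
Usage: run ALTER TABLE ... CONCATENATE on a table, or on a single partition of a partitioned table. The command only supports tables stored as RCFile or ORC.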
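A minimal sketch of the command, for both a non-partitioned and a partitioned table. The table names and the partition column dt are hypothetical placeholders, not names from the original article.

```sql
-- Hypothetical table and partition names; substitute your own.

-- Non-partitioned table: merge the small files of the whole table.
ALTER TABLE small_files_tbl CONCATENATE;

-- Partitioned table: merge one partition at a time.
ALTER TABLE small_files_part_tbl PARTITION (dt='2022-07-06') CONCATENATE;
```

One pass does not always merge everything into a single file, so it may be worth checking the file count afterwards and running the command again if needed.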
2. Adjust parameters to reduce the number of map tasks
Merge small files before the map phase runs, so that each mapper receives several files combined into one split as its input. Then tune the split size limits, in particular the minimum split size.
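The settings below are one plausible way to do this in a Hive session. The property names are the older mapred.* forms, which newer Hadoop versions still accept as deprecated aliases. The byte values are illustrative assumptions and should be tuned to the cluster's block size.

```sql
-- Combine small files into larger splits before the map phase
-- (this input format is already the default in many Hive versions).
SET hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;

-- Upper bound on the size of one split, in bytes (illustrative: ~256 MB).
SET mapred.max.split.size=256000000;

-- Minimum split size per node and per rack, in bytes (illustrative: ~100 MB).
-- Leftover data smaller than these thresholds gets merged with other data.
SET mapred.min.split.size.per.node=100000000;
SET mapred.min.split.size.per.rack=100000000;
```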
3. Reduce the number of reducers
The number of reducers determines the number of output files, so adjusting the reducer count controls how many files a Hive table ends up with.
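As a sketch, the reducer count can be set directly or derived from the data volume per reducer. The values and the table names source_tbl and target_tbl are assumptions for illustration only.

```sql
-- Option A: fix the number of reducers for the session (illustrative value).
SET mapreduce.job.reduces=10;

-- Option B: let Hive derive the reducer count from the input size
-- by setting the amount of data each reducer handles, in bytes.
SET hive.exec.reducers.bytes.per.reducer=5120000000;

-- Rewrite the table through a reduce stage. DISTRIBUTE BY rand() spreads
-- rows evenly across reducers, so the output files are of similar size.
-- source_tbl and target_tbl are hypothetical names.
INSERT OVERWRITE TABLE target_tbl
SELECT * FROM source_tbl
DISTRIBUTE BY rand();
```

Use either option A or option B. When the reducer count is fixed explicitly, the bytes-per-reducer estimate is not used.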
4. HAR files
Use Hadoop's archive tool to archive small files. It packs multiple small files into a single HAR file.
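From within Hive, partition archiving could look like the sketch below. The table name logs and the partition column dt are hypothetical, and the part file size is an illustrative value.

```sql
-- Enable archiving for the session.
SET hive.archive.enabled=true;
-- Allow Hive to set the parent directory when creating the archive.
SET hive.archive.har.parentdir.settable=true;
-- Size of the archive part files, in bytes (illustrative: 1 TB).
SET har.partfile.size=1099511627776;

-- Pack the files of one partition into a HAR file.
-- "logs" and "dt" are hypothetical names.
ALTER TABLE logs ARCHIVE PARTITION (dt='2022-07-06');

-- Restore the partition to regular files, for example before rewriting it.
ALTER TABLE logs UNARCHIVE PARTITION (dt='2022-07-06');
```

An archived partition remains queryable, but it has to be unarchived before its data can be overwritten.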
5. JVM reuse
Hadoop's default configuration usually forks a new JVM for each map and reduce task. JVM startup can then add significant overhead, especially when a job contains hundreds or thousands of tasks. JVM reuse allows a JVM instance to be reused up to N times within the same job.
The downside of this feature is that, with JVM reuse enabled, the task slots in use stay occupied so that they can be reused, and they are not released until the job finishes.
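A sketch of how N is typically set in a Hive session. The property below is the classic MapReduce (MRv1) name and the value 10 is an illustrative assumption. On YARN-based MapReduce this setting may have no effect, so it is worth checking against the Hadoop version in use.

```sql
-- Let each JVM run up to 10 tasks of the same job before exiting
-- (illustrative value; MRv1 property name).
SET mapred.job.reuse.jvm.num.tasks=10;
```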