[yarn] yarn container log cleaning
2022-07-06 11:31:00 【kiraraLou】
Preface
Today let's go through the log cleaning mechanism for YARN containers.
I. Container log directory structure
The log directory structure of a YARN container is shown in the following figure.
NodeManager creates the same directory structure for an application under every configured log directory and assigns those directories to containers in round-robin fashion. Each container writes three kinds of logs:
- stdout: output printed through the standard output functions, such as System.out.print in Java.
- stderr: output written to standard error.
- syslog: log messages printed through log4j. This is the most common way to log, and YARN itself logs this way by default. In other words, usually only this file has content and the other two are empty.
The log directories themselves are set by the yarn.nodemanager.log-dirs parameter.
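The round-robin assignment described above can be sketched as follows. This is an illustration only, not NodeManager's actual code: the function name, the example directories and the IDs are hypothetical, and the layout `<log-dir>/<application id>/<container id>/<log type>` is an assumption about what the figure shows.

```python
from itertools import cycle
import posixpath

# The three per-container log files named in the text.
LOG_TYPES = ("stdout", "stderr", "syslog")


def assign_log_dirs(log_dirs, app_id, container_ids):
    """Map each container to its log file paths, rotating through log_dirs.

    Assumed layout: <log-dir>/<application id>/<container id>/<log type>.
    """
    rotation = cycle(log_dirs)  # round-robin over the configured directories
    layout = {}
    for container_id in container_ids:
        base = posixpath.join(next(rotation), app_id, container_id)
        layout[container_id] = [posixpath.join(base, t) for t in LOG_TYPES]
    return layout


if __name__ == "__main__":
    # Hypothetical values standing in for yarn.nodemanager.log-dirs and real IDs.
    layout = assign_log_dirs(
        ["/data1/yarn/logs", "/data2/yarn/logs"],
        "application_1_0001",
        ["container_1_0001_01_000001", "container_1_0001_01_000002"],
    )
    for container_id, files in layout.items():
        print(container_id, files)
```

With two directories configured, consecutive containers alternate between them, which spreads log writes across disks.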
II. Log cleaning mechanism
NodeManager saves the running logs of every container to local disk, so the volume of logs keeps growing over time. To keep container logs from filling up the disk, NodeManager cleans up log files periodically. This is handled by the LogHandler component, which currently has two implementations: NonAggregatingLogHandler and LogAggregationService.
In short, NodeManager provides two log cleaning mechanisms: periodic deletion (implemented by NonAggregatingLogHandler) and log aggregation and transfer (implemented by LogAggregationService). Periodic deletion is the default.
1. Periodic deletion
NodeManager keeps an application's logs on disk for yarn.nodemanager.log.retain-seconds (in seconds; the default is 3×60×60 = 10800, i.e. 3 hours). Once that time has passed, NodeManager deletes all of the application's logs from disk.
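The retention rule amounts to a simple time comparison. The sketch below is hypothetical and only illustrates the arithmetic; it assumes the retention window is counted from the moment the application finishes, which the text does not state explicitly.

```python
# Default of yarn.nodemanager.log.retain-seconds as given in the text (3 hours).
DEFAULT_RETAIN_SECONDS = 3 * 60 * 60


def is_due_for_deletion(app_finish_time, now, retain_seconds=DEFAULT_RETAIN_SECONDS):
    """Return True once retain_seconds have elapsed since the application finished.

    Both timestamps are in seconds. Counting from application finish is an
    assumption made for this illustration.
    """
    return now - app_finish_time >= retain_seconds


if __name__ == "__main__":
    finished_at = 1_000_000
    print(is_due_for_deletion(finished_at, finished_at + 3600))   # one hour later
    print(is_due_for_deletion(finished_at, finished_at + 10800))  # three hours later
```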
2. Log aggregation and transfer
Besides periodic deletion, NodeManager offers another way to handle logs: log aggregation and transfer. Administrators enable it by setting the yarn.log-aggregation-enable parameter to true.
This mechanism uses HDFS as the log aggregation store: it uploads the logs produced by an application to HDFS for unified management and maintenance. It has two stages: file upload and file lifecycle management.
(1) File upload
When an application finishes running, all of its logs are uploaded to the HDFS directory ${remoteRootLogDir}/${user}/${suffix}/${appid}, where:
- ${remoteRootLogDir} is set by the parameter yarn.nodemanager.remote-app-log-dir; the default is /tmp/logs
- ${user} is the application owner
- ${suffix} is set by the parameter yarn.nodemanager.remote-app-log-dir-suffix; the default is "logs"
- ${appid} is the application ID
All logs from the same node are saved into a single file in that directory, and each file is named after its node ID.
The log structure is shown in the following figure.
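As a rough illustration of how the path is put together, the sketch below builds the per-node file path from the pieces listed above. The function name and the user name are hypothetical, and turning a node ID such as host:port into host_port is an assumption, chosen to match the node address format used with the yarn logs command.

```python
import posixpath


def aggregated_log_path(user, app_id, node_id,
                        remote_root="/tmp/logs", suffix="logs"):
    """Build ${remoteRootLogDir}/${user}/${suffix}/${appid}/<node file>.

    remote_root and suffix default to the values given in the text.
    Replacing ':' with '_' in the node ID is an assumption of this sketch.
    """
    app_dir = posixpath.join(remote_root, user, suffix, app_id)
    node_file = node_id.replace(":", "_")
    return posixpath.join(app_dir, node_file)


if __name__ == "__main__":
    print(aggregated_log_path("alice", "application_130332321231_0001",
                              "127.0.0.1:45454"))
    # prints /tmp/logs/alice/logs/application_130332321231_0001/127.0.0.1_45454
```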
Once all logs have been uploaded to HDFS, the log files on local disk are deleted. In addition, to cut down on unnecessary uploads, NodeManager lets users specify which logs to upload. Three log types are currently supported:
- ALL_CONTAINERS: upload the logs of all containers (this is the default)
- APPLICATION_MASTER_ONLY: upload only the logs generated by the ApplicationMaster
- AM_AND_FAILED_CONTAINERS_ONLY: upload the logs generated by the ApplicationMaster and by failed containers
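The three log types can be read as a filter over containers. The sketch below is a hypothetical illustration of that filter, not the NodeManager implementation; in particular, treating a non-zero exit code as "failed" is an assumption.

```python
from enum import Enum


class LogUploadPolicy(Enum):
    """The three log types named in the text (the class name is hypothetical)."""
    ALL_CONTAINERS = "ALL_CONTAINERS"
    APPLICATION_MASTER_ONLY = "APPLICATION_MASTER_ONLY"
    AM_AND_FAILED_CONTAINERS_ONLY = "AM_AND_FAILED_CONTAINERS_ONLY"


def should_upload(policy, is_application_master, exit_code):
    """Decide whether one container's logs are uploaded under the given policy.

    Assumption: a container counts as failed when its exit code is non-zero.
    """
    if policy is LogUploadPolicy.ALL_CONTAINERS:
        return True
    if policy is LogUploadPolicy.APPLICATION_MASTER_ONLY:
        return is_application_master
    # AM_AND_FAILED_CONTAINERS_ONLY
    return is_application_master or exit_code != 0


if __name__ == "__main__":
    policy = LogUploadPolicy.AM_AND_FAILED_CONTAINERS_ONLY
    print(should_upload(policy, is_application_master=False, exit_code=0))
    print(should_upload(policy, is_application_master=False, exit_code=1))
```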
(2) File lifecycle management
The lifecycle of logs transferred to HDFS is no longer NodeManager's responsibility; it is managed by the JobHistory service. For the MapReduce framework, for example, its dedicated JobHistory service periodically cleans up the logs that MapReduce jobs have transferred to HDFS. Each log file is retained for at most yarn.log-aggregation.retain-seconds (in seconds; the default in yarn-default.xml is -1, which disables deletion, so set a positive value if aggregated logs should expire).
Users can view application logs in two ways: through the NodeManager web interface, or with a shell command.
To view all logs generated by an application, the command is:
bin/yarn logs -applicationId application_130332321231_0001
To view the logs generated by a single container, the command is:
bin/yarn logs -applicationId application_130332321231_0001 -containerId container_130332321231_0002 -nodeAddress 127.0.0.1_45454
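When these commands are driven from a script, it can help to assemble the argument list in one place. The helper below is a hypothetical sketch that only builds the command from the flags shown above; it does not run anything, and the bin/yarn path is taken from the examples as-is.

```python
import shlex


def yarn_logs_command(app_id, container_id=None, node_address=None,
                      yarn_bin="bin/yarn"):
    """Build the argument list for `yarn logs` using the flags shown in the text."""
    cmd = [yarn_bin, "logs", "-applicationId", app_id]
    if container_id is not None:
        cmd += ["-containerId", container_id]
    if node_address is not None:
        cmd += ["-nodeAddress", node_address]
    return cmd


if __name__ == "__main__":
    cmd = yarn_logs_command(
        "application_130332321231_0001",
        container_id="container_130332321231_0002",
        node_address="127.0.0.1_45454",
    )
    # Print the command; on a machine with a YARN client it could be run
    # with subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```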
Summary
YARN container logs are cleaned up by one of two mechanisms: local periodic deletion, or log aggregation and transfer followed by deletion.
- How long local YARN container logs are kept is controlled by yarn.nodemanager.log.retain-seconds.
- yarn.log-aggregation-enable turns on log aggregation and transfer.
- How long logs are kept after transfer is controlled by yarn.log-aggregation.retain-seconds.
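To show the three parameters side by side, the sketch below renders them as property entries in the `<configuration>`/`<property>` layout used by Hadoop XML configuration files such as yarn-site.xml. The helper name is hypothetical, and the 7-day retention value is an illustrative assumption, not a recommendation from this article. `ET.indent` requires Python 3.9 or later.

```python
import xml.etree.ElementTree as ET


def render_properties(settings):
    """Render a name -> value mapping as a Hadoop-style <configuration> block."""
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    ET.indent(root)  # pretty-print; available from Python 3.9
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_properties({
        # Local retention: 3 hours, the default given in the text.
        "yarn.nodemanager.log.retain-seconds": 3 * 60 * 60,
        # Turn on log aggregation and transfer.
        "yarn.log-aggregation-enable": "true",
        # Retention on HDFS: 7 days, an illustrative value only.
        "yarn.log-aggregation.retain-seconds": 7 * 24 * 60 * 60,
    }))
```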