Recovering application records after a YARN restart
2022-07-01 13:00:00 【fanxl12】
Edit the yarn-site.xml configuration file
ResourceManager restart recovery
In yarn-site.xml, set yarn.resourcemanager.recovery.enabled to true (the default is false):
<property>
  <name>yarn.resourcemanager.recovery.enabled</name>
  <value>true</value>
</property>
Set yarn.resourcemanager.store.class, which specifies the storage medium the RM saves its state to before a restart. Three stores are currently available:
org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore
The default. File-system-based storage (local file system or HDFS). The storage path is given by yarn.resourcemanager.fs.state-store.uri, which should be set when this store is selected.
org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore
ZooKeeper-based storage. This is the only option when RM high availability is enabled: with HA, both RMs may believe they are the active one, which leads to split-brain. The ZK-based store prevents this by fencing the state data. Its parameters are hadoop.zk.address (the list of ZK node addresses) and yarn.resourcemanager.zk-state-store.parent-path (the root znode path for the state data).
org.apache.hadoop.yarn.server.resourcemanager.recovery.LeveldbRMStateStore
LevelDB-based storage. It is lighter than the other two, uses far less storage space and I/O, and offers better support for atomic operations. Use it when performance is the top priority. The storage path is given by yarn.resourcemanager.leveldb-state-store.path.
<property>
  <name>yarn.resourcemanager.store.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore</value>
</property>
Set yarn.resourcemanager.fs.state-store.uri. It is needed because yarn.resourcemanager.store.class is org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore; here the state is stored on HDFS:
<property>
  <name>yarn.resourcemanager.fs.state-store.uri</name>
  <value>hdfs://hadoop-master:9010/rmstore</value>
</property>
Finally, set yarn.resourcemanager.work-preserving-recovery.scheduling-wait-ms. This is how long the RM waits after a restart to sync Container information from the NMs; new Containers are allocated only after this period has elapsed. The default is 10000 (10 seconds) and usually does not need to be changed.
<property>
  <name>yarn.resourcemanager.work-preserving-recovery.scheduling-wait-ms</name>
  <value>10000</value>
</property>
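Before restarting the RM, it can help to sanity-check the file. The following is a minimal sketch, not part of Hadoop: the helper names load_properties and check_rm_recovery and the SAMPLE_XML string are made up for illustration, and the rule that the URI must be present for FileSystemRMStateStore simply follows the recommendation above.

```python
import xml.etree.ElementTree as ET

# Class name taken from the article; used as the default store here.
FS_STORE = (
    "org.apache.hadoop.yarn.server.resourcemanager.recovery."
    "FileSystemRMStateStore"
)

# Hypothetical sample mirroring the settings shown above.
SAMPLE_XML = """
<configuration>
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore</value>
  </property>
  <property>
    <name>yarn.resourcemanager.fs.state-store.uri</name>
    <value>hdfs://hadoop-master:9010/rmstore</value>
  </property>
</configuration>
"""


def load_properties(xml_text):
    """Parse Hadoop-style <configuration> XML into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def check_rm_recovery(props):
    """Return a list of problems with the RM recovery settings (empty if none)."""
    problems = []
    enabled = props.get("yarn.resourcemanager.recovery.enabled", "false")
    if enabled.lower() != "true":
        problems.append("yarn.resourcemanager.recovery.enabled is not true")
    store = props.get("yarn.resourcemanager.store.class", FS_STORE)
    if store == FS_STORE and not props.get("yarn.resourcemanager.fs.state-store.uri"):
        problems.append(
            "yarn.resourcemanager.fs.state-store.uri should be set "
            "for FileSystemRMStateStore"
        )
    return problems


if __name__ == "__main__":
    for problem in check_rm_recovery(load_properties(SAMPLE_XML)):
        print(problem)
```

To check a real cluster, read the contents of yarn-site.xml from wherever the Hadoop configuration directory lives and pass the text to load_properties.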
Configure automatic NodeManager restart recovery
In yarn-site.xml, set yarn.nodemanager.recovery.enabled to true (the default is false):
<property>
  <name>yarn.nodemanager.recovery.enabled</name>
  <value>true</value>
</property>
Set yarn.nodemanager.recovery.dir, the local path the NM writes Container state to before a restart. The default is ${hadoop.tmp.dir}/yarn-nm-recovery.
<property>
  <name>yarn.nodemanager.recovery.dir</name>
  <value>/opt/topology/db_data/hadoop-data/yarn-nm-recovery</value>
</property>
Set yarn.nodemanager.address, the NM's RPC address. The default is ${yarn.nodemanager.hostname}:0, which picks a random ephemeral port. Be sure to pin it to a fixed port (8041, for example); otherwise the NM comes back on a different port after a restart and the Container state cannot be recovered.
<property>
  <name>yarn.nodemanager.address</name>
  <value>hadoop-master:45454</value>
</property>
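The fixed-port requirement is easy to miss, so it can be checked mechanically. The sketch below assumes the properties have already been read into a plain dict; the function name check_nm_recovery is hypothetical and not part of Hadoop.

```python
def check_nm_recovery(props):
    """Return a list of problems with the NM recovery settings (empty if none).

    `props` is a {name: value} dict of yarn-site.xml properties.
    """
    problems = []

    enabled = props.get("yarn.nodemanager.recovery.enabled", "false")
    if enabled.strip().lower() != "true":
        problems.append("yarn.nodemanager.recovery.enabled is not true")

    # An unset address falls back to ${yarn.nodemanager.hostname}:0,
    # i.e. a random ephemeral port, so treat "missing" like port 0.
    address = props.get("yarn.nodemanager.address", "")
    _host, sep, port = address.rpartition(":")
    if not sep or not port.isdigit() or int(port) == 0:
        problems.append("yarn.nodemanager.address must use a fixed, non-zero port")

    return problems


if __name__ == "__main__":
    settings = {
        "yarn.nodemanager.recovery.enabled": "true",
        "yarn.nodemanager.recovery.dir": "/opt/topology/db_data/hadoop-data/yarn-nm-recovery",
        "yarn.nodemanager.address": "hadoop-master:45454",
    }
    for problem in check_nm_recovery(settings):
        print(problem)
```

The check treats a missing address the same as port 0 because both leave the NM on an ephemeral port.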