DataNode Decommission
2022-07-26 22:40:00 【开着拖拉机回家】

Administrators of a Hadoop cluster often need to add nodes to the cluster or remove nodes from it. For example, to expand storage capacity a new node must be brought online; conversely, to shrink the cluster, nodes must be removed. If a node behaves abnormally, say its failure rate is too high or its performance too poor, it should be taken offline. To remove a node without shutting down the cluster and without losing any data blocks held on that machine, use the decommissioning procedure below.
1. Configure hdfs-site.xml
On the Active NameNode, add the hostnames of the DataNodes to be decommissioned to the file specified by dfs.hosts.exclude (configured in hdfs-site.xml), one hostname per line. It is recommended to decommission fewer nodes at a time than the HDFS replication factor.
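For reference, the dfs.hosts.exclude entry in hdfs-site.xml might look like the following; the path /etc/hadoop/conf/dfs.exclude is an assumed example, so point it at whatever exclude file your deployment uses:

```xml
<property>
  <name>dfs.hosts.exclude</name>
  <!-- Assumed example path; use the exclude file location of your deployment -->
  <value>/etc/hadoop/conf/dfs.exclude</value>
  <description>
    File listing hosts that are not permitted to connect to the NameNode;
    DataNodes listed here are decommissioned.
  </description>
</property>
```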

2. The decommission node list
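A minimal sketch of populating the exclude file, one hostname per line. The hostnames and the file path are hypothetical; substitute your own DataNodes and the path configured under dfs.hosts.exclude:

```shell
# Hypothetical hostnames and path; substitute your own DataNodes and the
# path configured under dfs.hosts.exclude (e.g. /etc/hadoop/conf/dfs.exclude).
EXCLUDE_FILE=./dfs.exclude
printf '%s\n' dn03.example.com dn04.example.com > "$EXCLUDE_FILE"
cat "$EXCLUDE_FILE"
```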

3. Refresh the NameNode and the ResourceManager
On the Active NameNode, run the following commands:
[[email protected] ~]# hdfs dfsadmin -refreshNodes
[[email protected] ~]# yarn rmadmin -refreshNodes

4. Check the web UI: the decommissioning nodes should show the status "decommission in progress"
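Besides the web UI, the per-node status can also be checked on the command line with hdfs dfsadmin -report. A minimal sketch of filtering that report for the status lines, using a made-up fragment of its output (the hostname and capacity are illustrative; on a real cluster pipe the actual command instead):

```shell
# Hypothetical fragment of `hdfs dfsadmin -report` output; on a real
# cluster run the command itself and pipe it into the grep below.
report='Name: 192.168.0.13:50010 (dn03.example.com)
Hostname: dn03.example.com
Decommission Status : Decommission in progress
Configured Capacity: 1099511627776 (1 TB)'

# Pull out the per-DataNode decommission status lines
status=$(printf '%s\n' "$report" | grep 'Decommission Status')
echo "$status"
```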


5. Tune parameters to speed up DataNode decommissioning
| Parameter | Default | Meaning |
|---|---|---|
| dfs.namenode.decommission.interval | 30 | Interval (seconds) between runs of the monitor thread that processes decommissioning nodes |
| dfs.namenode.decommission.blocks.per.interval | 500000 | Maximum number of blocks processed per batch |
| dfs.namenode.decommission.max.concurrent.tracked.nodes | 100 | Number of nodes whose decommissioning is processed concurrently |
| dfs.namenode.replication.work.multiplier.per.iteration | 32 | Number of blocks replicated per iteration = number of DataNodes × this value |
| dfs.namenode.replication.max-streams | 64 | Maximum number of replication tasks assigned to a single DataNode |
| dfs.namenode.replication.max-streams-hard-limit | 128 | A DataNode whose replication task count exceeds this value will not be selected as a replication source |
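The three decommission-monitor settings in the table can likewise be overridden in hdfs-site.xml. The values below simply restate the table's defaults for illustration, not as tuning recommendations:

```xml
<property>
  <name>dfs.namenode.decommission.interval</name>
  <value>30</value>
  <description>Seconds between runs of the decommission monitor thread.</description>
</property>
<property>
  <name>dfs.namenode.decommission.blocks.per.interval</name>
  <value>500000</value>
  <description>Maximum number of blocks processed per monitor run.</description>
</property>
<property>
  <name>dfs.namenode.decommission.max.concurrent.tracked.nodes</name>
  <value>100</value>
  <description>Maximum number of nodes tracked as decommissioning at once.</description>
</property>
```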
<property>
<name>dfs.namenode.replication.max-streams</name>
<value>2</value>
<description>
Hard limit for the number of highest-priority replication streams.
</description>
</property>
<property>
<name>dfs.namenode.replication.max-streams-hard-limit</name>
<value>4</value>
<description>
Hard limit for all replication streams.
</description>
</property>
<property>
<name>dfs.namenode.replication.work.multiplier.per.iteration</name>
<value>2</value>
<description>
*Note*: Advanced property. Change with caution.
This determines the total amount of block transfers to begin in
parallel at a DN, for replication, when such a command list is being
sent over a DN heartbeat by the NN. The actual number is obtained by
multiplying this multiplier with the total number of live nodes in the
cluster. The result number is the number of blocks to begin transfers
immediately for, per DN heartbeat. This number can be any positive,
non-zero integer.
</description>
</property>

dfs.namenode.replication.work.multiplier.per.iteration defaults to 2, i.e. each iteration schedules (number of DataNodes × 2) blocks for transfer.