Chapter 6 DataNode
6.1. DataNode Working Mechanism
- Each data block on a DataNode is stored as files on disk: one file is the data itself, and the other is the metadata, which includes the block length, the block checksums, and a timestamp.
- After a DataNode starts, it registers with the NameNode; once registration succeeds, it periodically reports all of its block information to the NameNode.
- Heartbeats are sent every 3 seconds. The heartbeat response carries any commands the NameNode has for the DataNode, such as copying a block to another machine or deleting a block. If no heartbeat is received from a DataNode for more than 10 minutes (plus 30 seconds; see 6.3), the node is considered unavailable.
- Machines can safely be added to or removed from the cluster while it is running; the currently registered nodes can be inspected as shown below.
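To inspect which DataNodes are currently registered and sending heartbeats, you can use the standard admin report (a built-in HDFS command; the exact output fields vary by Hadoop version):
hdfs dfsadmin -report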
6.2. Data Integrity
- When a DataNode reads a block, it also computes the block's checksum.
- If the computed checksum differs from the value recorded when the block was created, the block is corrupted.
- The client then reads the block from another DataNode.
- DataNodes also periodically verify block checksums after the files have been created; a manual check is sketched below.
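As a hedged sketch of a manual check: Hadoop 2.7+ ships an hdfs debug subcommand that verifies a block file on a DataNode's disk against its .meta checksum file. The block and meta file paths below are illustrative; real block files live under the DataNode data directory.
hdfs debug verifyMeta -meta /path/to/blk_1073741825_1001.meta -block /path/to/blk_1073741825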
6.3. DataNode Timeout Parameter Settings
When a DataNode process dies, or a network fault prevents the DataNode from communicating with the NameNode, the NameNode does not immediately declare the node dead; it waits for a period of time, referred to here as the timeout. The default HDFS timeout is 10 minutes + 30 seconds. Writing the timeout as timeout, the formula is: timeout = 2 * dfs.namenode.heartbeat.recheck-interval + 10 * dfs.heartbeat.interval
Explanation: dfs.namenode.heartbeat.recheck-interval defaults to 5 minutes and dfs.heartbeat.interval defaults to 3 seconds. In the hdfs-site.xml configuration file, dfs.namenode.heartbeat.recheck-interval is given in milliseconds and dfs.heartbeat.interval in seconds, so with the defaults timeout = 2 × 300 s + 10 × 3 s = 630 s = 10 minutes 30 seconds.
<property>
    <name>dfs.namenode.heartbeat.recheck-interval</name>
    <value>300000</value>
</property>
<property>
    <name>dfs.heartbeat.interval</name>
    <value>3</value>
</property>
6.4. Adding New Data Nodes
Requirement: as the company's business grows, there is more and more data, and the capacity of the existing data nodes can no longer meet the storage needs, so new data nodes must be added dynamically to the existing cluster.
Environment preparation (a command sketch follows this list)
- Clone a virtual machine (from the NameNode host)
- Modify its IP address and host name
- Configure passwordless SSH login for the new node
- Delete the files in the data and logs directories on the new node (because it was cloned from the NameNode host)
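A minimal command sketch for these preparation steps on the new node hadoop105, assuming a CentOS-style system and the paths used elsewhere in this chapter (the data directory location is an assumption; adjust it to your installation):
hostnamectl set-hostname hadoop105        # set the new host name; configure the static IP in your network scripts as usual
ssh-copy-id hadoop105                     # run from the existing nodes to enable passwordless SSH to the new node
rm -rf /opt/module/hadoop-2.7.2/data /opt/module/hadoop-2.7.2/logs   # remove data and logs inherited from the cloned NameNode host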
Operation steps
- Create a dfs.hosts file under the /opt/module/hadoop-2.7.2/etc/hadoop directory on the namenode
touch dfs.hosts
vi dfs.hosts
Add the following host names (all DataNode nodes, including the new node hadoop105)
hadoop102
hadoop103
hadoop104
hadoop105
Note: dfs.hosts lists the nodes allowed to connect to the NameNode. If it is empty, all DataNodes may connect to the NameNode; if it is not empty, only the DataNodes listed in the file may connect. dfs.hosts.exclude lists the nodes that are not allowed to connect to the NameNode. A node that appears in both dfs.hosts and dfs.hosts.exclude can still connect but will be decommissioned (this is how the decommission procedure in 6.5 works).
- Add the dfs.hosts property to the hdfs-site.xml configuration file on the namenode
<property>
    <name>dfs.hosts</name>
    <value>/opt/module/hadoop-2.7.2/etc/hadoop/dfs.hosts</value>
</property>
- Refresh namenode
hdfs dfsadmin -refreshNodes
The results are shown below
Refresh nodes successful
- Refresh resourcemanager node
yarn rmadmin -refreshNodes
The results are shown below
INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.1.103:8033
- Add the new host name to the slaves file on the NameNode and the DataNodes
hadoop102
hadoop103
hadoop104
hadoop105
- Start the data node and node manager on the new node
sbin/hadoop-daemon.sh start datanode
The results are shown below
starting datanode, logging to /opt/module/hadoop-2.7.2/logs/hadoop-lubin-datanode-hadoop105.out
sbin/yarn-daemon.sh start nodemanager
The results are shown below
starting nodemanager, logging to /opt/module/hadoop-2.7.2/logs/yarn-lubin-nodemanager-hadoop105.out
Check in the web browser that the new node appears and is OK
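Besides the web UI, the cluster topology can also be checked from the command line; hdfs dfsadmin -printTopology is a standard subcommand that lists the racks and the DataNodes registered under them, so the new node should appear there:
hdfs dfsadmin -printTopology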
If the data is unbalanced, the cluster can be rebalanced with a command (run the following command in the sbin directory)
./start-balancer.sh
As shown below
starting balancer, logging to /opt/module/hadoop-2.7.2/logs/hadoop-lubin-balancer-hadoop102.out
Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved
6.5. Decommissioning Old Data Nodes
(1) Create a dfs.hosts.exclude file under the $HADOOP_HOME/etc/hadoop directory on the namenode
touch dfs.hosts.exclude
Add the host name of the node to be decommissioned
hadoop105
(2) Add the dfs.hosts.exclude property to the hdfs-site.xml configuration file on the namenode
<property>
    <name>dfs.hosts.exclude</name>
    <value>/opt/module/hadoop-2.7.2/etc/hadoop/dfs.hosts.exclude</value>
</property>
(3) Refresh the NameNode on the namenode node, and refresh the ResourceManager on the resourcemanager node
$ hdfs dfsadmin -refreshNodes
The following message appears, indicating that the refresh was successful
Refresh nodes successful
$ yarn rmadmin -refreshNodes
The following message appears, indicating that the refresh was successful
INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.1.103:8033
(4) Check the web browser: the state of the decommissioning node is decommission in progress, which indicates that the data node is copying its blocks to other nodes.
(5) Wait until the state of the node becomes decommissioned (all of its blocks have been copied), then stop the node and the node manager. Note: if the replication factor is 3 and the number of nodes remaining in service is less than or equal to 3, the decommission cannot complete; you must lower the replication factor before decommissioning.
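The decommission state can also be watched from the command line: each node's entry in the dfsadmin report contains a "Decommission Status" field (the grep below is only a convenience; the exact report layout varies by version):
hdfs dfsadmin -report | grep -B 2 "Decommission Status"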
(6) On the decommissioned node, stop the DataNode and NodeManager processes
sbin/hadoop-daemon.sh stop datanode
stopping datanode
sbin/yarn-daemon.sh stop nodemanager
stopping nodemanager
(7) Delete the decommissioned node hadoop105 from the dfs.hosts file on the namenode
hadoop102
hadoop103
hadoop104
Refresh the namenode and refresh the resourcemanager
$ hdfs dfsadmin -refreshNodes
Refresh nodes successful
$ yarn rmadmin -refreshNodes
INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.1.103:8033
(8) Delete the decommissioned node hadoop105 from the slaves file on the NameNode and the DataNodes
hadoop102
hadoop103
hadoop104
(9) If the data is unbalanced, the cluster can be rebalanced with a command
$ sbin/start-balancer.sh
starting balancer, logging to /opt/module/hadoop-2.7.2/logs/hadoop-lubin-balancer-hadoop102.out
Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved
6.6. DataNode Multi-directory Configuration
- A DataNode can also be configured with multiple directories, with each directory storing different data; that is, the directories are not replicas of one another.
- Modify the configuration file hdfs-site.xml
<property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///${hadoop.tmp.dir}/dfs/data1,file:///hd2/dfs/data2</value>
</property>
Note: the default value is file:///${hadoop.tmp.dir}/dfs/data. If the server has multiple disks, this parameter must be modified (pay attention to the access permissions of the mounted disks). Because the disk layout differs from one server node to another, there is no need to distribute this configuration to the other nodes after modifying it.
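To confirm which directories a node is actually using after the change, the effective configuration value can be printed on that node (hdfs getconf reads the local configuration files):
hdfs getconf -confKey dfs.datanode.data.dir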
6.7. Data Balancing
6.7.1. Data Balancing Between Hadoop Nodes
(1) Enable data balancing
sbin/start-balancer.sh -threshold 10
The parameter 10 means that the disk space utilization of the nodes in the cluster may differ by no more than 10%; adjust it according to the actual situation
(2) Stop data balancing
sbin/stop-balancer.sh
Note: because HDFS needs to start a separate Rebalance Server to perform the rebalance operation, try not to run start-balancer.sh on the NameNode.
(3) Hadoop 2.x supports data balancing between nodes
hdfs balancer -help
Usage: java Balancer
[-policy <policy>] the balancing policy: datanode or blockpool
[-threshold <threshold>] Percentage of disk capacity
[-exclude [-f <hosts-file> | comma-separated list of hosts]] Excludes the specified datanodes.
[-include [-f <hosts-file> | comma-separated list of hosts]] Includes only the specified datanodes.
To run the balancer more efficiently, the following recommendations apply:
-threshold 10: the target parameter that determines whether the cluster is balanced. The difference between each DataNode's storage utilization and the cluster's overall storage utilization should be smaller than this threshold. In theory, the smaller this parameter, the more balanced the cluster; but in a production environment, data is written and deleted concurrently while the cluster is balancing, so the configured threshold may never be reached.
-include: the list of DataNodes to be balanced
-exclude: the list of DataNodes that should not be balanced
hdfs dfsadmin -setBalancerBandwidth xxx: sets the bandwidth the balancer may use while running; setting it too high may slow down MapReduce jobs.
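A hedged example combining these options (the host names and the bandwidth value are illustrative):
hdfs dfsadmin -setBalancerBandwidth 10485760                         # cap balancer traffic at 10 MB/s per DataNode
hdfs balancer -threshold 10 -include hadoop102,hadoop103,hadoop104   # balance only the listed nodes to within 10% of the cluster average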
The CDH Balancer is very simple to use: just set the parameters above and click the Actions → Rebalance menu item, and balancing starts automatically. If a balancer run was started earlier, it must be stopped first; if it cannot be stopped, kill it.
6.7.2. Data Balancing Between Disks
(1) Turn on the disk balancer. In CDH 5.8.2+ this can be configured through CM. If the Hadoop version in use is 3.0+ (Hadoop 2.x does not support data balancing between disks), add the relevant property to hdfs-site.xml: set dfs.disk.balancer.enabled to true, as shown below.
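A minimal hdfs-site.xml snippet for step (1), assuming a Hadoop 3.0+ release where this property exists:
<property>
    <name>dfs.disk.balancer.enabled</name>
    <value>true</value>
</property>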
(2) Generate a balancing plan
hdfs diskbalancer -plan cdh4
Note: cdh4 is the host name of the node whose disks need to be balanced
(3) Execute the balancing plan
hdfs diskbalancer -execute /system/diskbalancer/XXXXX/<host name>.plan.json
(4) View the execution status: hdfs diskbalancer -query <host name>
(5) Completion and inspection
Hadoop 2.x cannot automatically balance disks within a node. You can add the following parameters to hdfs-site.xml on the node to change the data placement policy and achieve balance; restart the DataNode on that node after adding the parameters.
<property>
    <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
    <value>org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy</value>
</property>
<property>
    <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold</name>
    <value>10737418240</value>
</property>
<property>
    <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction</name>
    <value>0.85f</value>
</property>
Parameter interpretation:
dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold: the allowed difference between the free space of the disk with the most remaining capacity and that of the disk with the least (10 GB by default, as configured above).
dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction: if the current replica is larger than the maximum free space of all disks in lowAvailableVolumes, it is stored in highAvailableVolumes; otherwise, with the 0.85f value configured above, it is stored in highAvailableVolumes with 85% probability and in lowAvailableVolumes with 15% probability.
Note: this only affects data stored afterwards; it has no effect on data already stored on the disks. Use the balancer to move data that has already been stored.