The whole process of building a fully distributed cluster
2020-11-09 08:20:00 【osc_ychelles】
Fully distributed cluster, complete steps
Three machines: hdp01, hdp02, and hdp03.
The idea is to configure hdp01 first, then clone it to create hdp02 and hdp03.
One. Build hdp01
1. Turn off the firewall (disable only takes effect at the next boot, so checking the status right away may still show it running; after a reboot the status shows it is off)
systemctl disable firewalld
Check status
systemctl status firewalld
reboot ---- restart the machine
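If you prefer not to reboot, stopping the service takes effect immediately (stop handles the current session, disable handles future boots):
systemctl stop firewalld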
2. Change the IP
vi /etc/sysconfig/network-scripts/ifcfg-ens33
Six settings need to be changed:
1.BOOTPROTO=static
2.ONBOOT=yes
3.IPADDR=192.168.73.102
4.NETMASK=255.255.255.0
5.GATEWAY=192.168.73.2
6.DNS1=8.8.8.8
DNS2=114.114.114.114
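For reference, a complete ifcfg-ens33 might look like the sketch below; the DEVICE/NAME values (and any UUID line your system generated) stay as they are, and the addresses are just the example values from above:
TYPE=Ethernet
BOOTPROTO=static
NAME=ens33
DEVICE=ens33
ONBOOT=yes
IPADDR=192.168.73.102
NETMASK=255.255.255.0
GATEWAY=192.168.73.2
DNS1=8.8.8.8
DNS2=114.114.114.114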
After changing the IP, be sure to restart the network, otherwise the change will not take effect:
systemctl restart network
Then check the IP again:
ip addr
3. Change the hostname (the first machine does not have to be changed; on the second machine, for example:)
hostnamectl set-hostname hdp02
You can verify with the hostname command.
4. Change the mapping file
vi /etc/hosts
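A sketch of the entries to add, assuming the three machines sit at 192.168.73.101-103 (substitute your actual addresses):
192.168.73.101 hdp01
192.168.73.102 hdp02
192.168.73.103 hdp03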
5. If you connect remotely with a tool such as FinalShell, also update the local hosts file:
C:\Windows\System32\drivers\etc\hosts
6. Generate the SSH keys
ssh-keygen -t rsa
cd ~/.ssh and you can see the generated public and private key.
A node that runs a DataNode must also set up passwordless SSH to itself (every host listed in slaves is a DataNode):
ssh-copy-id localhost
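Since hdp02 and hdp03 are later cloned from hdp01, they inherit the same key pair and authorized_keys, so passwordless login between them may already work. If not, copying the public key to each hostname explicitly does the job (a sketch; run once the clones are up):
ssh-copy-id hdp01
ssh-copy-id hdp02
ssh-copy-id hdp03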
7. Install JDK and Hadoop
Extract to the directory of your choice:
tar -zxvf hadoop-2.7.6.tar.gz -C /usr/local/
You can rename the directory, since the original name is long:
mv hadoop-2.7.6/ hadoop
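The JDK is installed the same way; a sketch, where the tarball name depends on the version you actually downloaded (jdk-8u162 here is only an example), and the rename matches the JAVA_HOME used below:
tar -zxvf jdk-8u162-linux-x64.tar.gz -C /usr/local/
mv /usr/local/jdk1.8.0_162/ /usr/local/jdk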
7.1 Install ntp and ntpdate; they are used for time synchronization once the cluster is up.
yum -y install ntp ntpdate
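A quick way to check that ntpdate works is a one-shot sync against a reachable NTP server (ntp.aliyun.com is just an example; any NTP server will do):
ntpdate ntp.aliyun.com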
8. Configure environment variables
vi /etc/profile
export JAVA_HOME=/usr/local/jdk
export HADOOP_HOME=/usr/local/hadoop
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
source /etc/profile (you must reload the profile, otherwise the environment variables do not take effect)
Verify the installation:
java -version
hadoop version
9. Edit the configuration files (under $HADOOP_HOME/etc/hadoop)
vi core-site.xml
<configuration>
    <property>
        <!-- HDFS address: scheme, host, port; the hosts file maps the IP to the hostname -->
        <name>fs.defaultFS</name>
        <value>hdfs://hdp01:8020</value>
    </property>
    <property>
        <!-- Base path for HDFS, which other properties depend on; metadata and other data are kept here -->
        <name>hadoop.tmp.dir</name>
        <value>/usr/local/tmp</value>
    </property>
</configuration>
vi hdfs-site.xml
<configuration>
    <property>
        <!-- Where the NameNode stores metadata (the fsimage) -->
        <name>dfs.namenode.name.dir</name>
        <value>file://${hadoop.tmp.dir}/dfs/name</value>
    </property>
    <property>
        <!-- Where the DataNode stores blocks -->
        <name>dfs.datanode.data.dir</name>
        <value>file://${hadoop.tmp.dir}/dfs/data</value>
    </property>
    <property>
        <!-- Number of replicas -->
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <property>
        <!-- Block size (128 MB) -->
        <name>dfs.blocksize</name>
        <value>134217728</value>
    </property>
    <property>
        <!-- SecondaryNameNode address -->
        <name>dfs.namenode.secondary.http-address</name>
        <value>hdp02:50090</value>
    </property>
    <property>
        <name>fs.checkpoint.dir</name>
        <value>file:///${hadoop.tmp.dir}/checkpoint/dfs/cname</value>
    </property>
    <property>
        <name>fs.checkpoint.edits.dir</name>
        <value>file:///${hadoop.tmp.dir}/checkpoint/dfs/cname</value>
    </property>
    <property>
        <!-- NameNode web UI address -->
        <name>dfs.http.address</name>
        <value>hdp01:50070</value>
    </property>
</configuration>
mapred-site.xml does not exist by default, so copy it from the template first:
cp mapred-site.xml.template mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>hdp01:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>hdp01:19888</value>
    </property>
</configuration>
vi yarn-site.xml
<configuration>
    <!-- Site specific YARN configuration properties -->
    <property>
        <!-- Specify the shuffle service for YARN -->
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <!-- Specify the resourcemanager hostname -->
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hdp01</value>
    </property>
    <!-- The following are optional -->
    <!-- Specify the class implementing the shuffle service -->
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <!-- Configure the internal address of the resourcemanager -->
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>hdp01:8032</value>
    </property>
    <!-- Configure the internal address of the resourcemanager scheduler -->
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>hdp01:8030</value>
    </property>
    <!-- Configure the internal resource-tracker address of the resourcemanager -->
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>hdp01:8031</value>
    </property>
    <!-- Configure the internal admin address of the resourcemanager -->
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>hdp01:8033</value>
    </property>
    <!-- Configure the resourcemanager web UI address -->
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>hdp01:8088</value>
    </property>
</configuration>
vi hadoop-env.sh (set JAVA_HOME explicitly):
export JAVA_HOME=/usr/local/jdk
vi slaves (every host listed here runs a DataNode):
hdp01
hdp02
hdp03
Two. Clone
Remember to take a snapshot first.
Shut down hdp01, then clone it to create hdp02 and hdp03.
On each clone, change the IP and the hostname, as sketched below.
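A sketch for hdp02 (hdp03 is the same with its own address):
hostnamectl set-hostname hdp02
vi /etc/sysconfig/network-scripts/ifcfg-ens33 ---- change IPADDR to hdp02's address
systemctl restart network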
Three. Set up the cluster
1. Time synchronization
Configure time synchronization so that all three machines agree on the time.
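A minimal approach is to have every node sync against the same source on a schedule; a sketch using a public NTP server via cron (many setups instead run ntpd on hdp01 and point hdp02/hdp03 at it):
crontab -e
*/10 * * * * /usr/sbin/ntpdate ntp.aliyun.com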
2. format namenode
hdfs namenode -format
3. Start the cluster
Startup scripts:
start-dfs.sh : starts the HDFS daemons
start-yarn.sh : starts the YARN daemons
start-all.sh : starts both HDFS and YARN
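After starting, run jps on each node to confirm the daemons. Given the configuration above (NameNode and ResourceManager on hdp01, SecondaryNameNode on hdp02, all three hosts listed in slaves), the expected layout would be:
hdp01: NameNode, DataNode, ResourceManager, NodeManager
hdp02: SecondaryNameNode, DataNode, NodeManager
hdp03: DataNode, NodeManager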
4. Test
(1) Create an input folder on HDFS
hdfs dfs -mkdir /input
(2) Upload any file
hdfs dfs -put a.txt /input
(3) Test with the packaged MapReduce examples, for example word count.
/usr/local/hadoop/share/hadoop/mapreduce (the example jars all live here)
Mind the paths when you run it, and do not create /output yourself; the job must generate it, otherwise it reports an error.
hadoop jar hadoop-mapreduce-examples-2.7.6.jar wordcount /input /output
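When the job finishes, the result can be read straight from HDFS; part-r-00000 is the standard output file name produced by the wordcount example:
hdfs dfs -cat /output/part-r-00000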